Append all content of a PDF at the end of another PDF using Aspose.PDF for .NET

Hi there,

We are using version 20.3 of Aspose.PDF for .NET, and we happily use the Concatenate function to merge a list of PDFs into a bigger one.

But we also need the following use case: append all content from doc2.pdf after the last paragraph on the last page of doc1.pdf, to obtain something similar to doc3.pdf. This is different from appending pages one after another, as happens with Concatenate.

Can we accomplish this? We are open to any suggestion, including one that copies a paragraph at a time, if that is possible.

Here are the files: pdfs.zip (159.6 KB).

Best regards,
Alin

@gwert

We need to investigate whether your particular requirement can be achieved with the API. For this purpose, an investigation ticket has been logged as PDFNET-48700 in our issue tracking system. We will analyze it in detail and keep you informed about the status of its resolution. Please be patient and give us some time.

We are sorry for the inconvenience.

@gwert

The PDF file format does not model content as paragraphs; every content element must be positioned explicitly. Splitting the source file's content into paragraphs and copying those paragraphs into a destination file is therefore not an easy task. In some cases, such as the documents you provided, where the combined content of both files is guaranteed to fit on a single page, you can use the PdfPageStamp functionality:

using Aspose.Pdf;
using Aspose.Pdf.Text;

var inputPdf1 = "doc1.pdf";
var inputPdf2 = "doc2.pdf";
var outputPdf = "doc3.pdf";

using (Document doc1 = new Document(inputPdf1),
       doc2 = new Document(inputPdf2))
{
    // The page we're placing content to
    var destPage = doc1.Pages[1];
    // The page we're copying content from
    var srcPage = doc2.Pages[1];

    // First, we need to get the bounding rectangle of the existing content on the destPage, to place the srcPage's content below
    // TextFragmentAbsorber gets positions of textual content on the page
    var textAbsorber = new TextFragmentAbsorber();
    textAbsorber.Visit(destPage);

    // ImagePlacementAbsorber gets positions of images on the page
    var imageAbsorber = new ImagePlacementAbsorber();
    imageAbsorber.Visit(destPage);

    // Merge rectangles of all absorbed texts and images to get a single rectangle bounding existing content
    var contentRectangle = Rectangle.Empty;
    foreach (var textFragment in textAbsorber.TextFragments)
    {
        contentRectangle = contentRectangle.IsEmpty
            ? new Rectangle(textFragment.Rectangle.LLX, textFragment.Rectangle.LLY,
                textFragment.Rectangle.URX, textFragment.Rectangle.URY)
            : contentRectangle.Join(textFragment.Rectangle);
    }

    foreach (var imagePlacement in imageAbsorber.ImagePlacements)
    {
        contentRectangle = contentRectangle.IsEmpty
            ? new Rectangle(imagePlacement.Rectangle.LLX, imagePlacement.Rectangle.LLY,
                imagePlacement.Rectangle.URX, imagePlacement.Rectangle.URY)
            : contentRectangle.Join(imagePlacement.Rectangle);
    }

    // Second, create a page stamp out of the srcPage's content and push it below the existing destPage's content
    var pageStamp = new PdfPageStamp(srcPage)
    {
        // You may want to place additional margins here
        YIndent = -contentRectangle.Height
    };
    pageStamp.Put(destPage);

    doc1.Save(outputPdf);
}
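
Since this approach only works when the combined content fits on one page, it may be worth checking that before stamping. The following is a minimal sketch of such a check, not part of the Aspose.PDF API: Box and StampFit are hypothetical helpers that use plain numbers in PDF points (origin at the bottom-left corner, y growing upward), and the bottom margin and gap values are assumptions you would supply yourself.

```csharp
using System;

// Hypothetical stand-in for a content rectangle, so the sketch runs
// without the library. Coordinates are PDF points, origin bottom-left.
public readonly struct Box
{
    public double Llx { get; }
    public double Lly { get; }
    public double Urx { get; }
    public double Ury { get; }

    public Box(double llx, double lly, double urx, double ury)
    {
        Llx = llx;
        Lly = lly;
        Urx = urx;
        Ury = ury;
    }

    public double Height => Ury - Lly;

    // Smallest box that contains both this box and the other one.
    public Box Union(Box other) => new Box(
        Math.Min(Llx, other.Llx), Math.Min(Lly, other.Lly),
        Math.Max(Urx, other.Urx), Math.Max(Ury, other.Ury));
}

public static class StampFit
{
    // Returns true when the source content is short enough to sit between
    // the bottom edge of the existing content and the bottom margin.
    // bottomMargin and gap are assumed values chosen by the caller.
    public static bool FitsBelow(Box destContent, Box srcContent,
                                 double bottomMargin, double gap = 0)
    {
        double available = destContent.Lly - bottomMargin;
        return srcContent.Height + gap <= available;
    }
}
```

In the snippet above, you could build such boxes from the LLX, LLY, URX and URY values of the merged content rectangles of both pages and skip the stamp (or fall back to Concatenate) when FitsBelow returns false.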

Please let us know if this resolves your issue.