Merging PDFs drastically increases file size

Hi,

I’m having an issue where merging two PDFs produces a file drastically larger than the two inputs combined. For example, merging two PDFs of 13 KB and 25 KB gives a merged PDF of 65 KB. I’m using the .NET platform and the following code:

    public byte[] MergePdfs(IEnumerable<byte[]> pdfCollection)
    {
        var baseDoc = new Document();
        foreach (var byteArray in pdfCollection)
        {
            var currentDoc = byteArray.ToDocument();
            baseDoc.Pages.Add(currentDoc.Pages);
        }

        return baseDoc.ToBytes();
    }

    public static Document ToDocument(this byte[] bytes)
    {
        var byteStream = new MemoryStream(bytes);

        return new Document(byteStream);
    }

    public static byte[] ToBytes(this Document document)
    {
        var outStream = new MemoryStream();
        document.Save(outStream);

        return outStream.GetBuffer();
    }

I haven’t found much documentation online about merging PDFs from byte arrays, so maybe I’m just missing something. Most examples use file streams, but our product is hosted as a web service that takes byte arrays as input. Any help would be greatly appreciated.

I was able to get something working; however, it is not ideal.

    public byte[] MergePdfs(IEnumerable<byte[]> pdfCollection)
    {
        // Workaround: write every input PDF to the desktop first.
        var count = 1;
        foreach (var pdf in pdfCollection)
        {
            var strPath = Environment.GetFolderPath(System.Environment.SpecialFolder.DesktopDirectory) + $"\\DownloadTest{count}.pdf";
            File.WriteAllBytes(strPath, pdf);
            count++;
        }

        var firstFile = Environment.GetFolderPath(System.Environment.SpecialFolder.DesktopDirectory) + $"\\DownloadTest1.pdf";

        var basePdf = new Document(firstFile);

        // Append the pages of each remaining file to the first document.
        for (var c = 2; c < count; c++)
        {
            var strPath = Environment.GetFolderPath(System.Environment.SpecialFolder.DesktopDirectory) + $"\\DownloadTest{c}.pdf";
            var newDoc = new Document(strPath);
            basePdf.Pages.Add(newDoc.Pages);
        }

        basePdf.Save(Environment.GetFolderPath(System.Environment.SpecialFolder.DesktopDirectory) + "\\MergedDownloadTest.pdf");

        // Placeholder return value; the merged PDF is only written to disk here.
        return new byte[1];
    }

I tried to get something working in memory but had no luck. I ended up writing each file to a directory and then reading it back into a Document object. Now, instead of my 65 KB file, I get the expected 38 KB file. However, this is still not ideal. I would expect the code sample in my original post to produce a file of roughly 38 KB as well.

@justins

Would you kindly share your sample PDF documents with us for our reference? We will test the scenario in our environment and address it accordingly. Also, please try optimizing the resultant PDF to see whether that resolves your issue.

I’ve DM’d you the PDFs. I tried doc.OptimizeResources() and it shaved off a little of the size, but nowhere near the amount that had been added.

@justins

We used the following code snippet to test the scenario with Aspose.PDF for .NET 20.4, and the output PDF file was fine in terms of its size. We have shared it for your reference in a private message.

    Document document1 = new Document(new MemoryStream(File.ReadAllBytes(dataDir + "Void Star - Contract.pdf")));
    Document document2 = new Document(new MemoryStream(File.ReadAllBytes(dataDir + "Void Star - Title Request.pdf")));
    document1.Pages.Add(document2.Pages);
    document1.Save(dataDir + "FinalDoc_DOM.pdf");

Would you kindly try the latest version of the API? If you still face any issue, please let us know by sharing a small console application that reproduces it. We will test the scenario in our environment and address it accordingly.

I believe I have discovered my issue. In my initial code snippet, I have a method ToBytes(). Instead of return outStream.GetBuffer(); I believe I need to use return outStream.ToArray();
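For anyone who finds this thread later, here is a minimal sketch of why the two calls differ. It uses only System.IO and no Aspose types. MemoryStream.GetBuffer() returns the stream's entire internal buffer, including any unused capacity, while MemoryStream.ToArray() copies only the bytes that were actually written. The class name BufferDemo, the explicit 65,536-byte capacity, and the 38,000-byte payload are illustrative assumptions chosen to resemble the sizes reported above. In the original code the capacity grows automatically as the document is saved, so the exact amount of padding may differ.

```csharp
using System;
using System.IO;

// Hypothetical demo class; not part of Aspose.PDF or the original post.
public static class BufferDemo
{
    // Writes `written` zero bytes into a stream with the given capacity
    // and returns the whole internal buffer, unused tail included.
    public static byte[] ViaGetBuffer(int written, int capacity)
    {
        using (var stream = new MemoryStream(capacity))
        {
            stream.Write(new byte[written], 0, written);
            return stream.GetBuffer();
        }
    }

    // Same write, but returns a copy of only the bytes actually written.
    public static byte[] ViaToArray(int written, int capacity)
    {
        using (var stream = new MemoryStream(capacity))
        {
            stream.Write(new byte[written], 0, written);
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // Illustrative sizes: 38,000 bytes written into a 65,536-byte buffer.
        Console.WriteLine(ViaGetBuffer(38000, 65536).Length); // 65536
        Console.WriteLine(ViaToArray(38000, 65536).Length);   // 38000
    }
}
```

The trailing bytes returned by GetBuffer() are unused padding, which would explain why the merged file was larger than the sum of its inputs and why OptimizeResources() recovered so little of the difference.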

Thank you for your help. I will follow up if I experience any more issues.

@justins

It is good to know that you have managed to resolve your issue. Please feel free to ask in case you need further information.