BUG: PdfFileEditor fails to merge three specific files

Dear Sir/Madam,

I am using PdfFileEditor to merge 3 PDFs together with the Append(string inputFile, string[] portFiles, int startPage, int EndPage, string outputFile) method.

Given one set of files, this works satisfactorily. However, on a different set of files it fails by not returning from the call to Append().

Please find my code snippet here:
    private static void AppendingMerge(string targetFilename)
    {
        Log.Info("Appending merge strategy.");

        var startMerging = DateTime.Now;
        Log.Info($"{startMerging:HH:mm:ss.fffff} starting to merge using PdfFileEditor.");

        var pdfEditor = new PdfFileEditor();
        var pageFiles = new List<string>();
        for (var pageNum = 1; pageNum < 3; pageNum++)
        {
            var appendFilename = $@"C:\Temp\VW-Example\2016_06_01_Factbook_2016_Page{pageNum:000}.pdf";
            pageFiles.Add(appendFilename);                
        }

        const string finalPage = @"C:\Temp\VW-Example\2016_06_01_Factbook_2016_Page087.pdf";
        pdfEditor.Append(finalPage, pageFiles.ToArray(), 1, int.MaxValue, targetFilename);

        var stopMerging = DateTime.Now;
        var mergingDuration = stopMerging - startMerging;
        Log.Info($"{stopMerging:HH:mm:ss.fffff} merged pages using PdfFileEditor in {mergingDuration.TotalMilliseconds} ms.");

        Log.Info("Appending merge strategy end.");
    }

The files in question, both the set for which it works and the set for which it does not, were generated by splitting a large presentation. I can provide the files, but I have not found a way to attach them to this bug report.

Kind Regards
Stephan

Attachment: PdfFileEditorFailsToMergeSamples.zip (249.8 KB)

@skuehn,

Please create a ZIP of your three problematic PDF documents. There is an upload button in the header of the post editor. If the attachment exceeds the 3 MB size limit, upload the file to a free file-sharing service, e.g. Google Drive or Dropbox, and share the download link with us.

Dear Imran, thanks for the quick response.

I managed to identify the upload button in the post editor. I have uploaded the files.

Kind Regards
Stephan

@skuehn,

This happens because you are passing the maximum integer value (int.MaxValue) as the end page parameter. We recommend passing realistic values. Please change the end page value and let us know how it goes in your environment.
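
One way to arrive at a "realistic" end page is to use the largest page count among the port files. The following is only a sketch: PageRangeHelper, RealisticEndPage and the countPages callback are hypothetical names introduced for illustration, and the Aspose.PDF calls in the trailing comment (Document, Pages.Count) are assumptions about how a page count could be obtained, not something confirmed in this thread.

```csharp
using System;
using System.Linq;

public static class PageRangeHelper
{
    // Returns the largest page count among the port files, intended as a
    // replacement for int.MaxValue in the endPage argument.
    // countPages is a hypothetical caller-supplied callback that maps a
    // file path to the number of pages in that file.
    public static int RealisticEndPage(string[] portFiles, Func<string, int> countPages)
    {
        if (portFiles == null || portFiles.Length == 0)
        {
            throw new ArgumentException("At least one port file is required.", nameof(portFiles));
        }

        return portFiles.Max(countPages);
    }
}

// Possible usage with Aspose.PDF (assumption: Document exposes Pages.Count):
//
// int endPage = PageRangeHelper.RealisticEndPage(
//     pageFiles.ToArray(),
//     path => { using (var doc = new Aspose.Pdf.Document(path)) { return doc.Pages.Count; } });
// pdfEditor.Append(finalPage, pageFiles.ToArray(), 1, endPage, targetFilename);
```

Passing the page counter in as a callback keeps the helper independent of any particular PDF library, so it can be exercised with fake page counts.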

Dear Imran,

thank you for the response. If I pass in a smaller value than int.MaxValue, the call works (for the moment).

I am not sure what I need to specify here for startPage and endPage parameters.

Do they specify a range of documents within the filename array to merge, i.e. will Append("Page1.pdf", [ "Page2.pdf", "Page3.pdf", "Page4.pdf" ], 1, 1, "Merged.pdf") merge documents Page1.pdf and Page3.pdf into Merged.pdf?

Or will the call Append("Page1.pdf", [ "Page2.pdf", "Page3.pdf", "Page4.pdf" ], 1, 2, "Merged.pdf") merge the first and second page (or the second and third page) of all three documents into the target document?

I have not been able to find this out experimentally. For me, a range of 0 to 1 produces no output document, but the function returns (at least).

A range of 1 to 2 produces an output document with all files merged; I cannot tell which pages were merged, as these are single-page documents.

A range of 1 to int.MaxValue crashes your function with the files provided, but not with other files.

I think you need to explain this a little bit more in your documentation.

Additionally, I would still consider it a bug if I pass in int.MaxValue into your function and the function silently dies. This somehow makes me wonder if I have observed a memory buffer overflow in your function.

Please clarify that this is not the case, as it would raise security concerns when using your components.

Kind Regards
Stephan

@skuehn,

You can retrieve the page collection or a single page instance from a PDF document and then concatenate it into a second PDF document. To concatenate multiple PDF documents, repeat this process in your code. Please refer to this help topic: Concatenate PDF Files

There is no way to track page numbers in an array of PDF file names. To specify a range of pages, you can load a PDF document into a Document instance and add or remove pages before inserting them into another PDF document by calling the PageCollection.Add method; you do not need to save this intermediate Document instance to disk. Please refer to these help topics: Delete a Particular Page from the PDF File and Insert an Empty Page in a PDF File

The issue with int.MaxValue has been logged under the ticket ID PDFNET-44848 in our bug tracking system. We have linked your post to this ticket and will keep you informed of any updates.

Dear Imran,

thanks again for your answer.

However, I am now totally confused about what the PdfFileEditor.Append function does, especially what the parameters startPage and endPage do.

I am also a bit confused about what the parameter portFiles is supposed to mean.

You can also record this as a bug against the documentation of the Append() function.

Kind Regards
Stephan

@skuehn,

We have recorded this information under the same ticket ID PDFNET-44848 in our issue tracking system. We will notify you once it is fixed.

Dear Imran,

thank you for your message and the support.

Meanwhile, could you please provide the documentation for the aforementioned parameters?
I would also highly appreciate it if you could give me a timeline for the resolution of this ticket.

If this is not possible at the moment, please advise how I can escalate this issue.

Thanks in advance

Kind Regards
Stephan

@skuehn,

Please refer to the API reference documentation of the Append method: PdfFileEditor.Append Method

It is difficult to share an estimate before the analysis phase is complete. To escalate the priority, we recommend that clients post their critical issues (ticket IDs) in the paid support forum. Please refer to this help link: Aspose support options

Dear Imran,

thank you for your kind reply.

My point is that I don't understand the documentation. Perhaps you can explain it to me in simple English.

portFiles
Type: System.String[]
Documents to copy pages from.

startPage
Type: System.Int32
Page starts in portFiles documents.

endPage
Type: System.Int32
Page ends in portFiles documents.

What is this supposed to mean?

If you cannot explain this yourself, have somebody from your engineering team contact me.

As for the paid support, I work for KPMG. We have a Site OEM License and a paid Enterprise Support
valid until 8/18/2018. I hope that is sufficient.

Kindest Regards
Stephan Kühn

@skuehn,

The port files are a list of PDF documents from which we retrieve a page range to merge into the input PDF document. The range is given by the start and end page indexes, e.g. 2 to 5. This means the API retrieves pages 2, 3, 4 and 5 from each PDF document listed in the port files array. We hope this helps. Please let us know if you need any further assistance.

Dear Imran,

thanks for your reply.

If I understand it correctly, startPage is the first page taken from each port file and endPage is the last page taken from each port file.

If I were to merge just single pages, would setting both startPage and endPage to 1 be sufficient?

Would inputFile be kept as is, with the specified page range from each port file appended to that document?

If a page in the specified range does not exist, would nothing be appended for it?

If no page at all falls into the specified page range, would your Append method produce a corrupted PDF file?

Kind Regards
Stephan

@skuehn,

You are right. To merge single-page PDF documents, you only need to set the start and end page indexes to 1, and all the selected pages will be merged into the input PDF document. If no pages are selected, nothing is appended to the input PDF document.

If there is no page in the input PDF document and none in the specified range of pages, then there would be no page in the output PDF document and its size would be 0 KB.
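
The semantics described in this thread (a 1-based, inclusive page range applied to every port file, with nothing appended for pages that do not exist) can be modelled in a few lines. This is a sketch of the described behaviour only, not the library's implementation: AppendRangeModel and SelectPages are hypothetical names, and rejecting startPage values below 1 is a choice made for this sketch, since the thread only reports that a range of 0 to 1 produced no output document.

```csharp
using System;
using System.Collections.Generic;

public static class AppendRangeModel
{
    // Lists the (port file index, page number) pairs that a range of
    // startPage..endPage would select, given the page count of each port file.
    // Page numbers are 1-based and the range is inclusive on both ends.
    public static List<(int FileIndex, int Page)> SelectPages(
        int[] portFilePageCounts, int startPage, int endPage)
    {
        if (startPage < 1)
        {
            // Assumption of this sketch: ranges starting below 1 are rejected.
            throw new ArgumentOutOfRangeException(nameof(startPage), "Pages are numbered from 1.");
        }

        var selected = new List<(int FileIndex, int Page)>();
        for (int file = 0; file < portFilePageCounts.Length; file++)
        {
            // Pages beyond the end of a port file are simply not selected.
            int last = Math.Min(endPage, portFilePageCounts[file]);
            for (int page = startPage; page <= last; page++)
            {
                selected.Add((file, page));
            }
        }

        return selected;
    }
}
```

For port files with 1, 6 and 3 pages, SelectPages(new[] { 1, 6, 3 }, 2, 5) selects pages 2 to 5 of the second file and pages 2 to 3 of the third; the single-page first file contributes nothing. With three single-page files and a range of 1 to 1, each file contributes its only page.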