Merged pages are cropped badly

Hi,

I am merging a set of PDFs, each of which seems to be a front page followed by scanned sheets.
The front page was probably created in Word, with the scanned sheets added afterwards. All pages are A4 size.

After merging, all the scanned pages are badly cropped, missing the top 20% or so!
It seems that the page size has been altered to nearly square size, cropping the top part.

Please see the attached picture, showing the original PDF on the left and the resulting merged page on the right. The pages should be left as they are; no scaling should be required!
Using version 10.6.0.
UPDATE:
I have investigated further, and this problem ONLY arises if I insert and generate a front page, AND ONLY if I add headings, not if I just add a logo and text. If I do not insert a front page, or if I insert the same file into another PDF document, the pages are OK.
And if I insert the front page, it has the same size as the front page (which is correct), and I do not apply any margins or anything, just insert a logo and some text.
By the way, I think I use your sample code to generate the front page, using headings:
Aspose.Pdf.Heading heading2 = new Aspose.Pdf.Heading(1);
Aspose.Pdf.Text.TextSegment segment2 = new Aspose.Pdf.Text.TextSegment();
heading2.TocPage = tocPage;
heading2.Segments.Add(segment2);

// Specify the destination page for heading object
heading2.DestinationPage = mainpdf.Pages[pagenum];

// Destination coordinate
segment2.Text = "MyHeader";
tocPage.Paragraphs.Add(heading2);


Hi Ole Martin,

Thanks for contacting support.

Can you please share the input PDF files which you are trying to concatenate, so that we can test the scenario in our environment? We are sorry for this inconvenience.

Hi,

This is a customer document, so I can’t post it here. Can I send you the code + file by mail? I guess I can use the email function in your contact menu?

Hi,

I have simplified my code and investigated the source file. It seems there must be some kind of error in the customer PDF. It consists of a front page that was converted directly from Word or similar (the content is selectable), then some pages that are scanned full-page images, followed by other pages that were also converted from text.
I have checked the properties of the pages, and the PageInfo is the same for all pages:
Height=842, Width=595, Landscape=false.
But the other rectangle properties (e.g. ArtBox, CropBox etc.) all report Height=842, Width=595 for the converted pages, and the opposite (Height=595, Width=842) for the scanned pages!
If I view the original PDF, or open it in Aspose and add an empty front page, the scanned pages appear normal and in portrait mode.
However, if I add something to the inserted front page in Aspose with Paragraphs.Add, then the scanned pages are cropped! The top is cropped, so the pages are left almost square.

Any idea why editing the new front page should affect these pages that are not touched in any way?
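
One possible explanation for the swapped box dimensions, which nobody in this thread confirms, is that the scanned pages are stored with a landscape box plus a 90-degree page rotation (the /Rotate entry of the PDF page dictionary), so viewers show them as portrait. The sketch below only illustrates that arithmetic. PageGeometry, DisplayedSize and ClipToFrame are hypothetical helpers, not Aspose.Pdf APIs, and the clipping step is an assumption about how a 595 x 842 frame applied without regard to rotation could leave an almost square page.

```csharp
using System;

// Illustrative only: PageGeometry and its methods are hypothetical helpers,
// not part of Aspose.Pdf.
static class PageGeometry
{
    // Size at which a page box is displayed after applying a page rotation.
    // PDF page rotation must be a multiple of 90 degrees.
    public static (double Width, double Height) DisplayedSize(
        double boxWidth, double boxHeight, int rotateDegrees)
    {
        if (rotateDegrees % 90 != 0)
            throw new ArgumentException("Rotation must be a multiple of 90.", nameof(rotateDegrees));

        int r = ((rotateDegrees % 360) + 360) % 360; // normalise to 0, 90, 180 or 270
        bool swap = r == 90 || r == 270;
        return swap ? (boxHeight, boxWidth) : (boxWidth, boxHeight);
    }

    // Area left when a box is clipped to a frame that shares its origin.
    // ASSUMPTION: this models a layout pass that applies the front page's
    // frame to every page without taking rotation into account.
    public static (double Width, double Height) ClipToFrame(
        double boxWidth, double boxHeight, double frameWidth, double frameHeight)
    {
        return (Math.Min(boxWidth, frameWidth), Math.Min(boxHeight, frameHeight));
    }
}

static class Program
{
    static void Main()
    {
        // Scanned page as reported above: box 842 wide, 595 high, assumed rotated by 90.
        var shown = PageGeometry.DisplayedSize(842, 595, 90);
        Console.WriteLine($"Displayed: {shown.Width} x {shown.Height}");

        // The same unrotated box clipped to the A4 portrait frame from PageInfo.
        var clipped = PageGeometry.ClipToFrame(842, 595, 595, 842);
        Console.WriteLine($"Clipped: {clipped.Width} x {clipped.Height}");
    }
}
```

Under these assumptions the rotated page displays as 595 x 842, while the clipped box comes out as 595 x 595, which loses roughly 29% of the long side and would match the "almost square" result. Checking each page's Rotate value next to its CropBox would show whether this applies to the actual file.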
Hi Ole Martin,

For your convenience, I have marked this forum thread as private so that the contents shared in this thread are only accessible to Aspose staff. If you are still not comfortable sharing the documents here, you may consider sending us the document directly by following the instructions specified in "How to send a license?"

Hi Ole Martin,

As you have observed, executing the same code over the same documents in a slightly different scenario generates different output. We therefore request you to share the input PDF documents and code snippet so that we can test the scenario in our environment.

Thanks, I sent you the code yesterday, but I will upload it here as well.

Hi,

Everything was included in the attachment in my previous post.
You will find the complete code to provoke the issue, the source PDF, and even the result file.
Please note what I have found out about the page size properties, and see the comment in the code that explains which command is causing the issue.
If you skip the Paragraphs.Add call, the front page does not cause any issues.
It could be because there were errors when the PDF was created. I don’t know why the page size properties for the scanned pages are landscape while the pages are shown in portrait orientation, or why this is rendered differently depending on what you do on the front page!

Hi Ole Martin,


Thanks for sharing the details.

As requested earlier, please share some sample project and resource files, so that we can test the scenario in our environment. We are sorry for this inconvenience.

I have uploaded it earlier in a post, and sent you everything on email, so you should have it now.

It’s also attached to this post.



Hi Ole Martin,


Thanks for sharing the resource files.

I have tested the scenario and I am able to notice the same problem. For the sake of correction, I have logged this problem as PDFNEWNET-39282 in our issue tracking system. We will further look into the details of this problem and will keep you updated on the status of correction. Please be patient and spare us a little time. We are sorry for this inconvenience.

Hi,

I have now found a similar problem at another customer. I merge large-format drawings. The inserted page seems to be A4. If I add paragraphs to the TOC on it, the following pages inherit the same size as page 1 and are badly cropped. If I do not add paragraphs, all pages are correct and keep their original size.

Another finding related to the last issue:

If I set the PageInfo size to the largest page size, the pages are no longer cropped, but then the first page gets that size as well, which is no good…

combinedDoc.PageInfo.Width = combinedDoc.Pages[2].CropBox.Width;
combinedDoc.PageInfo.Height = combinedDoc.Pages[2].CropBox.Height;
combinedDoc.Save(resultfile);

So it seems to be a big problem with mixed page sizes.
But the original problem was related to merging PDF files of the same page size, where some pages were scanned images. Maybe they have a non-standard DPI?
I tried to adjust PageInfo there, but it had no effect; the problem is still the same.
Hi Ole,

Thanks for your inquiry. I have tested the scenario with the shared documents and noticed that adding a TOC crops pages larger than A4 in the resultant PDF, so I have logged a ticket, PDFNEWNET-39365, in our issue tracking system for further investigation and rectification. We will notify you as soon as it is resolved.

We are sorry for the inconvenience caused.

Best Regards,
Hi Ole,

Thanks for sharing your findings. I have also tested them and shared the information with the product team. They will consider it during the issue investigation. We will keep you updated on the progress of the resolution.

Best Regards,

The issues you have found earlier (filed as PDFNEWNET-39282 and PDFNEWNET-39365) have been fixed in Aspose.Pdf for .NET 11.1.0.

