Removing internal page rotation without increasing the size of a PDF

Would you have a simple example of how to remove internal rotation from a PDF produced by a scanner?

C# or VB samples welcome.

A scanned PDF is basically just a bunch of images bundled into a PDF “wrapper”, but we find that the internal structure of the PDF is sometimes not what we would expect from what the user sees when they open it in Adobe Acrobat Reader, Foxit PDF Reader, or similar software. The user may, for example, see correctly formatted A4 pages in portrait orientation, while internally the PDF has landscape pages with the images on them rotated by 270 degrees. Obviously the page rotation can be set to something other than “None”:

oPage.Rotate <> Aspose.Pdf.Rotation.None

And the PDF reader can handle that so that it still displays those pages to the user the way they are intended to be viewed.
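To illustrate the mismatch described above, here is a minimal, library-free Python sketch (not Aspose code) of how a viewer derives the displayed page size from the stored page box and the page's rotation entry. The function name `displayed_size` is hypothetical, and the 842 x 595 point box is only an assumed A4 landscape example.

```python
def displayed_size(media_box, rotate):
    """Return (width, height) as a viewer would show the page.

    media_box: (llx, lly, urx, ury) in points, as stored in the PDF.
    rotate:    the page's rotation in degrees clockwise (multiple of 90).
    """
    llx, lly, urx, ury = media_box
    width, height = abs(urx - llx), abs(ury - lly)
    r = rotate % 360
    if r % 90 != 0:
        raise ValueError("page rotation must be a multiple of 90 degrees")
    # A quarter or three-quarter turn swaps width and height on screen.
    return (height, width) if r in (90, 270) else (width, height)


# Assumed example: a landscape A4 box stored with a 270 degree rotation.
stored_box = (0, 0, 842, 595)
print(displayed_size(stored_box, 270))  # (595, 842): looks like portrait A4
print(displayed_size(stored_box, 0))    # (842, 595): what is stored internally
```

So a page that is landscape internally can look like an ordinary portrait page to the user, which is the situation described in the question.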

But we sometimes need to convert a PDF to a “normal” state, so that

oPage.Rotate = Aspose.Pdf.Rotation.None

and the PDF should still be shown as identical to the original, but every approach we have tried so far results in a new PDF that is larger than the original, even after optimisation - at least a level of optimisation that doesn’t result in visible deterioration.

So I feel we’re missing a trick here. Could you provide us with a sample of code to reorient a scanned PDF to remove this internal rotation without causing a significant increase in the file size?

@kidkeogh

The rotation in a PDF document is sometimes specified at page level and sometimes in a transformation matrix. The API can deal with both kinds of rotation. However, if the image itself was already rotated before it was added to the PDF in the normal way, we are afraid the API offers no way to change its rotation. Furthermore, please share the sample file and the code snippet with which you are seeing the size increase after optimization.
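The page-level and transformation-matrix forms of rotation mentioned above are interchangeable in principle, which is why rotation can be removed without re-encoding any image. The following is a rough, library-free Python sketch of that general PDF technique, not confirmed Aspose behaviour: it computes the matrix that would be prepended to the page content (between `q ... cm` and `Q`) and the new page box, so that the page rotation can then be set to 0. The names `unrotate_matrix`, `apply_matrix` and `wrap_operators` are hypothetical, and the sketch assumes the rotation is a multiple of 90 degrees and ignores annotations and other page boxes, which would need the same mapping.

```python
def unrotate_matrix(media_box, rotate):
    """Return (matrix, new_media_box) that bakes a page rotation into content.

    matrix is (a, b, c, d, e, f) in PDF order: x' = a*x + c*y + e,
    y' = b*x + d*y + f. rotate is in degrees clockwise.
    """
    llx, lly, urx, ury = media_box
    w, h = urx - llx, ury - lly
    r = rotate % 360
    if r == 0:
        return (1, 0, 0, 1, -llx, -lly), (0, 0, w, h)
    if r == 90:
        return (0, -1, 1, 0, -lly, urx), (0, 0, h, w)
    if r == 180:
        return (-1, 0, 0, -1, urx, ury), (0, 0, w, h)
    if r == 270:
        return (0, 1, -1, 0, ury, -llx), (0, 0, h, w)
    raise ValueError("page rotation must be a multiple of 90 degrees")


def apply_matrix(matrix, x, y):
    """Map a point from the old user space into the new one."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


def wrap_operators(matrix):
    """Content-stream text to put before and after the existing content."""
    return "q " + " ".join(str(v) for v in matrix) + " cm", "Q"


# Assumed example: landscape A4 box stored with a 270 degree rotation.
matrix, new_box = unrotate_matrix((0, 0, 842, 595), 270)
print(new_box)                        # (0, 0, 595, 842)
print(wrap_operators(matrix)[0])      # q 0 1 -1 0 595 0 cm
print(apply_matrix(matrix, 0, 595))   # (0, 0): old top-left is shown bottom-left
```

Because only a few operators and the page box change, the image streams stay byte-for-byte as the scanner wrote them, so the file size should barely move. Extracting, rotating and re-saving the images, by contrast, re-compresses them.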

Thanks Ali, can I send you a sample file privately?

@kidkeogh

Sure, you can send your file in a private message. We are sending you one now, and you can share your file in your reply.

Thank you. Done

@kidkeogh

Thanks for sharing the file. Is this the file in which you have reoriented the images and whose size increased? Please also share the sample code snippet with which we can reproduce the size increase and the effect of optimization on the file that you mentioned.

Sorry Ali, no: the file I sent you is the original file, with the internally rotated pages and images. We are trying to convert it into a new PDF that presents the same way to the user but in which neither the pages nor the images on them are rotated.

I won’t have a “code snippet” I can share with you as we partially revert to standard .NET functionality to rotate the images after extracting them out of the PDF and another product to recombine these into a new PDF.

This may be part of our problem, and if you would have a way to achieve the same result fast with pure “Aspose” code I would be very grateful. I tried to achieve that but I haven’t been able to work out how.

Hello Ali,

Sometimes it helps for one to take a step back and reconsider what one is doing…

If you can provide me with a straightforward piece of code to do what I originally asked here, that would be fine, but the reason I was asking this in the first place is because we would have to combine such rotated PDFs into a legal “brief” along with other PDFs that were “normal” PDFs without such rotations.

In our original product we would provide the user with options to insert all sorts of bespoke page numbering systems that Aspose didn’t really cater for. We would therefore put the page numbers in ourselves by writing text at the bottom of pages … and so on.

To make that work for all documents we had to twist ourselves into some very bad contortions - this being one of them.

What we lost sight of is that in some cases people want perfectly bog-standard page numbering. In fact this is now a standard to follow. Page numbers, counting from the first page in the completed document, with all page numbers in the middle of the page at the bottom.

Our “bespoke” page numbering may have been too complex but for standard page numbering systems you guys have obviously already provided solutions: Add Page Number to PDF with C#|Aspose.PDF for .NET

And when we use that method for standard page numbering, it works fine. It ALWAYS works fine. Internal page rotation doesn’t matter at all: you guys are already taking that into account.

And the PDF size stays the same (the addition of page numbers barely registers, and we don’t have to manipulate any of the images).

So like I said… it would still be nice to get a solution for this, so we may take advantage of it if somebody still insists on using some strange bespoke page numbering. But for the majority of our clients we can use your standard pagination methods instead of trying to rotate PDFs.
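For the bespoke numbering case, where text is written onto the page directly, the difficulty is that "bottom centre" as the user sees it is not the bottom centre of the stored page box when the page is rotated. A minimal, library-free Python sketch of that coordinate mapping follows. It is not Aspose code; the names `display_to_user` and `bottom_centre`, the 20 point margin and the A4 box are all assumptions for illustration.

```python
def display_to_user(media_box, rotate, u, v):
    """Map a point (u, v) on the page as displayed (origin at the displayed
    bottom-left corner) to coordinates in the stored, unrotated page space.

    rotate is the page rotation in degrees clockwise (multiple of 90).
    """
    llx, lly, urx, ury = media_box
    r = rotate % 360
    if r == 0:
        return (llx + u, lly + v)
    if r == 90:
        return (urx - v, lly + u)
    if r == 180:
        return (urx - u, ury - v)
    if r == 270:
        return (llx + v, ury - u)
    raise ValueError("page rotation must be a multiple of 90 degrees")


def bottom_centre(media_box, rotate, margin=20):
    """Where to anchor a page number so it appears bottom-centre on screen."""
    llx, lly, urx, ury = media_box
    w, h = urx - llx, ury - lly
    # The displayed width is the stored height after a quarter turn.
    shown_width = h if rotate % 180 == 90 else w
    return display_to_user(media_box, rotate, shown_width / 2, margin)


# Assumed examples: an ordinary portrait page and a rotated landscape page.
print(bottom_centre((0, 0, 595, 842), 0))    # (297.5, 20)
print(bottom_centre((0, 0, 842, 595), 270))  # (20, 297.5)
```

On the rotated page the anchor lands near the left edge of the stored landscape box. The text itself would also have to be rotated to read upright, which is presumably the kind of adjustment the built-in page numbering already makes.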

@kidkeogh

Thanks for the further explanation. We have logged a ticket as PDFNET-51822 in our issue tracking system to investigate and address your concerns. We will look into the details and let you know as soon as we find a way to handle this situation using Aspose.PDF. Please be patient and allow us some time.

We apologize for the inconvenience.

Thank you very much Ali, and don’t worry about the inconvenience. My considerations in my previous reply will buy us a LOT of time and they remove the urgency from this request. At this point, this would just be a “nice to have”.

@kidkeogh

Thanks for your feedback. We will let you know in case of further updates.

Hello Ali, I sent you more information in response to your private message. While the page numbering now appears to work fine even if pages are rotated internally, the hyperlinks don’t appear to work correctly. I added pages from a rotated PDF to a “brief” and added hyperlinks to its first page in the “table of contents” at the top. The document with rotated pages starts at page 3. When I create the hyperlink I point it at page 3, but when I click on it in the final document it takes me to page 4. When I do the same thing for a document without internally rotated pages, the hyperlink takes me to the correct page. Bookmarks also always take me to the correct page, even for the rotated pages.

@kidkeogh

We have checked your messages and all the documents that you shared with them. Can you please also share the complete code snippet that you used to merge them, add the TOC, and generate the Brief.pdf document? We will test the scenario accordingly and proceed further in this regard.