Images are vertically flipped when using Images.Replace

I’m using the following code to go through all images in a PDF and convert them to JPEG. inputStream contains the PDF, and outputStream returns a PDF with the converted images. The weird thing is that all the images come out flipped, so I have to flip them back during the conversion. Does anybody know why the images are flipped? By the way, if anybody knows of a better way to convert all images in a PDF, please let me know.


I use version 6.1.0.0 of Aspose.Pdf.

    Public Shared Function ConvertPDFToPDF(ByRef inputStream As Stream) As MemoryStream
        Dim pdfEditor As PdfFileEditor = New PdfFileEditor()
        Dim outputStream As New MemoryStream()
        Try
            Dim pdfExtractor As New PdfExtractor()
            pdfExtractor.BindPdf(inputStream)

            Dim imageDescriptionList() As ImageDescription = pdfExtractor.GetImageDescriptions()
            Dim imageStreams(imageDescriptionList.Count - 1) As MemoryStream
            Dim imageOutStreams(imageDescriptionList.Count - 1) As MemoryStream
            Dim i As Integer = 0
            pdfExtractor.ExtractImage()

            While pdfExtractor.HasNextImage()
                imageStreams(i) = New MemoryStream()
                imageOutStreams(i) = New MemoryStream()
                pdfExtractor.GetNextImage(imageStreams(i), ImageFormat.Jpeg)
                imageStreams(i).Seek(0, SeekOrigin.Begin)
                Using Image As System.Drawing.Bitmap = System.Drawing.Bitmap.FromStream(imageStreams(i))
                    Image.RotateFlip(System.Drawing.RotateFlipType.RotateNoneFlipY)
                    Image.Save(imageOutStreams(i), ImageFormat.Jpeg)
                End Using

                i = i + 1
            End While

            i = 0
            Dim document As New Aspose.Pdf.Document(inputStream)
            For Each imageDescription As ImageDescription In imageDescriptionList
                document.Pages(imageDescription.Page).Resources.Images.Replace(imageDescription.Index, imageOutStreams(i))
                i = i + 1
            Next
            Using outputStream2 As New MemoryStream()
                document.Save(outputStream2)
                pdfEditor.Extract(outputStream2, 0, 9999, outputStream)
            End Using
        Catch ex As Exception
            pdfEditor.Extract(inputStream, 0, 9999, outputStream)
        End Try
        Return outputStream
    End Function

I made a short test where I save the image before it is flipped and discovered that it’s the line

pdfExtractor.GetNextImage(imageStreams(i), ImageFormat.Jpeg)
that returns a flipped image.

I’ve found a new way of converting the images using this code:


    Dim document As New Aspose.Pdf.Document(inputStream)
    For Each Page As Aspose.Pdf.Page In document.Pages
        Dim i As Integer = 0
        For Each Image As Aspose.Pdf.XImage In Page.Resources.Images
            Using imageStream As New MemoryStream
                Image.Save(imageStream)
                Page.Resources.Images.Replace(i, imageStream)
            End Using
            i += 1
        Next
    Next
The main problem with this code is that the Image.Save method is incredibly slow. When I load a PDF containing a single 15 MB image, it takes more than 30 seconds to save the image to the MemoryStream. I saw in another thread ("Extracting the First Image in PDF - Extremely Slow Performance") that you are trying to fix this. Do you have any idea when it will be fixed?
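To show where the time goes, each Image.Save call can be timed on its own. The following is only a minimal sketch, assuming the project references Aspose.Pdf and that a file named input.pdf exists (a hypothetical path used for illustration). It reuses the Document, Page, and XImage calls from the loop above and adds a System.Diagnostics.Stopwatch around the save.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO

Module SaveTiming
    Sub Main()
        ' "input.pdf" is a hypothetical path; substitute the real document.
        Dim document As New Aspose.Pdf.Document("input.pdf")
        Dim pageNumber As Integer = 0

        For Each page As Aspose.Pdf.Page In document.Pages
            pageNumber += 1
            Dim imageNumber As Integer = 0

            For Each image As Aspose.Pdf.XImage In page.Resources.Images
                imageNumber += 1
                Using imageStream As New MemoryStream()
                    ' Time only the Save call, not the surrounding loop.
                    Dim watch As Stopwatch = Stopwatch.StartNew()
                    image.Save(imageStream)
                    watch.Stop()

                    Console.WriteLine("Page " & pageNumber.ToString() & _
                                      ", image " & imageNumber.ToString() & _
                                      ": " & watch.ElapsedMilliseconds.ToString() & " ms, " & _
                                      imageStream.Length.ToString() & " bytes")
                End Using
            Next
        Next
    End Sub
End Module
```

The per-image timings and byte counts make it easier to tell whether the delay scales with image size or is a fixed cost per call.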

Hi,

Well, I checked the Image Extraction feature using Aspose.Pdf.Facades.PdfExtractor and it works fine. Following is my sample code.

    ' open input PDF
    Dim pdfExtractor As New PdfExtractor()
    pdfExtractor.BindPdf("C:\input.pdf")

    ' extract images
    pdfExtractor.ExtractImage()

    ' get all the extracted images
    Do While pdfExtractor.HasNextImage()
        ' read image into memory stream
        Dim memoryStream As New MemoryStream()
        pdfExtractor.GetNextImage(memoryStream)

        ' write to disk, if you like, or use it otherwise.
        Dim fileStream As New FileStream(DateTime.Now.Ticks.ToString() & ".jpg", FileMode.Create)
        memoryStream.WriteTo(fileStream)
        fileStream.Close()
    Loop

There are a few ambiguous points in your shared code as below:

1: Image.RotateFlip(System.Drawing.RotateFlipType.RotateNoneFlipY) 

Why are you using this line in your code? This will flip the image vertically. Could this be the reason for the flipped images in your application?

2: pdfExtractor.GetNextImage(imageStreams(i), ImageFormat.Jpeg) 

There is no overloaded method of PdfExtractor.GetNextImage in V6.1.0 which takes 2 arguments. Please make sure you are using the latest version of Aspose.Pdf for .NET.

If your issue still persists regardless of the above reasons, please share your template Pdf file with us and we will check the issue in further detail.

As for the second approach, we are working on the performance issue, and I have linked your report to the issue already registered in our issue tracking system (Issue Id: PDFNEWNET-29811). We will notify you as soon as we have any news regarding this update.

Thank You & Best Regards,

1: I had to use the rotateflip method to get the image right - otherwise I got the flipped image I write about in this thread. So I still think that there is something wrong with the getnextimage method.


2: That’s weird…how can it be that the code compiles and runs without any errors? I’m sure that I’m using version 6.1.0.0 of Aspose.Pdf.

But since I found another way to convert the images, please forget about the 2 cases above. Using document.Pages to iterate through the pages and images is a much cleaner way of handling the conversion. So please just fix the performance problem; that would make me totally happy regarding this issue :)


andersensijtsma:
1: I had to use the rotateflip method to get the image right - otherwise I got the flipped image I write about in this thread. So I still think that there is something wrong with the getnextimage method.


Hi,

Thanks for your patience. I have tested the scenario again by extracting images from one of our sample PDF documents, and as per my observations, the extracted images are not rotated. I tested with Aspose.Pdf for .NET 6.1.0 in a Visual Studio 2005 project running on Windows XP SP3. Can you please share the source PDF document that you are using so that we can test the scenario at our end? We apologize for the inconvenience.

andersensijtsma:
2: That's weird...how can it be that the code compiles and runs without any errors? I'm sure that I'm using version 6.1.0.0 of Aspose.Pdf.


I have double-checked and verified that the GetNextImage method does not accept two arguments. I have verified this in v6.1.0.
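One way to settle which overloads the referenced assembly actually exposes is to list them with reflection. The following is a minimal sketch rather than an official diagnostic. It assumes the project references Aspose.Pdf and that PdfExtractor lives in the Aspose.Pdf.Facades namespace, as named earlier in this thread. Everything else is standard .NET reflection.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Reflection
Imports Aspose.Pdf.Facades

Module OverloadCheck
    Sub Main()
        Dim extractorType As Type = GetType(PdfExtractor)

        ' Version of the assembly that is actually loaded at run time.
        Console.WriteLine("Assembly version: " & _
                          extractorType.Assembly.GetName().Version.ToString())

        ' Print every public GetNextImage overload with its parameter types.
        For Each candidate As MethodInfo In extractorType.GetMethods()
            If candidate.Name = "GetNextImage" Then
                Dim parameterNames As New List(Of String)
                For Each parameter As ParameterInfo In candidate.GetParameters()
                    parameterNames.Add(parameter.ParameterType.Name)
                Next
                Console.WriteLine("GetNextImage(" & _
                                  String.Join(", ", parameterNames.ToArray()) & ")")
            End If
        Next
    End Sub
End Module
```

If a two-argument overload appears in the output, the project is most likely bound to a different build than the one being compared against. If it does not appear, the call in the original code is probably resolving to some other type or assembly.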

You may also try the code snippet at the following link: Extract Images from the PDF File

In case you still face any problem or have any further queries, please feel free to contact us. We apologize for the inconvenience.

The issues you have found earlier (filed as 29811) have been fixed in this update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.
