Free Support Forum - aspose.com

Very slow access to images in PDF files

I’m working with a set of PDF files, extracting the images included on each page, and processing the images in various ways. However, I find that extracting the images is the main bottleneck in the process. Specifically, in my PDF files, extracting each image takes over five seconds.

Note that I’m not even saving the images to disk; I’m just extracting them to a MemoryStream.

I’ve tried using PdfExtractor, like this:
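PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(pdfPath); // pdfPath is a placeholder for the source PDF; the binding step was not shown in the original post
extractor.StartPage = curpage;
extractor.EndPage = curpage;
extractor.ExtractImage();
extractor.GetNextImage(memoryStream); // this is the slow call
Bitmap extractorImageBitmap = new Bitmap(memoryStream);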

Similarly, I’ve tried taking the image directly out of the Resources, like this:
XImage xImage1 = doc.Pages[curpage].Resources.Images[1];
xImage1.Save(memoryStream, System.Drawing.Imaging.ImageFormat.Tiff);
Bitmap xImageBitmap = new Bitmap(memoryStream);

However, in both cases, the critical routine (GetNextImage() or Save()) takes over five seconds. I’ve also tried various image formats (Tiff, Png, Bmp), and all of them incur a similar delay.

The original resolution of the images is about 3170x5471 (a 600dpi scan of a printed page).

Is there any way to speed up access to the images?

Hi there,

Thanks for your inquiry. You may try the ImagePlacementAbsorber class for searching and extracting images. Hopefully it will help you speed up the process. If the issue persists, please share your sample document here so we can investigate it and provide you with more information.
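
For reference, a minimal sketch of the ImagePlacementAbsorber approach might look like the following. It assumes the Aspose.Pdf for .NET package is referenced; the input path "input.pdf" and the page number are placeholders rather than values from the original post. Whether this is actually faster for a given document is not guaranteed and would need to be measured.

```csharp
using System.IO;
using Aspose.Pdf;

class ImageExtractionSketch
{
    static void Main(string[] args)
    {
        // Placeholder input path and page number; substitute your own.
        string pdfPath = args.Length > 0 ? args[0] : "input.pdf";
        int curpage = 1; // Aspose.Pdf page indices are 1-based

        Document doc = new Document(pdfPath);

        // Visit the page once and collect every image placement found on it.
        ImagePlacementAbsorber absorber = new ImagePlacementAbsorber();
        doc.Pages[curpage].Accept(absorber);

        foreach (ImagePlacement placement in absorber.ImagePlacements)
        {
            using (MemoryStream memoryStream = new MemoryStream())
            {
                // placement.Image is the underlying XImage resource,
                // saved here the same way as in the question above.
                placement.Image.Save(memoryStream, System.Drawing.Imaging.ImageFormat.Png);
                memoryStream.Position = 0;

                // Process the image bytes here (for example, new Bitmap(memoryStream)).
            }
        }
    }
}
```

Each ImagePlacement also carries the image's position and resolution on the page, which may be useful if only some of the images need to be extracted.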

Please feel free to contact us for any further assistance.

Best Regards,