Issues extracting images from PDF to JPEG

Hi,

I am facing issues extracting images from PDF files as JPG:

  1. For the following PDF file, multiple copies of an image are created (equal to the number of pages), and not all of the images are converted.
    file-sample_150kB.pdf (139.4 KB)

  2. The following PDF file contains no images, yet a number of blank JPG images are created (equal to the number of pages in the PDF file).
    programs.pdf (281.9 KB)

Here is the code I am using:
PageCollection pages = document.getPages();
for (int pageCount = 1; pageCount <= pages.size(); pageCount++)
{
    Page page = pages.get_Item(pageCount);
    XImageCollection images = page.getResources().getImages();
    for (int imageCount = 1; imageCount <= images.size(); ++imageCount) {
        XImage xImage = images.get_Item(imageCount);
        // Separate page and image numbers so that entry names stay unique
        zos.putNextEntry(new ZipEntry(fileNameWithOutExtension + "_" + pageCount + "_" + imageCount + ".jpg"));
        xImage.save(zos);
        zos.closeEntry();
    }
}

@MathiasT

Thank you for contacting support.

For the first issue, a ticket with ID PDFNET-47304 has been logged in our issue management system for further investigation and resolution. The second file, however, does contain several images, including inline images. You can verify this with Preflight by listing the page objects, or with the code snippet below:

// Open document
Document pdfDocument = new Document(dataDir + "programs.pdf");
pdfDocument.Save(dataDir + "test.xml" , SaveFormat.MobiXml);

Likewise, the code snippet below saves hundreds of images from this PDF document.

// Open document
Document pdfDocument = new Document(dataDir + "programs.pdf");
int counter = 0;
foreach (Page page in pdfDocument.Pages)
{
    foreach (XImage xImage in page.Resources.Images)
    {
        counter++;
        FileStream outputImage = new FileStream(dataDir + "output_" + counter + ".jpg", FileMode.Create);

        // Save output image
        xImage.Save(outputImage, System.Drawing.Imaging.ImageFormat.Jpeg);
        outputImage.Close();
    }
}

Hi,

Thank you for the reply, but I am using Aspose.PDF for Java. Could you please provide a code snippet for Java? I would also like the ticket to be logged for Aspose.PDF for Java; you have logged it as PDFNET-47304.

Thanks

@MathiasT

Thank you for the information.

Below are the code snippets for your kind reference.

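// Open document and save it as XML to inspect its contents
Document pdfDocument = new Document(dataDir + "programs.pdf");
pdfDocument.save(dataDir + "test.xml", SaveFormat.MobiXml);

// Open document and save every image found in the page resources
Document pdfDocument = new Document(dataDir + "programs.pdf");
int counter = 0;
for (Page page : pdfDocument.getPages())
{
    for (XImage xImage : page.getResources().getImages())
    {
        counter++;
        OutputStream outputImage = new FileOutputStream(dataDir + "output_" + counter + ".jpg");

        // Save output image (note: the format requested here is PNG although the file name ends in .jpg)
        xImage.save(outputImage, ImageType.PNG);
        // Close the stream so the file is flushed and the handle is released
        outputImage.close();
    }
}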

Moreover, the ticket has been logged as PDFJAVA-39010 and linked with this thread. We will let you know once any update is available in this regard.
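
Until the ticket is resolved, one possible workaround for the repeated copies is to skip images whose saved bytes have already been written to the ZIP. The sketch below is only an illustration and uses the standard Java library alone: `UniqueImageZipper`, `sha256Hex` and `zipUnique` are hypothetical names, not part of Aspose.PDF. It assumes that the repeated copies are byte-identical once saved, and that each image has first been saved into a byte array (for example by calling `xImage.save(...)` on a `ByteArrayOutputStream`).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not an Aspose.PDF class.
final class UniqueImageZipper {

    /** Returns the hex-encoded SHA-256 digest of the given bytes. */
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    /**
     * Writes each distinct image once into a ZIP written to {@code out}.
     * pagesOfImages.get(p).get(i) holds the saved bytes of image i+1 on page p+1.
     * Closing the ZIP stream also closes {@code out}.
     *
     * @return the number of entries written
     */
    public static int zipUnique(List<List<byte[]>> pagesOfImages, String baseName,
                                OutputStream out) throws IOException {
        Set<String> seen = new HashSet<>();
        int written = 0;
        try (ZipOutputStream zos = new ZipOutputStream(out)) {
            for (int p = 0; p < pagesOfImages.size(); p++) {
                List<byte[]> images = pagesOfImages.get(p);
                for (int i = 0; i < images.size(); i++) {
                    byte[] bytes = images.get(i);
                    // Skip content that was already written for an earlier page or image
                    if (!seen.add(sha256Hex(bytes))) {
                        continue;
                    }
                    // Underscores keep page 1 / image 11 distinct from page 11 / image 1
                    zos.putNextEntry(new ZipEntry(baseName + "_p" + (p + 1) + "_i" + (i + 1) + ".jpg"));
                    zos.write(bytes);
                    zos.closeEntry();
                    written++;
                }
            }
        }
        return written;
    }
}
```

If the copies differ by even a single byte, they will not be detected as duplicates, so this is only a stopgap and not a fix for the underlying issue.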

@MathiasT

Can you please try using Aspose.PDF for Java 21.11 and let us know in case you still face this issue?