Extract Image from PDF on Linux

Hi,
I am using aspose-total-net 24.1.0 (Python) to extract images from a PDF. I am running the code on Debian GNU/Linux 11 (bullseye) with Python 3.11.10. However, when I run the code below:

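import io
import aspose.pdf as ap

document = ap.Document("embedded_pdf_1.pdf")

# Image 1 in the resources of page 2
xImage = document.pages[2].resources.images[1]
outputImage = io.FileIO("sample.jpg", "w")

# Write the extracted image to the output file, then close it
xImage.save(outputImage)
outputImage.close()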

I get the error below. On Windows, however, the same code works perfectly fine:

xImage.save(outputImage)
RuntimeError: Proxy error(NotImplementedException): Not supported image type

Please help

@deepakbs

It seems that you are encountering a RuntimeError when trying to extract images from a PDF using Aspose.PDF on a Linux environment. The error message indicates that the image type you are trying to save is not supported.

Here are a few steps you can take to troubleshoot and resolve this issue:

  1. Check Image Type: Ensure that the image you are trying to extract is of a supported type. Aspose.PDF supports various image formats, but some formats may not be supported on all platforms. You can check the type of the image before saving it.

  2. Use a Different Output Stream: Instead of using io.FileIO, try using a different method to create the output stream. For example, you can use open() with the 'wb' mode to write binary data:

    with open("sample.jpg", "wb") as outputImage:
        xImage.save(outputImage)
    
  3. Update Aspose.PDF: Ensure that you are using the latest version of Aspose.PDF for Python. Sometimes, issues are resolved in newer releases, so updating might help.

  4. Check for Dependencies: Make sure that all necessary dependencies for Aspose.PDF are installed and properly configured on your Linux system.

  5. Consult Documentation: Review the Aspose.PDF documentation for any platform-specific notes regarding image extraction. There may be additional requirements or limitations when running on Linux.
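To see exactly which images fail and with what message, the save call from step 2 can be wrapped so that the error is captured instead of aborting the script. This is only a minimal sketch: `try_save_image` is a hypothetical helper name, and the only Aspose call it relies on is the `save(stream)` method already used in the question.

```python
import os


def try_save_image(image, path):
    """Try to save one extracted image to `path`.

    Returns (True, "") on success, or (False, error_message) if the
    library raises RuntimeError, as it does in the traceback above.
    """
    try:
        with open(path, "wb") as out:
            image.save(out)
    except RuntimeError as exc:
        # Remove the empty or partial file left behind by the failed save.
        if os.path.exists(path):
            os.remove(path)
        return False, str(exc)
    return True, ""
```

With the objects from the original post, it could be called as `ok, err = try_save_image(document.pages[2].resources.images[1], "sample.jpg")`. If `err` still reads "Not supported image type", the output stream is not the cause.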

If the issue persists after trying these steps, consider reaching out to the Aspose support team or checking the Aspose forums for similar issues encountered by other users.

Feel free to ask if you need further assistance!

Sources:
[1]: Extract Images From PDF File | Aspose.PDF for .NET API Reference

None of the above suggestions worked.

@deepakbs

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFPYTHON-308

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Hi, we have purchased the aspose-total-net license. Do we get support as part of that license?

Thanks
Deepak

@deepakbs

That license comes with normal support. You need to subscribe to paid support separately in order to use it.

Any updates on this?

@deepakbs

The ticket was logged recently and has not been investigated yet. We will prioritize it on a first-come, first-served basis and will inform you as soon as we have updates regarding its resolution. Please spare us some time.

We are sorry for the inconvenience.

Hi @asad.ali - Any chances of a reply this week?

@deepakbs

If possible, can you please provide a sample PDF with which you are facing this error in your environment? It will help us reproduce the issue and investigate it accordingly.

example.pdf (1.3 MB)

I have attached the sample PDF.

@deepakbs

Thanks for sharing the sample file. We have updated the ticket information and will let you know once we make some progress regarding its resolution.

Hi @asad.ali - any further updates?

@deepakbs

No updates are available yet. We are investigating and will let you know as soon as we have results to share. Please spare us some time.

Hello, I have the same problem, but with the .NET NuGet package "Aspose.Total" version 24.11.0. My executable runs in a Docker container on a Linux host. Calling the method "Aspose.Pdf.XImage.Save(Stream)" throws the exception "Not supported image type". When the same function processes the same PDF document on Windows, everything works fine.

@SergiiBykov @deepakbs

We have investigated this issue. The reason for the error is that the libgdiplus package is not installed in the environment. For our product to work in a non-Windows environment, we recommend that customers install:

  • libgdiplus package
  • package with Microsoft compatible fonts: ttf-mscorefonts-installer. (e.g. sudo apt-get install ttf-mscorefonts-installer)

These fonts should be placed in the "/usr/share/fonts/truetype/msttcorefonts" directory, as Aspose.PDF for Python via .NET scans this folder on Linux-like operating systems.
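Whether both prerequisites are visible from the Python process can be checked with the standard library alone. The following is a rough sketch, not an official Aspose diagnostic: `check_environment` is a hypothetical helper, the font directory is the one named above, and a positive result only shows that the dynamic loader can locate libgdiplus, not that Aspose.PDF loads it successfully.

```python
import ctypes.util
import os

# Font directory named in the reply above.
FONT_DIR = "/usr/share/fonts/truetype/msttcorefonts"


def check_environment(font_dir=FONT_DIR):
    """Report whether libgdiplus and the font directory can be found."""
    # find_library returns a library name such as "libgdiplus.so.0",
    # or None if the loader cannot locate it.
    library = ctypes.util.find_library("gdiplus")
    dir_exists = os.path.isdir(font_dir)
    fonts = []
    if dir_exists:
        fonts = [f for f in os.listdir(font_dir) if f.lower().endswith(".ttf")]
    return {
        "libgdiplus": library,
        "font_dir_exists": dir_exists,
        "font_count": len(fonts),
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Running this inside the same container or virtual environment as the failing script matters, because packages installed on the host are not visible there.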

Hi @asad.ali ,
I have already installed the following libraries:

apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y wget apt-transport-https software-properties-common \
    build-essential libssl-dev zlib1g-dev libncurses5-dev libncursesw5-dev \
    libreadline-dev libsqlite3-dev libgdbm-dev libdb5.3-dev libbz2-dev \
    libexpat1-dev liblzma-dev tk-dev libffi-dev curl gnupg libgdiplus libc6-dev

wget http://ftp.uk.debian.org/debian/pool/contrib/m/msttcorefonts/ttf-mscorefonts-installer_3.8.1_all.deb && \
    dpkg -i ttf-mscorefonts-installer_3.8.1_all.deb || apt-get install -f -y

But I am still getting the same error.

Thanks
Deepak

@deepakbs

Thanks for sharing the feedback. We will further investigate and let you know once we have some findings.