Aspose.Words conversion error: unsupported format

Converting a PDF to DOCX/MD raises an error. Error message: RuntimeError: Proxy error(FileLoadException): The file cannot be opened. It might have unsupported format or be corrupted. ---> InvalidCastException: Unable to cast object of type '_GCg' to type '_z1'.
Code:

import aspose.words as aw

document = aw.Document("ME19-4716_P8.pdf")
document.save("ME19-4716_P8.docx")

ME19-4716_P8.pdf (277.7 KB)

This is the document.

@David_Matin

The error message you are encountering, RuntimeError: Proxy error(FileLoadException): The file cannot be opened. It might have unsupported format or be corrupted. ---> InvalidCastException: Unable to cast object of type '_GCg' to type '_z1', suggests that the PDF file you are trying to convert may not be supported or could be corrupted.

To troubleshoot this issue, consider the following steps:

  1. Check the PDF File: Ensure that the PDF file (ME19-4716_P8.pdf) is not corrupted. You can try opening it with a standard PDF viewer to confirm that it displays correctly.

  2. Use Load Options: When loading the PDF document, you can specify PdfLoadOptions to control how the PDF is processed. This can help in cases where the PDF has specific characteristics that need to be handled differently. For example, you can set properties like page_count and page_index, or password if the document is password-protected.

    Here’s a code snippet demonstrating how to use PdfLoadOptions:

    import aspose.words as aw
    
    # Specify load options (PdfLoadOptions lives in the aspose.words.loading module)
    load_options = aw.loading.PdfLoadOptions()
    # Set any necessary options here, e.g., load a specific page range
    # load_options.page_count = 1  # Load only the first page
    
    # Load the document
    document = aw.Document("ME19-4716_P8.pdf", load_options)
    # Save to DOCX format
    document.save("ME19-4716_P8.docx")
    
  3. Supported Formats: Aspose.Words can load PDF files, but there may be limitations based on the content of the PDF, so a file that opens fine in a viewer can still fail to load.

  4. Update Aspose.Words: If you are using an older version of Aspose.Words, consider updating to the latest version, as newer versions may have improved support for PDF files and bug fixes.
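As a rough illustration of step 1, the check below looks for the two markers a well-formed PDF normally carries: a `%PDF-` header near the start and an `%%EOF` marker near the end. It is only a sketch using the Python standard library; the helper names `check_pdf_bytes` and `check_pdf_file` are made up for this example, and passing the check does not prove that Aspose.Words can load the file.

```python
from pathlib import Path


def check_pdf_bytes(data):
    """Return (has_header, has_eof_marker) for raw file contents.

    Hypothetical helper: it only looks for the usual PDF markers and
    does not parse or validate the PDF structure itself.
    """
    has_header = b"%PDF-" in data[:1024]        # header should be near the start
    has_eof_marker = b"%%EOF" in data[-1024:]   # end-of-file marker near the end
    return has_header, has_eof_marker


def check_pdf_file(path):
    """Read the file at `path` and run the marker check on its bytes."""
    return check_pdf_bytes(Path(path).read_bytes())


if __name__ == "__main__":
    pdf_path = Path("ME19-4716_P8.pdf")  # file name taken from the thread
    if pdf_path.exists():
        header_ok, eof_ok = check_pdf_file(pdf_path)
        print(f"header found: {header_ok}, EOF marker found: {eof_ok}")
    else:
        print(f"{pdf_path} not found in the current directory")
```

If both values are True and the file opens in a regular viewer, the file is probably intact and the failure is more likely on the library side, which is worth mentioning when reporting the problem.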

If the issue persists after trying these steps, you may want to reach out to Aspose support with details about the PDF file and the specific code you are using for further assistance.


@David_Matin Could you please attach the problematic input document here for testing? We will check the issue and provide you more information.

ME19-4716_P8.pdf (277.7 KB)

@alexey.noskov No problem, here it is.

@David_Matin
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-27530

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

The issues you have found earlier (filed as WORDSNET-27530) have been fixed in the Aspose.Words for .NET 24.11 update, which is also available on NuGet.
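To see whether a Python environment already has a release at or above 24.11, something like the following may help. It is a sketch that assumes the library was installed from PyPI under the package name `aspose-words` and that the Python package uses the same version numbers as the .NET release mentioned above; the function names are illustrative only.

```python
import re
from importlib import metadata

# Release named in this thread as containing the fix for WORDSNET-27530.
# Assumption: the Python package follows the same version numbering.
FIXED_IN = (24, 11)


def parse_major_minor(version_text):
    """Extract (major, minor) from a version string such as '24.11.0'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_text!r}")
    return int(match.group(1)), int(match.group(2))


def includes_fix(version_text):
    """True if the given version is 24.11 or later."""
    return parse_major_minor(version_text) >= FIXED_IN


def installed_version(package="aspose-words"):
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_version()
    if current is None:
        print("aspose-words is not installed in this environment")
    else:
        print(f"aspose-words {current}, includes fix: {includes_fix(current)}")
```

If the reported version is older than 24.11, upgrading with `pip install --upgrade aspose-words` and retrying the conversion is the obvious next step.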