Hi,
I am trying to convert PDF to DOCX using Aspose.Words for Java. I am using the latest version currently available, which is 25.9.
But when running this code (to convert a PDF to DOCX), I get this error:
com.aspose.words.UnsupportedFileFormatException: Pdf format is not supported on this platform. Use .NET Standard or .NET 4.6.1 or greater version of Aspose.Words for loading Pdf documents.
The code snippet is as follows:
try (ByteArrayInputStream bais = new ByteArrayInputStream(bytes)) {
ByteArrayOutputStream output = new ByteArrayOutputStream(8096);
PdfLoadOptions load = new PdfLoadOptions();
load.setSkipPdfImages(true);
com.aspose.words.Document doc = new Document(bais, load);
OoxmlSaveOptions saveOptions = new OoxmlSaveOptions(SaveFormat.DOCX);
saveOptions.setMemoryOptimization(true);
doc.save(output, saveOptions);
return output.toByteArray();
} catch (Exception e) {
log.error("Aspose (words) unable to convert PDF to DOCX.", e);
}
Why is it not supported in Java? And why is it even there if it is not supported yet?
Thanks
@brissonp
Summary
Aspose.Words for Java (including version 25.9) cannot load PDF files. When you try to create a Document from a PDF stream you will get the exception:
com.aspose.words.UnsupportedFileFormatException: Pdf format is not supported on this platform. Use .NET Standard or .NET 4.6.1 or greater version of Aspose.Words for loading Pdf documents.
Why it happens
- PDF import is implemented in the .NET versions of Aspose.Words (.NET Framework 4.6.1+, .NET Core/.NET 5+, and .NET Standard) and, according to Aspose staff, in the Python version. The Java library does not contain the PDF‑parsing engine, so the API is present for compatibility but deliberately throws the above exception on Java.
- The class PdfLoadOptions is part of the public API to keep the .NET and Java APIs consistent, but on Java it is a placeholder: it does not provide any functionality.
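Because the Java library rejects PDF input at load time, a caller can check for PDF data up front and route it to a different converter instead of catching the exception. The following is only a minimal sketch, assuming the PDF header "%PDF-" sits at the very start of the data; PdfSniffer and looksLikePdf are hypothetical names, not part of any Aspose API.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not an Aspose class): detects PDF input by its "%PDF-" header. */
final class PdfSniffer {
    // Assumption: the header appears at offset 0, as in a well-formed PDF file.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfSniffer() {}

    /** Returns true when the data begins with the PDF header bytes. */
    static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With a check like this, bytes that look like a PDF can be sent to a PDF-capable converter, and only other formats are passed to Aspose.Words for Java.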
What to do next
- Use a .NET implementation – If you can run .NET code (e.g., via a micro‑service, Azure Function, or a .NET‑based backend), use Aspose.Words for .NET to load the PDF and save it as DOCX.
- Use Aspose.PDF for Java – Convert the PDF to DOCX with the PDF library, which does support PDF‑to‑DOCX conversion on Java:
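// Load the PDF with Aspose.PDF for Java (not Aspose.Words).
com.aspose.pdf.Document pdfDoc = new com.aspose.pdf.Document(new ByteArrayInputStream(pdfBytes));
ByteArrayOutputStream out = new ByteArrayOutputStream();
// In Aspose.PDF for Java the constant is spelled SaveFormat.DocX.
pdfDoc.save(out, com.aspose.pdf.SaveFormat.DocX);
byte[] docxBytes = out.toByteArray();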
- Combine both libraries – Load the PDF with Aspose.PDF for Java, then, if you need advanced Word‑processing features, load the resulting DOCX into Aspose.Words for Java for further manipulation.
- Check the product roadmap – Aspose has indicated that PDF import may be added to the Java library in a future release. Keep an eye on the official release notes or contact Aspose support for the latest status.
So, the feature is not missing by mistake; it is simply not implemented for the Java platform yet. Use one of the alternatives above until PDF import becomes available in Aspose.Words for Java.
I could not fully verify this explanation against the retrieved sources. Please double-check the behaviour in your environment and confirm the exact SDK/version.
@brissonp Aspose.Words for Java does not support loading PDF documents. This feature is supported only in the .NET and Python versions of Aspose.Words.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSJAVA-3207
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
As a workaround, you can use the Aspose.PDF for Java library to convert PDF to DOCX and then process the result with Aspose.Words for Java:
https://docs.aspose.com/pdf/java/convert-pdf-to-word/#convert-pdf-to-doc
https://docs.aspose.com/pdf/java/convert-pdf-to-word/#how-to-convert-pdf-to-docx
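The two-step workaround above (convert with one library, then continue with another) can be wired together roughly as follows. This is only a sketch of the structure: PdfToDocxPipeline and Step are hypothetical names, and the actual Aspose calls are deliberately left to the caller so the example depends only on the JDK.

```java
import java.util.Objects;

/** Hypothetical two-stage pipeline: PDF to DOCX first, then DOCX post-processing. */
final class PdfToDocxPipeline {

    /** A conversion step that is allowed to throw a checked exception. */
    @FunctionalInterface
    interface Step {
        byte[] apply(byte[] input) throws Exception;
    }

    private final Step pdfToDocx;   // e.g. backed by a PDF-capable converter
    private final Step postProcess; // e.g. backed by a Word-processing library

    PdfToDocxPipeline(Step pdfToDocx, Step postProcess) {
        this.pdfToDocx = Objects.requireNonNull(pdfToDocx, "pdfToDocx");
        this.postProcess = Objects.requireNonNull(postProcess, "postProcess");
    }

    /** Runs both stages in order and returns the bytes produced by the second stage. */
    byte[] convert(byte[] pdfBytes) throws Exception {
        byte[] docx = pdfToDocx.apply(pdfBytes);
        return postProcess.apply(docx);
    }
}
```

In practice the first step would wrap the Aspose.PDF for Java conversion described in the linked documentation, and the second would load the resulting DOCX bytes into an Aspose.Words for Java Document for further manipulation. Keeping the steps separate also makes it easy to drop the second stage when no extra processing is needed.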