Free Support Forum - aspose.com

How to keep PDF text when converting to DOCX

I am converting a PDF to DOCX with the code below.
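import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

public static void convertToDocx(File pdfFile, File docxFile) throws Exception {
    Document doc;

    // Load the source PDF from a stream.
    try (FileInputStream fis = new FileInputStream(pdfFile)) {
        doc = new Document(fis);
    }

    // Save as DOCX using the Flow recognition mode.
    DocSaveOptions saveOptions = new DocSaveOptions();
    saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
    saveOptions.setMode(DocSaveOptions.RecognitionMode.Flow);

    try (FileOutputStream fos = new FileOutputStream(docxFile)) {
        doc.save(fos, saveOptions);
    }

    // Treat an empty output file as a failed conversion.
    if (docxFile.length() == 0) {
        throw new Exception("Conversion failed: output file is empty");
    }
}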

The original PDF has an image that contains text.
origin.pdf (4.4 MB)
But the converted DOCX contains only the image.

How can I keep the text from the PDF?

@allganize

Could you please also share your environment details, e.g. API version, OS name and version, and JDK version? We tested the scenario in our environment with version 21.5 of the API and noticed that the code did not produce any output and kept running for more than 20 minutes.
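
In case it helps with collecting these details, here is a minimal sketch that prints the OS and JDK information using standard Java system properties. The names `EnvReport`, `describe`, and `libraryVersion` are hypothetical and only for illustration. The library version lookup assumes the Aspose.PDF JAR declares an `Implementation-Version` in its manifest, which may not be the case; if it prints "unknown", the version in your `pom.xml` or the JAR file name is the more reliable source.

```java
public class EnvReport {

    // Hypothetical helper: reads Implementation-Version from the manifest of the
    // JAR that contains the given class. Assumption: the JAR declares that entry.
    static String libraryVersion(String className) {
        try {
            // Load without running static initializers; we only need the package info.
            Class<?> cls = Class.forName(className, false, EnvReport.class.getClassLoader());
            Package pkg = cls.getPackage();
            String version = (pkg == null) ? null : pkg.getImplementationVersion();
            return (version == null) ? "unknown (not declared in manifest)" : version;
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    // Builds a short, copy-pasteable report of the details requested above.
    static String describe() {
        return String.join(System.lineSeparator(),
                "OS: " + System.getProperty("os.name") + " " + System.getProperty("os.version")
                        + " (" + System.getProperty("os.arch") + ")",
                "JDK: " + System.getProperty("java.version")
                        + " (" + System.getProperty("java.vendor") + ")",
                "Aspose.PDF: " + libraryVersion("com.aspose.pdf.Document"));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running `EnvReport` in the same environment (same JDK and classpath) as the conversion code and pasting the output into the thread should cover the details asked for.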