Free Support Forum - aspose.com

ASPOSE Java - PDF to DOCX

Hi,
I tried the sample code ConvertPDFToDOCOrDOCXFormat.java from the Aspose GitHub repository.

If the PDF contains a scanned image, is it possible to extract the text from that image as-is and make it available as editable text in the resulting DOCX?

@dipsarkar,

You can use Aspose.PDF together with an OCR application that supports the hOCR standard. Google's free Tesseract OCR engine is one option: use it to recognize the text in the scanned image, make the PDF searchable, and then save the PDF in Word format. Please refer to this help topic: Converting non searchable PDF to searchable PDF document
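
For the OCR step, here is a minimal sketch of calling the Tesseract command-line tool from Java to get hOCR output for one page image. It is only an illustration, not code from the help topic: the class name HocrSketch and its methods are hypothetical, and it assumes the tesseract executable is on the PATH and that your Tesseract version writes its result to a file ending in .hocr (recent versions do).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper; not part of Aspose.PDF or Tesseract.
final class HocrSketch {

    // Builds the command line: tesseract <image> <outputBase> -l <language> hocr
    // The trailing "hocr" config asks Tesseract for hOCR output instead of plain text.
    static List<String> buildCommand(Path image, Path outputBase, String language) {
        return List.of("tesseract", image.toString(), outputBase.toString(),
                "-l", language, "hocr");
    }

    // Runs Tesseract and returns the hOCR markup as a string.
    // Assumption: the result is written to <outputBase>.hocr.
    static String runOcr(Path image, Path outputBase, String language)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(image, outputBase, language))
                .inheritIO()   // show Tesseract's own messages on the console
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return Files.readString(Path.of(outputBase + ".hocr"), StandardCharsets.UTF_8);
    }
}
```

The hOCR string returned by runOcr is what you would hand back to Aspose.PDF in the conversion callback described in the help topic above; once the PDF has a text layer, saving it as DOCX should give you editable text. Please check the exact callback signature against the documentation for your Aspose.PDF for Java version.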