Conversion from PDF to DOCX with Aspose.PDF results in a distorted file

Dear Aspose Team,

We use Aspose.PDF to convert incoming PDF files to DOCX before we process them in our system. The process is quite simple: we load the PDF into a Document and then call the Save method with the output path and saveOptions.

Please find attached the PDF file and the distorted DOCX file:

PPT_PDFmemoQ確認用ダミーデータ.pdf (303.0 KB)

temp.docx (211.9 KB)

We observed that if we set DocSaveOptions.RecognitionMode.EnhancedFlow in the SaveOptions, the result is the distorted file attached above. If we use DocSaveOptions.RecognitionMode.Flow instead, the result is OK: the conversion is not distorted and looks similar to the original PDF file. We switched to DocSaveOptions.RecognitionMode.EnhancedFlow 6 months ago because we observed that this mode is more precise, and it also resolved a lot of issues in our system. Now, however, we have a case where DocSaveOptions.RecognitionMode.EnhancedFlow shows some weaknesses.
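
A minimal sketch of the conversion described above, assuming Aspose.PDF for .NET. The file paths are placeholders, and the DocX format setting is an assumption about a typical setup, not a copy of the production code:

```csharp
using Aspose.Pdf;

// Placeholder paths (assumptions for illustration only).
const string inputPath = "input.pdf";
const string outputPath = "output.docx";

// Load the PDF into a Document.
using var document = new Document(inputPath);

var saveOptions = new DocSaveOptions
{
    // Assumed: DOCX output rather than the older DOC format.
    Format = DocSaveOptions.DocFormat.DocX,

    // EnhancedFlow is the mode that yields the distorted result for the
    // attached file; RecognitionMode.Flow yields an undistorted one.
    Mode = DocSaveOptions.RecognitionMode.EnhancedFlow
};

// Save with the output path and the save options.
document.Save(outputPath, saveOptions);
```

Changing only the Mode value between EnhancedFlow and Flow while keeping everything else identical should be enough to compare the two results on the attached PDF.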

We could reproduce the issue with version 23.11.1 as well as with the latest version of Aspose.PDF.

Is there a setting that we are missing, or is this a real issue?

We look forward to your findings and answers.

Kind regards,
Varga Matild

@vargamatild

We have logged an investigation ticket, PDFNET-57567, in our issue tracking system to analyze this case further. We will look into its details and let you know as soon as it is resolved. Please be patient and give us some time.

Thank you @asad.ali !

Hi @asad.ali !

Any updates on this issue?

Thanks,
Varga Matild

@matild

We are afraid that the ticket has not been reviewed yet. It will be prioritized on a first-come, first-served basis, and we will let you know as soon as we make some progress towards its resolution. Please give us some time.