PDF to docx conversion generates corrupted text

Dear Team,
I’m using a trial license for Aspose.PDF and have converted some files for testing. Many of them contain corrupted text after conversion from PDF to DOCX. In the corrupted files, the affected text uses font names such as DCLTQB+OTS derived font, BKISEM+Microsoft Sans Serif, BWSGWF+OTS derived font, KESGUW+OTS derived font, and GKPVAH+OTS derived font. Interestingly, when I set the font to Arial, Calibri, or another valid font, the text reads correctly, as in the source PDF. I’m attaching a PDF file and the converted DOCX file. It would be great if you could look into this and suggest a way to avoid the corrupted text during conversion.
I used both the Aspose PDF online conversion portal (Convert PDF | Online and Free) and the Java SDK, and got the same corrupted output from both.

Env: Apple MacBook Pro M1

Thanks,
Bhavesh
medium1.pdf (3.1 MB)

medium1.docx (656.8 KB)
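
The font names reported above follow the PDF convention for embedded font subsets: a tag of exactly six uppercase letters, a plus sign, and then the base font name. The sketch below shows one way to detect that tag and recover the base name, which may help when deciding which installed font to substitute. `SubsetFontNames` and its methods are hypothetical helpers written for illustration; they are not part of Aspose.PDF, and this does not by itself explain the corrupted text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not an Aspose.PDF API.
class SubsetFontNames {
    // A subset tag is exactly six uppercase ASCII letters followed by '+'.
    private static final Pattern SUBSET_TAG = Pattern.compile("^[A-Z]{6}\\+(.+)$");

    // True when the name carries a subset tag such as "DCLTQB+".
    static boolean isSubset(String fontName) {
        return SUBSET_TAG.matcher(fontName).matches();
    }

    // Returns the name without the subset tag, or the input unchanged if there is none.
    static String baseName(String fontName) {
        Matcher m = SUBSET_TAG.matcher(fontName);
        return m.matches() ? m.group(1) : fontName;
    }

    public static void main(String[] args) {
        // Font names quoted in the report above, plus one untagged name for contrast.
        String[] reported = {
            "DCLTQB+OTS derived font",
            "BKISEM+Microsoft Sans Serif",
            "Arial"
        };
        for (String name : reported) {
            System.out.println(name + " | subset: " + isSubset(name)
                    + " | base name: " + baseName(name));
        }
    }
}
```

A base name such as "OTS derived font" does not correspond to a font that would normally be installed on a machine, which would be consistent with the text only reading correctly after switching to Arial or Calibri; that link is an assumption, not something confirmed in this thread.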

@bhaveshkumar

Could you please confirm whether msttcorefonts or a similar package is installed on your system? Also, please share the code snippet you have been using to perform the conversion. We will log an investigation ticket and share its ID with you.

@asad.ali I’m attaching the code snippet I used to convert PDF to DOCX in Java.
PDFToDoc.java.zip (1.2 KB)

Also, I’ve verified that the Microsoft core fonts are installed on my MacBook Pro by default. I also have Microsoft Office installed, so I assume all the MS fonts are present.

Can you reproduce the issue on your end?
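
Rather than assuming the fonts are present, one option is to list the font families the Java runtime can see and compare them with the ones the document needs. The sketch below uses the standard `java.awt.GraphicsEnvironment` API; the class `InstalledFontCheck`, the method `missing`, and the list of wanted fonts are illustrative choices. Whether Aspose.PDF resolves fonts from the same locations as the JVM is an assumption not confirmed in this thread.

```java
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not an Aspose.PDF API.
class InstalledFontCheck {
    // Returns the wanted family names that are absent from the installed ones,
    // comparing case-insensitively and preserving the order of 'wanted'.
    static List<String> missing(Collection<String> installed, List<String> wanted) {
        Set<String> lower = new HashSet<>();
        for (String family : installed) {
            lower.add(family.toLowerCase(Locale.ROOT));
        }
        List<String> result = new ArrayList<>();
        for (String family : wanted) {
            if (!lower.contains(family.toLowerCase(Locale.ROOT))) {
                result.add(family);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Font families visible to this JVM on the current machine.
        String[] families = GraphicsEnvironment
                .getLocalGraphicsEnvironment()
                .getAvailableFontFamilyNames();

        // Fonts mentioned in this thread; adjust as needed.
        List<String> wanted = Arrays.asList("Arial", "Calibri", "Microsoft Sans Serif");

        System.out.println("Font families visible to the JVM: " + families.length);
        System.out.println("Missing: " + missing(Arrays.asList(families), wanted));
    }
}
```

Running this on both the machine that performs the conversion and the machine that opens the DOCX may show whether the two environments differ in the fonts they have available.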

@bhaveshkumar

We are checking it and will get back to you shortly.

@bhaveshkumar

Can you please check the attached output DOCX file, which was generated in our environment using version 24.4 of the API? If you see any issues in it, please let us know and share screenshots.
medium1.docx (655.5 KB)

@asad.ali
I still have the same issue with the attached docx file. Attaching the screenshot.
Screenshot 2024-04-19 at 9.29.07 PM.png (92.8 KB)

@bhaveshkumar

This is how the output file looks in our environment, i.e., Windows:
image.png (105.8 KB)

Nevertheless, we have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-43850

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.