PDF to DOCX fonts contain unwanted prefixes (e.g. NJNRFM+TimesNewRomanPS-BoldMT and JUORIA+Helvetica)

tomtothay · June 1, 2021, 9:49am

PDF to DOCX conversions produce unusual font names with six-character prefixes, e.g. NJNRFM+TimesNewRomanPS-BoldMT and JUORIA+Helvetica.

The prefixes in the font names are visible when viewing or editing the converted document in Word. The Word document content looks fine, but the prefixes are unwanted because they cause problems when the same document is later converted to HTML (using other third-party libraries).
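As a stopgap, the prefix could be stripped in post-processing. The pattern looks like the PDF font-subset tag (exactly six uppercase letters followed by a plus sign), so a regular expression should catch it. This is only a sketch in Python, not anything from the Aspose API: the function names strip_subset_tag and strip_subset_tags_in_text are made up for illustration, and it assumes the font names are reachable as plain strings (for example in the DOCX XML or the generated HTML).

```python
import re

# Assumption: the prefix is a PDF font-subset tag, i.e. exactly six
# uppercase ASCII letters followed by "+", at the start of the font name.
_SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")

# Same tag anywhere inside a larger string (e.g. markup). The lookahead
# requires a letter after "+" to reduce accidental matches.
_SUBSET_TAG_IN_TEXT = re.compile(r"\b[A-Z]{6}\+(?=[A-Za-z])")


def strip_subset_tag(font_name: str) -> str:
    """Return font_name without a leading subset tag, if it has one."""
    return _SUBSET_TAG.sub("", font_name)


def strip_subset_tags_in_text(text: str) -> str:
    """Remove subset tags from every font name found in a block of text."""
    return _SUBSET_TAG_IN_TEXT.sub("", text)


if __name__ == "__main__":
    print(strip_subset_tag("NJNRFM+TimesNewRomanPS-BoldMT"))  # TimesNewRomanPS-BoldMT
    print(strip_subset_tag("JUORIA+Helvetica"))  # Helvetica
```

Names without a tag pass through unchanged. The text variant can in principle hit unrelated content that happens to match six capitals and a plus sign, so it is safer to apply it only to font-name attributes.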

Steps:

This issue can be reproduced using the latest version of aspose.pdf.dll (.NET) and also using the online test converter website: Convert PDF | Online and Free
This example CV (found online) exhibits the issue, and various other sample PDF documents I have tested show the same problem.
After conversion, click on various parts of the document and notice that the font name has an unusual prefix. For example, in the sample doc linked above, the line of text "An example of a good CV" has the font NJNRFM+TimesNewRomanPS-BoldMT.

An older, similar issue was reportedly fixed.

Any guidance or help is greatly appreciated.

asad.ali · June 1, 2021, 7:29pm

@tomtothay

The ticket that was logged to address the similar issue is PDFNET-46242, and it has not yet been resolved. We have linked it to this forum thread as well, so you will be notified once it is fixed. Please be patient and allow us some time.

We are sorry for the inconvenience.