Missing text/Strange text placement when converting pdf document to html

See the attached PDF and the HTML file produced by converting it with Aspose.Pdf version 10.2.0. A lot of the text appears to be misplaced or missing.



Hi Tor,


Thanks for your inquiry. I have tested the scenario with Aspose.Pdf for Java 10.2.0 and was unable to notice any issue; please find the sample output HTML attached. Can you please double-check and point us to the exact location of the missing text?

We are sorry for the inconvenience caused.

Best Regards,

Hi,


Sorry about that, but it appears that the issue occurs when converting PDF to DOCX, not HTML. Can you verify?

Tor Henning,

Thanks for sharing the details.

I have tested the scenario and I am able to reproduce the problem: the contents from page 3 onward are missing or replaced with strange characters. I have logged it in our issue tracking system as PDFNEWJAVA-34809 so that it can be corrected. We will investigate the issue in detail and keep you updated on the status of a fix.

We apologize for the inconvenience.
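
For anyone who needs to pinpoint which pages are affected before reporting a similar problem, here is a minimal sketch in plain Java. It assumes you have already obtained the text of each page of the converted document as a list of strings (how you extract it is up to you). The class name `GarbledTextCheck`, its methods, and the threshold are hypothetical and are not part of Aspose.PDF. It flags pages that are empty or contain a high share of replacement, private-use, or control characters, which is one rough way to spot "missing or strange" text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an Aspose.PDF API): flags pages whose extracted
// text is empty or looks garbled.
final class GarbledTextCheck {

    // True for code points that rarely appear in correctly converted text.
    static boolean isSuspicious(int cp) {
        if (cp == 0xFFFD) return true;                    // Unicode replacement character
        if (cp >= 0xE000 && cp <= 0xF8FF) return true;    // BMP private use area
        if (cp == '\n' || cp == '\r' || cp == '\t') return false; // ordinary whitespace
        return Character.isISOControl(cp);                // other control characters
    }

    // Share of suspicious code points in the text (0.0 for null or empty input).
    static double suspiciousRatio(String text) {
        if (text == null || text.isEmpty()) return 0.0;
        long total = text.codePoints().count();
        long bad = text.codePoints().filter(GarbledTextCheck::isSuspicious).count();
        return (double) bad / total;
    }

    // Returns the 1-based numbers of pages that are blank or whose
    // suspicious ratio exceeds the given threshold.
    static List<Integer> flagPages(List<String> pageTexts, double threshold) {
        List<Integer> flagged = new ArrayList<>();
        for (int i = 0; i < pageTexts.size(); i++) {
            String t = pageTexts.get(i);
            boolean blank = (t == null) || t.trim().isEmpty();
            if (blank || suspiciousRatio(t) > threshold) {
                flagged.add(i + 1);
            }
        }
        return flagged;
    }
}
```

Calling `GarbledTextCheck.flagPages(pageTexts, 0.1)` returns the page numbers worth checking by hand. The 0.1 threshold is an arbitrary starting point, and a page that is legitimately blank in the source PDF will also be flagged, so treat the result as a hint rather than a verdict.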

The issues you have found earlier (filed as PDFJAVA-34809) have been fixed in Aspose.PDF for Java 21.5.