Missing text/Strange text placement when converting pdf document to html

petteraas · April 22, 2015, 8:08am

See the attached PDF and the HTML file produced by converting it with Aspose.Pdf version 10.2.0. A lot of the text appears to be misplaced or missing.

tilal.ahmad · April 23, 2015, 1:19am

Hi Tor,

Thanks for your inquiry. I have tested the scenario with Aspose.Pdf for Java 10.2.0 and was unable to notice any issue; please find the attached sample output HTML. Can you please double-check and point us to the exact location of the missing text?

We are sorry for the inconvenience caused.

Best Regards,

petteraas · April 23, 2015, 2:44am

Hi,

Sorry about that, but it appears that the issue is with the PDF to DOCX conversion, not PDF to HTML. Can you verify?

codewarior · April 23, 2015, 6:02am

Tor Henning,

Thanks for sharing the details.

I have tested the scenario and I am able to reproduce the problem: the contents from page 3 onward are missing or replaced with strange characters. I have logged it in our issue tracking system as PDFNEWJAVA-34809. We will investigate this issue in detail and will keep you updated on the status of a correction.

We apologize for the inconvenience.

aspose.notifier · May 20, 2021, 7:00pm

The issues you have found earlier (filed as PDFJAVA-34809) have been fixed in Aspose.PDF for Java 21.5.
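
For readers checking whether their installed release already includes this fix, a minimal sketch of a dotted-version comparison against 21.5 might look like the following. It uses only the Java standard library; the class and method names (`FixVersionCheck`, `compareVersions`, `hasFix`) are hypothetical helpers made up for illustration and are not part of Aspose.PDF. It also assumes you already know your installed version string, since looking it up is not shown here.

```java
public class FixVersionCheck {
    // Release named in this thread as containing the fix.
    static final String FIXED_IN = "21.5";

    // Hypothetical helper: compares dotted numeric versions component by
    // component; missing components are treated as 0 (so "21.5" == "21.5.0").
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // Hypothetical helper: true when the installed version is 21.5 or later.
    static boolean hasFix(String installedVersion) {
        return compareVersions(installedVersion, FIXED_IN) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(hasFix("10.2.0")); // version from the original report: false
        System.out.println(hasFix("21.5"));   // release with the fix: true
    }
}
```

Comparing components numerically rather than as strings avoids ordering mistakes such as treating "9.0" as newer than "10.2.0".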