Extracts sequence of '\0' characters instead of expected text from PDF

bgrebil · May 23, 2019, 5:05pm

Using Aspose.PDF for .NET 19.1.0

In the attached example, we extract all the text on the page using the TextFragmentAbsorber class. However, instead of the text “File What? Too confusing.” we get a sequence of null characters ('\0'). All other text fragments are extracted correctly.

I suspect it has something to do with the font this text is set in, as it differs from the font used for all the other text on the page.
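
To check the font suspicion, a small helper along these lines could flag fragments whose text consists only of '\0' characters and list the fonts involved. This is a sketch, not the code from the attached example: `NullRunCheck`, `IsNullRun`, and `FontsWithNullRuns` are hypothetical names, and the Aspose.PDF calls shown in the comment are assumed usage that has not been run against the attached file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; it has no dependency on Aspose.PDF itself.
public static class NullRunCheck
{
    // True when the string is non-empty and every character is '\0'.
    public static bool IsNullRun(string text) =>
        !string.IsNullOrEmpty(text) && text.All(c => c == '\0');

    // Returns the distinct font names of fragments whose text is only '\0' characters.
    public static List<string> FontsWithNullRuns(
        IEnumerable<(string Text, string FontName)> fragments) =>
        fragments.Where(f => IsNullRun(f.Text))
                 .Select(f => f.FontName)
                 .Distinct()
                 .ToList();

    // Assumed way to feed it from Aspose.PDF (not compiled here; "input.pdf" is a placeholder):
    //   var doc = new Aspose.Pdf.Document("input.pdf");
    //   var absorber = new Aspose.Pdf.Text.TextFragmentAbsorber();
    //   doc.Pages[1].Accept(absorber);
    //   var items = absorber.TextFragments
    //       .Cast<Aspose.Pdf.Text.TextFragment>()
    //       .Select(f => (f.Text, f.TextState.Font.FontName));
    //   var suspectFonts = NullRunCheck.FontsWithNullRuns(items);
}
```

If the only font returned is the one used for the missing sentence, that would support the idea that the problem is tied to that font rather than to the page as a whole.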

Farhan.Raza · May 24, 2019, 12:47am

@bgrebil

Thank you for contacting support.

We have worked with the data you shared and have been able to reproduce the issue in our environment. A ticket with ID PDFNET-46440 has been logged in our issue management system for further investigation. The ticket ID has been linked to this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.