ExtractText

Hi Juno,

I’m sorry to inform you that the fix is going to take a while. Our team is working on it, and we’ll keep you updated on the progress.

We’re sorry for the inconvenience.
Regards,

Hello,

In the meantime, can we assume that the Unicode byte order mark is going to be at the beginning of the text extracted from each page for **any** PDF?

Thanks,

Juno

Hi Juno,

I have tested several other files and found that this Unicode byte order mark is present in all of them. So, you can assume that it will be found at the start of each page for any PDF file. I have also updated our development team on the status, and we’ll try to provide you with a fix as early as possible.
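Until the fix is available, the mark can be removed on the caller's side. The following is only a minimal Python sketch of that idea, not code from the product: `strip_leading_bom` and the sample `pages` list are hypothetical names, and it assumes the extracted page text is already a decoded string in which the mark appears as the single character U+FEFF.

```python
BOM = "\ufeff"  # U+FEFF: how the byte order mark appears once text is decoded


def strip_leading_bom(page_text: str) -> str:
    """Return page_text without a leading byte order mark, if one is present."""
    if page_text.startswith(BOM):
        return page_text[len(BOM):]
    return page_text


# Hypothetical per-page strings standing in for the extractor's output.
pages = ["\ufeffFirst page", "\ufeffSecond page", "No mark here"]
cleaned = [strip_leading_bom(p) for p in pages]
```

Checking for the mark instead of always dropping the first character keeps the code safe if a later version stops emitting it.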

We’re sorry for the inconvenience.
Regards,

I’m not sure if this is the same problem, but we recently upgraded to Pdf.Kit 4.9 and now we no longer get clean text extracts:


For example, “Die Konsumenten” comes out as something like this:
D\0i\0e\0 \0K\0o\0n\0s\0u\0m\0e\0n\0t\0e\0n\0

Is that a bug, or is there a workaround on my side? Can I convert that somehow?

Cheers

Remy

Hi Remy,

Please read this article and try the code snippet it gives. You can try the second method, which uses Unicode encoding. I hope this resolves your issue. If it does not, please share the problematic PDF file with us so that we can test the issue at our end.
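For illustration only (this is not the snippet from the article): a character followed by `\0`, as in the output quoted above, is what UTF-16 little-endian bytes look like when each byte is read as one character. The Python sketch below shows the conversion under that assumption. `decode_extracted` and `repair_misdecoded` are hypothetical helper names, and Latin-1 is assumed as the one-byte-per-character encoding that produced the garbled string.

```python
import codecs


def decode_extracted(raw: bytes) -> str:
    """Decode UTF-16 little-endian bytes, dropping a leading BOM if present."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        raw = raw[len(codecs.BOM_UTF16_LE):]
    return raw.decode("utf-16-le")


def repair_misdecoded(text: str) -> str:
    """Repair a string that was decoded one byte per character (assumed Latin-1)."""
    return decode_extracted(text.encode("latin-1"))


# Reproduce the symptom from the thread, then undo it.
raw = "Die Konsumenten".encode("utf-16-le")
garbled = raw.decode("latin-1")  # starts with "D\x00i\x00e\x00"
fixed = repair_misdecoded(garbled)
```

If the raw bytes are still available, decoding them directly with `decode_extracted` is preferable to repairing an already decoded string.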

We’re sorry for the inconvenience.
Regards,

Thanks. That seems to help.


I appreciate that you continually improve your products, but I assume most people don’t have the time to read all your blogs and posts, so it creates problems when you suddenly change the behavior of an existing function.
I would strongly suggest that you find a better approach. Maybe declare the function obsolete and create a new one, or do anything similar that alerts me automatically.
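Something along these lines is what I have in mind. It is a rough Python sketch with made-up names (`extract_text`, `extract_text_unicode`) and made-up behavior, not your actual API: the old function keeps doing what it always did and warns, and the changed behavior lives under a new name.

```python
import warnings


def extract_text_unicode(page: bytes) -> str:
    """Hypothetical new function: treats the page bytes as UTF-16LE."""
    return page.decode("utf-16-le")


def extract_text(page: bytes) -> str:
    """Hypothetical old function: unchanged behavior, but warns on every call."""
    warnings.warn(
        "extract_text is deprecated; use extract_text_unicode instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this function
    )
    return page.decode("latin-1")
```

That way existing callers get the same output as before, and the warning tells them about the replacement without anyone having to read a blog post.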

Cheers

Remy

Hi Remy,

We publish release notes with every new version, which contain the details of every bug fix, new feature, and improvement. We also notify users via the forums about every bug fix and feature. I’m afraid you might not have noticed the release notes or the related blog post. Moreover, if any function is marked as obsolete, you will notice that in the editor when you use the new component in your application. Nevertheless, we’ll try to improve the process further.

We’re sorry for the inconvenience.
Regards,