Hello,
I am trying to perform language detection on a set of PDFs. The English ones work fine, but others, such as Chinese, Korean, and Russian, don’t work because no text is extracted from them. I’ve created a test program with example files to demonstrate the problem. The sample project can be downloaded from the following link: Dropbox - DocLanguage.zip
Thanks
@optimalosg
Please create a standalone console application (source code without compilation errors) that reproduces the problem on our end, and attach it here for testing. We will investigate the issue and share more information with you.
@tahir.manzoor
Sorry, I forgot to add the code for the test program. I have updated the zip file with the source code; it is available from the same link as in the original post.
Thanks
@optimalosg
This is the expected behavior of Aspose.PDF. If you open the PDF in Adobe writer and try to extract the text there, you will find that it cannot be extracted either. Please check the attached image for more detail.
image.png (147.8 KB)
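Since the language detection step fails only because it receives no text, it may help to check the extracted string before passing it on. The following is a minimal Python sketch of such a check, not part of the Aspose.PDF API: `dominant_script` is a hypothetical helper, and it assumes the text has already been extracted into a plain string by whatever tool is in use. It returns `None` when the string contains no letters at all (the situation described above), and otherwise a rough script label taken from the Unicode character names.

```python
import unicodedata
from collections import Counter
from typing import Optional


def dominant_script(text: str) -> Optional[str]:
    """Return a rough script label for the extracted text.

    Hypothetical helper for illustration only. The label is the first
    word of each letter's Unicode name (for example "LATIN", "CYRILLIC",
    "HANGUL", "CJK"). Returns None when the text contains no letters,
    which signals that extraction produced nothing usable.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip whitespace, digits, punctuation
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    if not counts:
        return None
    # Most frequent script label among the letters found
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Sample strings stand in for text extracted from each PDF.
    for sample in ["Hello world", "Привет мир", "안녕하세요", "你好世界", ""]:
        script = dominant_script(sample)
        if script is None:
            print("no extractable text; language detection skipped")
        else:
            print(script)
```

A `None` result means the file needs a different approach before any language detection can run; a script label is only a coarse hint and does not replace a proper language detector.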