Non-Latin character sets

rhrufftx · October 23, 2007, 8:36pm

We cannot get Aspose.PDF.Kit to convert PDFs to text correctly when the PDF contains non-Latin character sets. An example PDF

forever · October 24, 2007, 9:04am

This is a limitation of Aspose.Pdf.Kit. We have logged the issue as PDFKITNET-3951, but I don't think we can support it in the short term. Sorry for the inconvenience.

forever · October 28, 2007, 2:23am

Can you please provide your PDF document so that we can investigate it?

codewarior · July 15, 2012, 9:58am

Hi Robert,

Thanks for your patience.

Please try the code snippet shared in "Extract Text from all the Pages using Text Device" with the upcoming release of Aspose.Pdf for .NET, version 7.1.0. During my testing, all the content was extracted properly. For your reference, I have also attached the resultant TXT file containing the extracted text. We are sorry for the delay and inconvenience.
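
If it helps to check the extracted TXT file on your side, here is a rough sketch of a sanity check in plain Python. It is not part of Aspose.Pdf and does not use its API; the function name `summarize_scripts` and the sample string are made up for illustration. It counts the letters in a piece of text by script (taken from the first word of each character's Unicode name) and counts U+FFFD replacement characters, which often show up when non-Latin text was not decoded correctly.

```python
import unicodedata
from collections import Counter


def summarize_scripts(text):
    """Count letters by the first word of their Unicode name.

    Hypothetical helper for illustration only. Replacement characters
    (U+FFFD) are counted separately under the key "REPLACEMENT".
    """
    counts = Counter()
    for ch in text:
        if ch == "\ufffd":
            # U+FFFD usually means a character could not be decoded.
            counts["REPLACEMENT"] += 1
        elif ch.isalpha():
            # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            name = unicodedata.name(ch, "UNKNOWN")
            counts[name.split()[0]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample; in practice, read the extracted TXT file instead,
    # using the encoding it was written with.
    sample = "Hello Привет 你好 \ufffd"
    for script, n in summarize_scripts(sample).most_common():
        print(script, n)
```

If the source PDF is mostly Cyrillic or CJK but the summary shows only LATIN letters or many REPLACEMENT entries, the extraction probably did not handle the non-Latin text.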