Non-Latin character sets

We cannot get Aspose.Pdf.Kit to convert PDFs to text correctly when the PDF contains non-Latin character sets. An example PDF

This is a limitation of Aspose.Pdf.Kit. We have logged this issue as PDFKITNET-3951, but I don’t think we can support it in the short term. Sorry for the inconvenience.

Can you please provide your PDF document so that we can investigate it?

Hi Robert,


Thanks for your patience.

Please try the code snippet shared in "Extract Text from all the Pages using Text Device" with the upcoming release of Aspose.Pdf for .NET, version 7.1.0. In my testing, all the contents were extracted correctly. For your reference, I have also attached the resulting TXT file containing the extracted text. We are sorry for the delay and inconvenience.
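
For readers without the referenced snippet at hand, here is a minimal sketch of the Text Device approach. It is not the exact snippet from the documentation. It assumes the public Aspose.Pdf for .NET API (Document, Page, and TextDevice in Aspose.Pdf.Devices), which may differ slightly between versions, and the file names input.pdf and output.txt are placeholders.

```csharp
using System.IO;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Devices;

class ExtractTextSketch
{
    static void Main()
    {
        // Placeholder path: replace with the PDF that contains non-Latin text.
        Document pdfDocument = new Document("input.pdf");
        StringBuilder builder = new StringBuilder();

        // Process each page separately with a text device.
        foreach (Page pdfPage in pdfDocument.Pages)
        {
            using (MemoryStream textStream = new MemoryStream())
            {
                // Assumption: the default TextDevice writes Unicode (UTF-16) text.
                TextDevice textDevice = new TextDevice();
                textDevice.Process(pdfPage, textStream);

                // Decode the page bytes with the same encoding the device used.
                builder.Append(Encoding.Unicode.GetString(textStream.ToArray()));
            }
        }

        // Save as UTF-8 so that non-Latin characters survive in the TXT file.
        File.WriteAllText("output.txt", builder.ToString(), Encoding.UTF8);
    }
}
```

If the output still shows garbled characters, check first that the decoding step uses the same encoding as the device.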