Free Support Forum - aspose.com

ExtractText() returns blank

I have attempted to use ExtractText to access the text of the attached PDF document. It returns an empty string. Can you please explain why?

My code is below:

PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(fileName);
extractor.ExtractText();
extractor.GetText("~text.tmp");
string strRet = File.ReadAllText("~text.tmp");
File.Delete("~text.tmp");
return strRet;

Hi,

We apologize for the inconvenience. I have tested the issue and was able to reproduce the problem. I have logged it in our issue tracking system as PDFKITNET-5841. We will investigate it in detail and keep you updated on the status of a fix.

This is an important issue for me. My evaluation cannot proceed without this fix. Can you provide a rough estimate of how long I can expect to wait for a bug fix?

Hi,

We have been working on this issue and found that the attached PDF file is secured. You need to decrypt the file before extracting text from it. Please try the following code:

PdfFileSecurity security = new PdfFileSecurity(@"d:/specialpdfs/PLNASmithPDFFlatMaybe.pdf",@"d:/specialpdfs/temp.pdf");
security.DecryptFile(""); //decrypt the pdf file with owner password(blank)
PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(@"d:/specialpdfs/temp.pdf"); //bind the decrypted file
extractor.ExtractText();
extractor.GetText(OutPath+"extractNoWords.txt");

For more details about PDF security, please refer to the product documentation.
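Since an empty result from ExtractText() can simply mean the file is secured, it may help to check for encryption before binding the file. The following is only a rough sketch and does not use any Aspose API: it scans the raw bytes for the /Encrypt key that an encrypted PDF carries in its trailer. The class name PdfEncryptionCheck and the method LooksEncrypted are hypothetical helpers made up for this illustration, and the check is a heuristic, since the same byte sequence could also occur inside uncompressed page content.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of Aspose: a heuristic encryption check.
public static class PdfEncryptionCheck
{
    // ASCII bytes of the "/Encrypt" name used in the trailer of encrypted PDFs.
    private static readonly byte[] EncryptKey = Encoding.ASCII.GetBytes("/Encrypt");

    // Returns true if the byte sequence "/Encrypt" occurs anywhere in the data.
    public static bool LooksEncrypted(byte[] pdfBytes)
    {
        for (int i = 0; i + EncryptKey.Length <= pdfBytes.Length; i++)
        {
            int j = 0;
            while (j < EncryptKey.Length && pdfBytes[i + j] == EncryptKey[j])
            {
                j++;
            }
            if (j == EncryptKey.Length)
            {
                return true;
            }
        }
        return false;
    }

    // Convenience overload that reads the whole file into memory first.
    public static bool LooksEncrypted(string path)
    {
        return LooksEncrypted(File.ReadAllBytes(path));
    }
}
```

If PdfEncryptionCheck.LooksEncrypted(fileName) returns true, run the PdfFileSecurity decryption step shown above and bind the decrypted copy; otherwise the file can be bound to PdfExtractor directly.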

Thanks,