Help with loading of pure text from a PDF file

sunsite · November 16, 2011, 5:29am

Hi.
When I try to extract plain text from a PDF file, I do not see any text in the output.
I use this code:

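// Open the source PDF
Document pdfDocument = new Document(@"C:\temp\mypdffile.pdf");
// Create an absorber that extracts text in Raw formatting mode
TextAbsorber textAbsorber = new TextAbsorber(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
// Run the absorber over all pages, then print the extracted text
pdfDocument.Pages.Accept(textAbsorber);
Console.WriteLine(textAbsorber.Text);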

Can anyone advise me what I am doing wrong?
Thx.

Juri

hassan.farrukh · November 16, 2011, 5:38am

Hi Juri,

The code that you have shared seems to be correct. I can see the text extracted from the PDF. Maybe the problem is related to the source file you are using.

sunsite · November 16, 2011, 6:19am

I tried the same PDF file in the demo application (Search and Get Text from Pages of PDF | Aspose.PDF for .NET) and it works great.