Free Support Forum - aspose.com

Read Text from PDFs

Hi,

I have been using your Aspose.Total product for the last six months.

Now I have a requirement to extract text from PDF files and store it in text files. How can I do this?

I am sending some PDF files. Can you please tell me how to get the text of a multipage PDF into a single text file?

Please reply as soon as possible; I need this urgently.



String file = FileUpload1.PostedFile.FileName;
PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(file.Trim());
extractor.ExtractText();

// Verbatim string so the backslashes in the path are not treated as escapes
String prefix = @"C:\New Folder\abc";
String suffix = ".txt";
int pageCount = 1;

// Writes one text file per page: abc1.txt, abc2.txt, ...
while (extractor.HasNextPageText())
{
    extractor.GetNextPageText(prefix + pageCount + suffix);
    pageCount++;
}

I use this code, but it creates a separate text file for every page of a single PDF. I want only one text file for each multipage PDF.

Please check your code against the attached PDF files.


Hi Munendra,

Please call PdfExtractor.ExtractText and then PdfExtractor.GetText to save all the text from a PDF file to a single text file.

The following code snippet can help you out:

//example: Extracts all the text from PDF file
PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(@"D:\Text\text.pdf");
extractor.ExtractText();
extractor.GetText(@"D:\Text\text.txt");
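If the page-by-page loop from the question has to stay in place, the per-page files it writes could also be merged afterwards. The following is only a minimal sketch using plain System.IO (no Aspose calls); it assumes the files are named abc1.txt, abc2.txt, ... as in the question, and the merged file name abc_all.txt is a hypothetical choice.

```csharp
using System.IO;

class MergePageTextFiles
{
    static void Main()
    {
        // Same prefix/suffix as the loop in the question
        string prefix = @"C:\New Folder\abc";
        string suffix = ".txt";

        // Hypothetical name for the combined output file
        string merged = prefix + "_all" + suffix;

        using (StreamWriter writer = new StreamWriter(merged, false))
        {
            // Append abc1.txt, abc2.txt, ... in page order until a file is missing
            for (int page = 1; File.Exists(prefix + page + suffix); page++)
            {
                writer.WriteLine(File.ReadAllText(prefix + page + suffix));
            }
        }
    }
}
```

Calling GetText once, as shown above, avoids the intermediate files entirely and is the simpler option when per-page output is not needed.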

I hope this helps. If you still have any questions, please let us know.

Regards,


Hi shahzad.latif,



Your help is really appreciated, and your code works fine.

But there is an issue with some PDFs. I am sending you one PDF as an example in the attachment.

Your code is not able to extract text from this PDF. Can you tell me how to make this possible?

Any help will be highly appreciated.

Please reply as soon as possible.

Hi Munendra,

I'll look into the issue in detail and will update you as soon as possible.

We're sorry for the inconvenience.

Regards,

Hi Munendra,

I have tested the issue in detail on my end and found that the problem is specific to this particular file; the other files I tested worked fine. Can you please share a few more details about this file, for example how it was created, or any other related information? That will help us troubleshoot the issue and help you out.

We appreciate your patience and cooperation.

Regards,

Hi Munendra,

I've tested the scenario and noticed that text copying is restricted in the document you shared. You can see that the Copy option in the Edit menu is disabled. So, in order to extract the text from this file, you first need to remove the document restrictions and allow copying of text. PdfFileSecurity offers the capability to set privileges on a document. For more information, please visit Set Privileges on PDF Document.

For your convenience, I've removed the copy restriction from the file and enabled text copying. The updated document is attached; please take a look.
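For reference, removing the copy restriction and then extracting the text might look roughly like the sketch below. It assumes the Aspose.Pdf.Facades API (PdfFileSecurity.BindPdf, SetPrivilege, Save and DocumentPrivilege.AllowAll); older Aspose.Pdf.Kit releases may expose different class and method names. The file paths are hypothetical, and a file protected with an owner password may require that password before privileges can be changed.

```csharp
using Aspose.Pdf.Facades;

class AllowCopyThenExtract
{
    static void Main()
    {
        // Hypothetical paths, for illustration only
        string restricted = @"D:\Text\restricted.pdf";
        string unrestricted = @"D:\Text\unrestricted.pdf";

        // Step 1: save a copy of the document with all privileges granted
        // (assumed API: Aspose.Pdf.Facades.PdfFileSecurity)
        PdfFileSecurity fileSecurity = new PdfFileSecurity();
        fileSecurity.BindPdf(restricted);
        fileSecurity.SetPrivilege(DocumentPrivilege.AllowAll);
        fileSecurity.Save(unrestricted);

        // Step 2: extract all text from the unrestricted copy into one file,
        // using the same calls as earlier in this thread
        PdfExtractor extractor = new PdfExtractor();
        extractor.BindPdf(unrestricted);
        extractor.ExtractText();
        extractor.GetText(@"D:\Text\unrestricted.txt");
    }
}
```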