Convert PDF to TXT

Convert PDF to TXT

@rabin.samanta

Thanks for contacting support.

You can use the TextAbsorber class to extract text from a PDF document and then save the extracted text to a .txt file. The following .NET and Java code snippets show how to do this:

C#.NET

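using System;
using System.IO;
// Load the source PDF (dataDir is the folder that contains the file)
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(dataDir + "test PDF.pdf");
Aspose.Pdf.Text.TextAbsorber textAbsorber = new Aspose.Pdf.Text.TextAbsorber();
// Flatten form fields so that their values become part of the page content
pdfDocument.Flatten();
// Extract the text from all pages
pdfDocument.Pages.Accept(textAbsorber);
// Optional: the extracted text split into lines (not needed for saving)
string[] returnValue = textAbsorber.Text.Split(new string[] { System.Environment.NewLine }, StringSplitOptions.None);
File.WriteAllText(dataDir + "test PDF.txt", textAbsorber.Text);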

Java

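import com.aspose.pdf.*;
import java.io.PrintWriter;
// Load the source PDF (dataDir is the folder that contains the file)
Document doc = new Document(dataDir + "1.pdf");
// Extract the text from all pages in raw formatting mode
TextAbsorber absorber = new TextAbsorber(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
doc.getPages().accept(absorber);
String text = absorber.getText();
// Save the text; the PrintWriter constructor can throw FileNotFoundException
try (PrintWriter out = new PrintWriter(dataDir + "output.txt")) {
	out.println(text);
}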
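
If the PDF contains non-ASCII characters, it may help to write the file with an explicit encoding. On Java versions before 18, PrintWriter(String) uses the platform default charset. The following is only a minimal sketch using the standard library: the class name ExtractedTextWriter and the helper writeUtf8 are hypothetical, and the text variable is a placeholder for the string returned by absorber.getText().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExtractedTextWriter {

    // Hypothetical helper: writes already-extracted text to a file as UTF-8.
    static void writeUtf8(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Placeholder for the value returned by absorber.getText().
        String text = "First line\nSecond line with a non-ASCII character: \u00e9";

        // A temporary file is used here; replace it with your own output path.
        Path target = Files.createTempFile("output", ".txt");
        writeUtf8(target, text);

        // Read the file back as UTF-8 to check that the text round-trips.
        String readBack = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        System.out.println("Round trip OK: " + readBack.equals(text));
    }
}
```

In your own code you would pass absorber.getText() and the desired output path to the helper instead of the placeholder values.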

If you need any further assistance, please feel free to let us know.

@asad.ali
thanks