Free Support Forum - aspose.com

How to extract text

Hello,

I am using the code below to extract text from a PDF.
How can I extract the text to memory instead of to a file?

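    // dataDir already ends with a path separator, so the file names need no leading one
    String dataDir = "E:\\PdfToTxt\\";
    PdfExtractor extractor = new PdfExtractor();
    extractor.bindPdf(dataDir + "Teste.pdf");
    extractor.extractText();
    // Writes the extracted text to a file on disk
    extractor.getText(dataDir + "Teste.txt");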

@cpatricio76

Thank you for contacting support.

Please try the code snippet below with Aspose.PDF for Java 18.2 in your environment. It extracts all the text from a PDF document into a String in memory, which you can then process as your requirements dictate.

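    // Required imports:
    // import com.aspose.pdf.Document;
    // import com.aspose.pdf.TextAbsorber;
    // import com.aspose.pdf.TextExtractionOptions;

    // Open document (dataDir is the folder path defined in your own code)
    Document pdfDocument = new Document(dataDir + "Teste.pdf");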

    // Create TextAbsorber object to extract text
    TextAbsorber textAbsorber = new TextAbsorber(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Raw));
    // Accept the absorber for all the pages
    pdfDocument.getPages().accept(textAbsorber);

    // Accept the absorber for particular PDF page
    //pdfDocument.getPages().get_Item(1).accept(textAbsorber);

    // Get the extracted text
    String extractedText = textAbsorber.getText();

    System.out.println(extractedText);
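
As a rough illustration of what processing the String in memory can look like, here is a minimal sketch that uses only the Java standard library. The class name `ExtractedTextDemo`, the helper methods `nonBlankLines` and `containsIgnoreCase`, and the sample text are hypothetical and are not part of Aspose.PDF; in your code the input would be the `extractedText` value obtained above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class ExtractedTextDemo {

    // Splits the text on any line break and returns the trimmed, non-blank lines.
    static List<String> nonBlankLines(String text) {
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    // Case-insensitive search for a term anywhere in the text.
    static boolean containsIgnoreCase(String text, String term) {
        return text.toLowerCase(Locale.ROOT).contains(term.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical sample standing in for textAbsorber.getText()
        String extractedText = "Invoice 42\r\n\r\n  Total: 10.00  \n";

        for (String line : nonBlankLines(extractedText)) {
            System.out.println(line);
        }
        System.out.println(containsIgnoreCase(extractedText, "invoice"));
    }
}
```

Because everything stays in a String, no temporary file is created at any point.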

I hope this will be helpful. Please let us know if you need any further assistance.
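
If you would rather keep the PdfExtractor facade from your original code, another possible route is to write into an in-memory stream instead of a file. The sketch below shows only that general pattern with the standard library. It assumes that PdfExtractor offers a `getText` overload accepting an `OutputStream`; please confirm this in the API reference for your version. The `StreamWriter` interface and the `captureAsString` helper are hypothetical stand-ins, and the charset must match whatever the writer actually emits, which we have not verified for PdfExtractor here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class InMemoryCapture {

    // Hypothetical stand-in for any API that writes its result to an OutputStream.
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    // Lets the writer fill an in-memory buffer, then decodes the bytes to a String.
    static String captureAsString(StreamWriter writer, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writer.writeTo(buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Stand-in writer. With the assumed overload this could instead be
        // out -> extractor.getText(out), called after extractor.extractText().
        String text = captureAsString(
                out -> out.write("Hello from a PDF".getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
        System.out.println(text);
    }
}
```

The buffer lives entirely in memory, so nothing is written to disk unless you choose to save the resulting String yourself.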