Not Flat.pdf (584.5 KB)
The text extracted from this PDF is not readable. See the output in Screenshot 2023-08-09 220154.png (215.9 KB)
Would you please also share the code snippet that you are using to extract the text? We will test the scenario in our environment and address it accordingly.
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Load the PDF, then repair and flatten it before extraction
Document pdfDocument = new Document(pdfPath);
pdfDocument.Repair();
pdfDocument.Flatten();
// Extract text using the Pure formatting mode
TextAbsorber textAbsorber = new TextAbsorber();
textAbsorber.ExtractionOptions = new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure);
// A scale factor of 0.5 is enough to split columns in the majority of documents.
// A value of zero lets the algorithm choose the scale factor automatically.
textAbsorber.ExtractionOptions.ScaleFactor = 0.5; /* 0; */
pdfDocument.Pages.Accept(textAbsorber);
string extractedText = textAbsorber.Text;
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-55259
You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.