Hi,
We are trying to extract text from a multi-column PDF. The RAW option of TextAbsorber does not work because it mixes up the columns, and the PURE option with a scale factor (we tried all possible values) does not work either.
I am now using ParagraphAbsorber, but I cannot find a way to determine the columns in the PDF. What we want is to append each subsequent column below the first one. Is there any other way to do it?
A sample PDF is attached.
@kainat123
Can you please share the code snippet you are using, so that we can test the scenario in our environment and address it accordingly?
using System;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Text;

// localPath is the path to the multi-column PDF.
Document pdfDocument = new Document(localPath);
ParagraphAbsorber absorber = new ParagraphAbsorber();
absorber.Visit(pdfDocument);

foreach (PageMarkup markup in absorber.PageMarkups)
{
    Console.WriteLine("- Page {0}.", markup.Number);
    foreach (MarkupSection section in markup.Sections)
    {
        // Collect the text of every paragraph in this section.
        StringBuilder sb = new StringBuilder();
        foreach (MarkupParagraph paragraph in section.Paragraphs)
        {
            sb.AppendLine(paragraph.Text);
        }
        Console.WriteLine(sb.ToString());
    }
}
And the sample document: https://drive.google.com/file/d/1n1igf_S9uht9mer8RalNsXGWRsH_sQGj/view?usp=sharing
@kainat123
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-55982
You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.
Can you please share the estimated time for the fix?
@kainat123
The ticket was recently logged in our issue tracking system and will be prioritized on a first-come, first-served basis. As soon as we complete the investigation, we will share an ETA or an update with you. Please give us some time.
Hi, did you get a chance to check the issue, and could you please give us an estimated timeline for the fix?
@kainat123
We are afraid that the investigation of the issue has not been completed yet because of other issues ahead of it in the queue. Nevertheless, as shared earlier, we will inform you in this forum thread as soon as we have an update. Please give us some time.
We are sorry for the inconvenience.
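While the ticket is open, one possible interim approach to the request above (appending each column below the first) is to group the extracted paragraphs by their horizontal position and then read each group from top to bottom. The following is only a minimal sketch under stated assumptions, not a confirmed fix from the vendor. The Block record, the ColumnOrder class, and the tolerance value are hypothetical names introduced for illustration. The sketch assumes that the left and top coordinates of each paragraph can be obtained from the absorber output (for example from MarkupParagraph.Points or MarkupSection.Rectangle, which has not been verified against the attached sample), and that Y grows upwards as in PDF user space.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container (not part of Aspose.PDF): the left edge, the top edge
// and the text of one extracted paragraph. Coordinates are assumed to be in
// PDF user space, where the origin is at the bottom-left of the page.
public record Block(double Left, double Top, string Text);

public static class ColumnOrder
{
    // Groups blocks into columns by their left edge, then returns the text
    // column by column, each column read from top to bottom.
    // tolerance is an assumed value: the largest difference in left edge
    // (in points) still treated as belonging to the same column.
    public static List<string> InReadingOrder(IEnumerable<Block> blocks, double tolerance = 20.0)
    {
        var columns = new List<List<Block>>();

        foreach (var block in blocks.OrderBy(b => b.Left))
        {
            var current = columns.LastOrDefault();

            // Start a new column when this block begins clearly to the right
            // of the leftmost block of the current column.
            if (current == null || block.Left - current[0].Left > tolerance)
            {
                current = new List<Block>();
                columns.Add(current);
            }
            current.Add(block);
        }

        // A larger Top value means higher on the page, so sort descending.
        return columns
            .SelectMany(column => column.OrderByDescending(b => b.Top))
            .Select(b => b.Text)
            .ToList();
    }
}
```

To use it, one Block would be built per paragraph inside the inner loop of the snippet posted earlier, and ColumnOrder.InReadingOrder would be called once per page. Headings or tables that span several columns are not handled by this sketch and would need separate treatment, and the tolerance would have to be tuned to the actual layout.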