Aspose PDF - Extract table data from specific region in a PDF

siddharthsharma819 · May 1, 2018, 9:19am

Hi,

Is it possible to extract table data from a specific area of a PDF document and access it row by row? If yes, kindly share some example code. An example file is attached. I need to extract the data from the table under point 17. Only the grid needs to be extracted.
demo.pdf (99.6 KB)

Farhan.Raza · May 1, 2018, 6:17pm

@siddharthsharma819

Thank you for contacting support.

Text can be extracted from any table on a PDF page using the code snippet below. However, the required text is not being extracted from your document. Therefore, an investigation ticket with ID PDFNET-44612 has been logged in our issue management system. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.

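    // Requires: using System; using Aspose.Pdf; using Aspose.Pdf.Text;
    // dataDir is the folder that contains the input file.
    Document pdfDocument = new Document(dataDir + "demo_table.pdf");

    // TableAbsorber detects tables on the page it visits.
    TableAbsorber absorber = new TableAbsorber();

    // Page indexing in Aspose.PDF is 1-based, so Pages[1] is the first page.
    absorber.Visit(pdfDocument.Pages[1]);

    // Walk each table row by row and print the text of every cell.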
    foreach (AbsorbedTable table in absorber.TableList)
    {
        foreach (AbsorbedRow row in table.RowList)
        {
            foreach (AbsorbedCell cell in row.CellList)
            {
                foreach (TextFragment fragment in cell.TextFragments)
                {
                    Console.WriteLine(fragment.Text);
                }
            }
        }
    }
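
To limit the output to a specific area and read it row by row, the same loop can be narrowed by comparing each table's bounds against a region of interest. The following is only a minimal sketch under stated assumptions: it assumes that `AbsorbedTable` exposes a `Rectangle` property with `LLX`, `LLY`, `URX` and `URY` values in your version of the library, and the file name and region coordinates are hypothetical placeholders. It has not been verified against your document.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class RegionTableExtraction
{
    static void Main()
    {
        // Hypothetical input path; replace with the actual file.
        Document pdfDocument = new Document("demo_table.pdf");

        // Hypothetical region of interest in page units (points),
        // measured from the lower-left corner of the page.
        const double regionLLX = 50, regionLLY = 300;
        const double regionURX = 550, regionURY = 600;

        TableAbsorber absorber = new TableAbsorber();
        absorber.Visit(pdfDocument.Pages[1]);

        foreach (AbsorbedTable table in absorber.TableList)
        {
            // Assumption: Rectangle holds the table bounds on the page.
            var bounds = table.Rectangle;
            bool inside = bounds.LLX >= regionLLX && bounds.LLY >= regionLLY
                       && bounds.URX <= regionURX && bounds.URY <= regionURY;
            if (!inside)
            {
                continue; // Skip tables that are not fully inside the region.
            }

            foreach (AbsorbedRow row in table.RowList)
            {
                // Collect one string per cell so the row can be used as a unit.
                List<string> cells = new List<string>();
                foreach (AbsorbedCell cell in row.CellList)
                {
                    StringBuilder cellText = new StringBuilder();
                    foreach (TextFragment fragment in cell.TextFragments)
                    {
                        cellText.Append(fragment.Text);
                    }
                    cells.Add(cellText.ToString());
                }
                Console.WriteLine(string.Join(" | ", cells));
            }
        }
    }
}
```

The region test here requires the whole table to lie inside the rectangle. If the grid may extend slightly past the chosen area, an overlap test may be a better fit.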

We are sorry for the inconvenience.