TableAbsorber not identifying tables without borders when converted from Word

Hello,

I am converting a Word document to PDF and then using TableAbsorber to manipulate the tables in the resulting PDF. When the table in the Word document has borders, TableAbsorber identifies it in the PDF. When the Word table has no borders, TableAbsorber does not find any tables.

Is there any way around this? I am using Aspose.PDF 20.6 and Aspose.Words 20.9.

Code sample and input and output documents are attached.

// TableAbsorber and AbsorbedTable are in the Aspose.Pdf.Text namespace
using Aspose.Pdf.Text;

// With border: convert the Word document to PDF, then load the PDF
Aspose.Words.Document wordWithBorder = new Aspose.Words.Document(@"A-WithBorder.docx");
wordWithBorder.Save("A-WithBorder.pdf");
Aspose.Pdf.Document pdfWithBorder = new Aspose.Pdf.Document("A-WithBorder.pdf");

TableAbsorber absorber = new TableAbsorber();
foreach (Aspose.Pdf.Page page in pdfWithBorder.Pages)
{
    absorber.Visit(page);

    foreach (AbsorbedTable pdfTable in absorber.TableList)
    {
        // Has a table
    }
}

// Without border: same steps with the borderless document
Aspose.Words.Document wordWithoutBorder = new Aspose.Words.Document(@"B-NoBorder.docx");
wordWithoutBorder.Save("B-NoBorder.pdf");
Aspose.Pdf.Document pdfWithoutBorder = new Aspose.Pdf.Document("B-NoBorder.pdf");

absorber = new TableAbsorber();
foreach (Aspose.Pdf.Page page in pdfWithoutBorder.Pages)
{
    absorber.Visit(page);

    // No tables found: TableList is empty, so this loop body never runs
    foreach (AbsorbedTable pdfTable in absorber.TableList)
    {
    }
}

A-WithBorder.pdf (12.5 KB)

B-NoBorder.docx (13.9 KB)

B-NoBorder.pdf (12.3 KB)

A-WithBorder.docx (13.8 KB)

@jmpe

Please try version 25.4 of the API. We have introduced a flag that makes TableAbsorber use the Flow Engine. With the flag set to true, the API is able to extract tables that have no borders.

// dataDir is the folder that contains the PDF
Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(dataDir + "B-NoBorder.pdf");
Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
absorber.UseFlowEngine = true; // Enable the Flow Engine so borderless tables are detected
Aspose.Pdf.Page page = pdfDocument.Pages[1]; // Select the page by number (page numbers start at 1)

// Visit the page to extract tables
absorber.Visit(page);
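
To confirm that the borderless table is actually picked up, the absorbed tables can be walked after the visit. The following is only a minimal sketch: it assumes version 25.4 or later (for UseFlowEngine, as described above) and that B-NoBorder.pdf from the original post sits in the working directory. The class name FlowEngineTableCheck is made up for illustration.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class FlowEngineTableCheck // hypothetical name, for illustration only
{
    static void Main()
    {
        // Assumed path: the borderless PDF from the question, in the working directory
        Document pdfDocument = new Document("B-NoBorder.pdf");

        foreach (Page page in pdfDocument.Pages)
        {
            // A fresh absorber per page keeps each page's results separate
            TableAbsorber absorber = new TableAbsorber();
            absorber.UseFlowEngine = true; // flag from the reply above
            absorber.Visit(page);

            Console.WriteLine($"Page {page.Number}: {absorber.TableList.Count} table(s)");

            foreach (AbsorbedTable table in absorber.TableList)
            {
                foreach (AbsorbedRow row in table.RowList)
                {
                    // Print the text of each cell in the row, separated by " | "
                    foreach (AbsorbedCell cell in row.CellList)
                    {
                        foreach (TextFragment fragment in cell.TextFragments)
                        {
                            Console.Write(fragment.Text + " | ");
                        }
                    }
                    Console.WriteLine();
                }
            }
        }
    }
}
```

If the count printed for the page is still 0 with the flag enabled, it is worth checking that the project really references 25.4 or later and not the 20.6 version mentioned in the question.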