Extract content from a table that has merged cells

I have run into a problem: if a PDF contains a table with merged cells, the TableAbsorber class does not read it correctly.

I searched online and found that others have had the same issue. Does the problem still exist in current versions?

Version: 19.1

I looked for solutions online but found none.

This is very important to me. Otherwise I will have to find another tool, which I would rather not do.

code:

// pdfDocument, i, text, tempTable and tempTable1 are declared earlier (not shown).
TableAbsorber tableAbsorber = new TableAbsorber();
tableAbsorber.Visit(pdfDocument.Pages[i]);
foreach (AbsorbedTable table in tableAbsorber.TableList)
{
    foreach (AbsorbedRow row in table.RowList)
    {
        foreach (AbsorbedCell cell in row.CellList)
        {
            // Collect the text of every segment in the cell.
            foreach (TextFragment fragment in cell.TextFragments)
            {
                foreach (TextSegment seg in fragment.Segments)
                {
                    tempTable.Append(seg.Text);
                    tempTable1.Append(seg.Text);
                }
            }
            // End each cell's text with a newline in tempTable1.
            tempTable1.Append('\n');
        }
    }
    // Remove the accumulated table text from text.
    text = text.Replace(tempTable.ToString(), "");
}

某单位信息网络中心机房设备整修项目-捷威特智能科技有限公司-商务文件 - 缩减.pdf (288.3 KB)

The problem starts from page 3.

@allenzhang

This is the screenshot of page 3.
image.png (107.5 KB)

Would you please highlight the merged cell in this screenshot and share the output the API currently gives, along with the output you expect? We will then proceed accordingly. Also, please try version 24.3 of the API.