When reading a table with TableAbsorber in Aspose.PDF, I am facing an issue: when the table has merged cells, the merged part is returned as another table.
I have one table, but it is detected as more than one table. I have attached the single-table PDF.
Aspose merged table red as multiple table.pdf (281.3 KB)
I tried this code using Aspose.PDF:
using System;
using Aspose.Pdf.Text;

public static void Extract_Table()
{
    // Load source PDF document
    Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(@"c:\tmp\the_worlds_cities_in_2018_data_booklet 7.pdf");
    foreach (var page in pdfDocument.Pages)
    {
        // Detect the tables on the current page
        Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
        absorber.Visit(page);
        foreach (AbsorbedTable table in absorber.TableList)
        {
            foreach (AbsorbedRow row in table.RowList)
            {
                foreach (AbsorbedCell cell in row.CellList)
                {
                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                    foreach (TextFragment fragment in textFragmentCollection)
                    {
                        // Concatenate the segments of the fragment into one string
                        string txt = "";
                        foreach (TextSegment seg in fragment.Segments)
                        {
                            txt += seg.Text;
                        }
                        Console.WriteLine(txt);
                    }
                }
            }
        }
    }
}
I created this sample code:
private void Logic()
{
    Document doc = new Document($"{PartialPath}_input.pdf");
    foreach (var page in doc.Pages)
    {
        Aspose.Pdf.Text.TableAbsorber absorber = new Aspose.Pdf.Text.TableAbsorber();
        absorber.Visit(page);
        // Counters used to label every cell with its table, row and cell index
        int countTable = 0;
        int countRow = 0;
        int countCell = 0;
        int countSegment = 0;
        foreach (AbsorbedTable table in absorber.TableList)
        {
            countTable++;
            countRow = 0;
            foreach (AbsorbedRow row in table.RowList)
            {
                countRow++;
                countCell = 0;
                foreach (AbsorbedCell cell in row.CellList)
                {
                    // Advance the cell index; without this every cell is labeled C:0
                    countCell++;
                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                    foreach (TextFragment fragment in textFragmentCollection)
                    {
                        // Overwrite the cell text with its detected position
                        fragment.Text = $"T{countTable};R:{countRow};C:{countCell}";
                        fragment.TextState.FontSize = 12;
                        fragment.TextState.ForegroundColor = Color.Black;
                        //fragment.Segments.Clear();
                        //countSegment = 0;
                        //foreach (TextSegment seg in fragment.Segments)
                        //{
                        //    seg.Text = $"S:{countSegment}";
                        //}
                    }
                }
            }
        }
    }
    doc.Save($"{PartialPath}_output.pdf");
}
The code is simple, but it is supposed to help visualize the detected tables, and it did not work properly. So I will be creating a bug for the dev team.
@vijayanathan
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-53877
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
Regretfully, there is no update yet about ticket resolution. We will surely inform you via this forum thread as soon as we have some news about its resolution. Please be patient and spare us some time.
We are sorry for the inconvenience.
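Until the ticket is resolved, one possible stopgap is to regroup the detected table fragments on the caller's side. The following is only a minimal sketch of that idea, not a fix from Aspose: it treats two bounding boxes as parts of the same table when they overlap horizontally and are separated vertically by no more than a tolerance. `Box`, `TableFragments`, `LooksLikeSameTable`, `Group` and the `tolerance` parameter are hypothetical names introduced here for illustration; the sketch assumes each box would be filled from the `Rectangle` of an `AbsorbedTable` (its `LLX`, `LLY`, `URX`, `URY` values), and the right tolerance would depend on the document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type: a plain bounding box in PDF points.
// Assumption: it would be filled from AbsorbedTable.Rectangle (LLX, LLY, URX, URY).
public struct Box
{
    public double Llx, Lly, Urx, Ury;

    public Box(double llx, double lly, double urx, double ury)
    {
        Llx = llx; Lly = lly; Urx = urx; Ury = ury;
    }
}

// Hypothetical helper class: heuristic regrouping of table fragments.
public static class TableFragments
{
    // True when the boxes overlap horizontally and the vertical gap
    // between them is at most 'tolerance' (negative gap = they overlap).
    public static bool LooksLikeSameTable(Box a, Box b, double tolerance)
    {
        bool overlapX = a.Llx < b.Urx && b.Llx < a.Urx;
        double gap = Math.Max(a.Lly, b.Lly) - Math.Min(a.Ury, b.Ury);
        return overlapX && gap <= tolerance;
    }

    // Groups boxes so that every box ends up with all boxes it touches,
    // directly or through a chain of other boxes.
    public static List<List<Box>> Group(IEnumerable<Box> boxes, double tolerance)
    {
        var groups = new List<List<Box>>();
        foreach (Box box in boxes)
        {
            // Existing groups that contain at least one box close to this one
            List<List<Box>> touching = groups.FindAll(
                g => g.Exists(other => LooksLikeSameTable(box, other, tolerance)));

            // Merge the new box and all touching groups into a single group
            var merged = new List<Box> { box };
            foreach (List<Box> g in touching)
            {
                merged.AddRange(g);
                groups.Remove(g);
            }
            groups.Add(merged);
        }
        return groups;
    }
}
```

Each resulting group could then be read as one logical table, for example by processing the rows of its fragments from top to bottom. This is a heuristic: two genuinely separate tables stacked closely on the same page would also be merged, so the tolerance should be kept small.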