Replace PDF Text inside Table using C#

Hi there,

I have a problem that I am trying to solve using the Aspose.PDF text replacement feature. We have reports containing textual summaries of variable length, anywhere from a few lines to a few pages. The entire summary is contained within a single row of a table in the PDF report. We are trying to use Aspose.PDF to grab the entire content of that row and replace it with a new paragraph of text on the fly.
I tried using
TableAbsorber tableAbsorber = new TableAbsorber(); tableAbsorber.Visit(page);
to grab the contents of the table.
Then, I grabbed the corner text of the table to identify the specific table:
var cornerText = table.RowList[0].CellList[0]?.TextFragments[1]?.Segments[1];
Then, if the corner text matches the name of my table, I loop through the text fragments of the table, remove them, and add a new table containing my replacement text.
The problem with this approach is that when the contents of my original table span multiple pages, it also grabs the header and footer text of those pages and removes it as well. Is there a better way to replace a chunk of text that can span multiple pages without disturbing the header and footer contents of the PDF?
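One possible direction, sketched here without any Aspose.PDF calls, is to decide per fragment whether it lies in the body area of the page before clearing it, so that text in the header and footer bands is left alone. Everything in this sketch is hypothetical: the Fragment class, the KeepBody helper, and the band heights are illustrative stand-ins, and it assumes the header and footer heights of the report are known and that each extracted fragment exposes a baseline Y coordinate measured from the bottom of the page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an extracted text fragment: its text and the
// Y coordinate of its baseline, measured from the bottom of the page.
public sealed class Fragment
{
    public string Text { get; }
    public double Y { get; }

    public Fragment(string text, double y)
    {
        Text = text;
        Y = y;
    }
}

public static class BodyBandFilter
{
    // Returns only the fragments whose baseline lies strictly between the
    // footer band (bottom of the page) and the header band (top of the page).
    public static List<Fragment> KeepBody(
        IEnumerable<Fragment> fragments,
        double pageHeight,
        double headerHeight,
        double footerHeight)
    {
        double upper = pageHeight - headerHeight; // at or above this Y is header
        double lower = footerHeight;              // at or below this Y is footer
        return fragments.Where(f => f.Y > lower && f.Y < upper).ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        var fragments = new List<Fragment>
        {
            new Fragment("Report header", 770),
            new Fragment("Preliminary Findings", 700),
            new Fragment("Summary text continues", 120),
            new Fragment("Page 2 of 5", 30),
        };

        // Assumed layout: 792 pt page height, 50 pt header and footer bands.
        List<Fragment> body = BodyBandFilter.KeepBody(fragments, 792, 50, 50);
        foreach (Fragment f in body)
        {
            Console.WriteLine(f.Text);
        }
    }
}
```

Only the fragments returned by KeepBody would then be cleared or replaced. Whether this is enough depends on the report layout, since it relies on the header and footer occupying fixed bands on every page.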

@rdahal

We need to properly investigate your requirements, as this seems to be a complex scenario in which the internal structure of the PDF is affected by replacing text inside a table. Can you please share your sample PDF document along with the complete code snippet that you are using? We will test the scenario in our environment and address it accordingly.

@asad.ali
I am attaching my original PDF as well as the updated PDF here.
Here is my code:

                foreach (var page in pdfDocument.Pages)
                {
                    TableAbsorber tableAbsorber = new TableAbsorber();
                    tableAbsorber.Visit(page);

                    double SearchString_XIndent = 0;

                    double SearchString_YIndent = 0;

                    foreach (var table in tableAbsorber.TableList)
                    {
                        var cornerText = table.RowList[0].CellList[0]?.TextFragments[1]?.Segments[1];

                        // Is there a better way to grab the text fragments of the row when the row expands to the next page?
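                        // Parenthesize the || so the null check guards both StartsWith calls (&& binds tighter than ||).
                        if (cornerText != null && (cornerText.Text.StartsWith("Preliminary") || cornerText.Text.StartsWith("Patient:")))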
                        {

                            if (cornerText != null && cornerText.Text.StartsWith("Preliminary"))
                            {
                                Aspose.Pdf.Table newtable = new Aspose.Pdf.Table
                                {
                                    DefaultCellBorder = new Aspose.Pdf.BorderInfo(Aspose.Pdf.BorderSide.All, .5f, Aspose.Pdf.Color.Gray)
                                };

                                SearchString_XIndent = cornerText.BaselinePosition.XIndent;
                                SearchString_YIndent = cornerText.BaselinePosition.YIndent;

                                newtable.ColumnAdjustment = ColumnAdjustment.AutoFitToWindow;
                                newtable.Alignment = HorizontalAlignment.FullJustify;

                                newtable.IsKeptWithNext = true;
                                newtable.RepeatingRowsCount = 2;
                                newtable.Left = (float)SearchString_XIndent;
                                newtable.Top = (float)(page.PageInfo.Height - SearchString_YIndent - SearchString_XIndent);

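                                // Create a MarginInfo object and set its left, bottom, right and top margins.
                                // NOTE: this object is never assigned to the table or its cells, so it has no effect as written.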
                                Aspose.Pdf.MarginInfo margin = new Aspose.Pdf.MarginInfo();
                                margin.Top = 5f;
                                margin.Left = 5f;
                                margin.Right = 5f;
                                margin.Bottom = 5f;

                                //Add row1
                                Aspose.Pdf.Row row1 = newtable.Rows.Add();
                                row1.Cells.Add();
                                TextFragment heading = new TextFragment("Preliminary Findings");
                                heading.TextState.FontSize = 12;
                                heading.TextState.Font = FontRepository.FindFont("Arial Narrow");
                                heading.TextState.FontStyle = FontStyles.Bold;
                                row1.Cells[0].Paragraphs.Add(heading);

                                //Add row2
                                Aspose.Pdf.Row row2 = newtable.Rows.Add();
                                row2.Cells.Add();
                                TextFragment mytext = new TextFragment("the convergence of the digital and physical worlds, has emerged as one of the fundamental trends underlying the digital transformation " +
                                "of business and the economy. From the fitness trackers we wear to the smart thermostats we use in our homes to the fleet-management solutions that tell us when our packages " +
                                "will arrive to the sensors that promote increased energy efficiency or monitor natural disasters resulting from climate change, the IoT is now embedded in the lives of " +
                                "consumers and the operations of enterprises and governments. In 2015, the McKinsey Global Institute published a research report entitled " +
                                "The Internet of Things: Mapping the value beyond the hype.");
                                mytext.TextState.Font = FontRepository.FindFont("Arial Narrow");
                                mytext.TextState.FontSize = 8;
                                mytext.HorizontalAlignment = HorizontalAlignment.Left;
                                row2.Cells[0].Paragraphs.Add(mytext);
                                row2.Cells[0].IsWordWrapped = true;

                                //Add row3
                                Aspose.Pdf.Row row3 = newtable.Rows.Add();
                                row3.Cells.Add();
                                TextFragment mytext2 = new TextFragment("End of Preliminary Findings");
                                mytext2.TextState.Font = FontRepository.FindFont("Arial Narrow");
                                mytext2.TextState.FontSize = 8;
                                mytext2.HorizontalAlignment = HorizontalAlignment.Center;
                                row3.Cells[0].Paragraphs.Add(mytext2);

                                page.Paragraphs.Add(newtable); //Add new table
                            }

                            string txt1 = "";
                            foreach (AbsorbedRow row in table.RowList)
                            {
                                foreach (AbsorbedCell cell in row.CellList)
                                {
                                    TextFragmentCollection textFragmentCollection = cell.TextFragments;
                                    foreach (TextFragment fragment in textFragmentCollection)
                                    {

                                        foreach (TextSegment seg in fragment.Segments)
                                        {
                                            txt1 += seg.Text;
                                            seg.Text = " ";
                                        }
                                        Console.WriteLine(txt1);

                                    }
                                }
                            }
                            var mystring1 = txt1;
                            tableAbsorber.Remove(table);
                            pdfDocument.ProcessParagraphs();
                            break;
                        }
                    }
                }

text-narrative-test-original.pdf (100.8 KB)
text-narrative_updated.pdf (130.1 KB)

@rdahal

We were able to replicate the issue while testing the scenario with version 22.2 of the API. Therefore, an issue has been logged as PDFNET-51414 in our issue tracking system. We will look further into its details and keep you posted on the status of its rectification. Please be patient and spare us some time.

We are sorry for the inconvenience.

@asad.ali I was wondering if there is any update on this. Thanks!

@rdahal

The issue is sadly not yet resolved. Please note that we investigate and resolve issues on a first-come, first-served basis and will inform you as soon as significant progress is made towards resolving the ticket. Please spare us some time.

We are sorry for the inconvenience.

Hi,

is there any update on this issue?

-Rashmi

@rdahal

Regretfully, the earlier logged issue has not yet been resolved. However, we will surely notify you in this forum thread once we have some updates about its fix. Please give us some time.

We are sorry for the inconvenience.

Is there any update on this yet? We are still waiting. Please let us know.

@rdahal

We regret to share that the earlier logged ticket has not yet been resolved because other issues logged before it are ahead of it in the queue. Nevertheless, we have recorded your concerns and will surely inform you as soon as we have some definite updates about its rectification or a fix ETA.

We apologize for the inconvenience.

Hi, any updates?

Thanks,
Rashmi

@rdahal

Sadly, no updates are available at the moment regarding the ticket's resolution. We have already recorded your concerns and will surely notify you once an update is available about a fix for the issue. Your patience is greatly appreciated in this regard.

We are sorry for the inconvenience.