Using Aspose.PDF for .NET, I want to access each text fragment of a PDF document, whether it is a paragraph, text in a table, or a caption, update the text based on some criteria, and write it back to the PDF in the same place. The overall structure of the PDF should remain the same; I only want to access and update the sentences of the document. Please tell me how I can do that and share a code sample.
Thanks
We recommend using TextFragmentAbsorber.
Here is an example of how to use this feature:
using Aspose.Pdf;
using Aspose.Pdf.Text;
using System;

namespace Documentation.Advanced.Working_with_Text
{
    internal class ForumExamples
    {
        public static void FindAllText()
        {
            // Open the document
            Document pdfDocument = new Document(@"C:\Samples\Sample-Document-01.pdf");

            // Create a TextFragmentAbsorber object to find text
            TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();

            // Accept the absorber for all the pages
            pdfDocument.Pages.Accept(textFragmentAbsorber);

            // Loop through the fragments
            foreach (TextFragment textFragment in textFragmentAbsorber.TextFragments)
            {
                // Print the position
                // You can also use textFragment.Rectangle
                Console.WriteLine(textFragment.Position);

                // Print the content
                // You can also assign textFragment.Text = "Some value" to replace it
                Console.WriteLine(textFragment.Text);

                // Print the name of the font used
                Console.WriteLine(textFragment.TextState.Font.FontName);
            }
        }
    }
}
You can find other suitable examples in the “Replace Text in PDF” section of the documentation.
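Since the question also asks about updating text and writing it back, here is a minimal sketch of that step, built on the same absorber loop as above. The file paths, the class name, the ShouldUpdate helper, and the "2019" to "2020" rule are all hypothetical placeholders for your own criteria. Note that a fragment is whatever unit the absorber returns, which may not line up with a full sentence or paragraph, so please treat this as a starting point rather than a definitive implementation.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

internal static class ReplaceByCriterionSketch
{
    // Hypothetical criterion: update only fragments that mention "2019".
    private static bool ShouldUpdate(string text)
    {
        return text.Contains("2019");
    }

    public static void Run()
    {
        // Placeholder input path; substitute your own file.
        Document pdfDocument = new Document(@"C:\Samples\Sample-Document-01.pdf");

        // Collect the text fragments from all pages.
        TextFragmentAbsorber absorber = new TextFragmentAbsorber();
        pdfDocument.Pages.Accept(absorber);

        foreach (TextFragment fragment in absorber.TextFragments)
        {
            if (ShouldUpdate(fragment.Text))
            {
                // Assigning Text replaces the content at the fragment's position.
                fragment.Text = fragment.Text.Replace("2019", "2020");
            }
        }

        // Placeholder output path; saving to a new file keeps the original intact.
        pdfDocument.Save(@"C:\Samples\Sample-Document-01-updated.pdf");
    }
}
```

If the replacement text is much longer than the original, it may overlap neighbouring content, so it is worth checking the output visually.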