Product Version: Aspose.PDF for .NET 22.1.0
I am experiencing an issue where the font and general layout of a PDF document become corrupted after the document passes through TextFragmentAbsorber. Please see the attached screenshot for an example.
image.png (13.2 KB)
I am using the following code as per the example provided in the docs for TextFragmentAbsorber (https://reference.aspose.com/pdf/net/aspose.pdf.text/textfragmentabsorber):
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Open document
Document doc = new Document(@"D:\Tests\input.pdf");
// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");
// Accept the absorber for the first page (Pages is 1-based)
doc.Pages[1].Accept(absorber);
// Change only the text of the first occurrence (TextFragments is 1-based); the font is left untouched
absorber.TextFragments[1].Text = "hi world";
// Save document
doc.Save(@"D:\Tests\output.pdf");
I have ensured that the custom fonts are installed on the machine that Aspose.PDF is running on, even though I would imagine this shouldn’t matter for a PDF document. I’ve also noticed that the example code in the documentation explicitly sets the font to Arial. Presumably, if I change only the text of the fragment without touching the font attribute, it should retain the original styling as specified in the PDF document? I’ve read through the documentation, and there appears to be a FontSource (https://reference.aspose.com/pdf/net/aspose.pdf.text/fontsource) class that could be used for this purpose, but I will not be able to install every custom font ahead of time, so this doesn’t seem like a viable solution.
Is there some special trick to make this work? I would imagine it’s a fairly common use case. Are there any attributes that should be taken into consideration when creating the original PDF document to prevent this from happening? Is it perhaps related to the font itself, e.g. does it matter whether it’s TrueType? Any help would be much appreciated.