Font and spacing break after passing through TextFragmentAbsorber

Product Version: Aspose.PDF for .NET 22.1.0

I am experiencing an issue where the font and general layout of a PDF document become corrupted after passing through TextFragmentAbsorber. Please see the attached screenshot for an example.

image.png (13.2 KB)

I am using the following code as per the example provided in the docs for TextFragmentAbsorber (TextFragmentAbsorber | Aspose.PDF for .NET API Reference):

// Open document
Document doc = new Document(@"D:\Tests\input.pdf");

// Create TextFragmentAbsorber object to find all "hello world" text occurrences
TextFragmentAbsorber absorber = new TextFragmentAbsorber("hello world");

// Accept the absorber for first page
doc.Pages[1].Accept(absorber);

// Change the text of the first occurrence (the font is left untouched)
absorber.TextFragments[1].Text = "hi world";

// Save document
doc.Save(@"D:\Tests\output.pdf");

I have ensured that any custom fonts are installed on the machine that Aspose.PDF is running on, even though I would imagine this wouldn’t matter with a PDF document. I’ve also noticed that the example code in the documentation explicitly sets the font to Arial. Presumably, if I just change the text of the fragment without touching the font attribute, it should retain the original styling as specified in the PDF document? I’ve had a read through the documentation, and it seems that there is a FontSource (FontSource | Aspose.PDF for .NET API Reference) class that could be used to this end, but I will not be able to install every custom font ahead of time, so this doesn’t seem like a viable solution.

Is there some kind of special magic to make this work, as I would imagine it’s a fairly common use case? Are there perhaps any special attributes that should be taken into consideration when creating the original PDF document to prevent this from happening? Is it perhaps related to the font itself, e.g. does it matter if it’s TrueType? Any help would be much appreciated.

@Dev123456789

You can use the FontRepository.FindFont method to set the font of the text. Please check the code example shared in the following article:
Replace Text in PDF
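
For reference, a minimal sketch of that approach is shown below. It reuses the file paths and search phrase from the snippet above, and "Arial" is only a placeholder font name, not a recommendation. It assumes the named font is available to Aspose.PDF on the machine, so it may not help when the font exists only as an embedded font inside the PDF.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Open the document (path taken from the original post)
var doc = new Document(@"D:\Tests\input.pdf");

// Find all "hello world" occurrences on the first page
var absorber = new TextFragmentAbsorber("hello world");
doc.Pages[1].Accept(absorber);

// Look up a font by name; "Arial" is a placeholder and must be
// available to Aspose.PDF on this machine (assumption)
var font = FontRepository.FindFont("Arial");

// Replace the text and explicitly assign the font on every match
foreach (TextFragment fragment in absorber.TextFragments)
{
    fragment.Text = "hi world";
    fragment.TextState.Font = font;
}

// Save the modified document
doc.Save(@"D:\Tests\output.pdf");
```

Iterating over all fragments, rather than indexing TextFragments[1] directly, also avoids an error when the search phrase is not found on the page.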

If you still face the problem, please share your input PDF and custom font here for testing. We will investigate the issue and provide you with more information on it.

Thanks for your reply. I have done some further investigation and found a couple of interesting data points.

Firstly, the PDF document definitely has the font embedded. I verified this by checking the document metadata (please see the attached screenshot).

image.png (52.2 KB)

Secondly, when I install this font locally on my machine, everything looks exactly the way I would expect. So I guess the crux of the issue is: why does it matter whether I have the font installed locally when it’s already embedded inside the PDF? After all, I’m only changing the text of the TextFragment without touching the font or any of the styling. Unfortunately, I am not in a position to install every font on the machine ahead of time, so ideally I need a way to tell Aspose.PDF to use the embedded font instead of searching on my local machine when I’m replacing the text of the PDF fragment.

@Dev123456789

Could you please attach your input PDF that has embedded fonts here for testing? We will investigate the issue and provide you with more information on it.

Please try with any text on this page.

fonts_issue_reproducible_example.pdf (83.0 KB)

@Dev123456789

We have logged this problem in our issue tracking system as PDFNET-51766. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

Thank you for your help with this ticket. May I ask whether there have been any further updates since we last spoke?

@Dev123456789

Unfortunately, there is no update available on this issue at the moment. We will be sure to inform you via this forum thread once there is an update available on it.
