New line characters (\r\n) issue in PDF in C# using Aspose.PDF for .NET: error "Font doesn't include tables to decode text"

When we search for text fragments enclosed in square brackets and replace the [ and ] characters with an empty string, we get the exception below if the text fragment contains \r\n, \n, or non-breaking space (nbsp) characters.

Error - Font doesn’t include tables to decode text
Stack trace -
at #=zRMKYxwdLwla_377mGcBGbItjUAcl.#=z$3pOA6ktFPS_(UInt32 #=zJD2sP0g=)
at #=zRMKYxwdLwla_377mGcBGbItjUAcl.#=z_2JLx_F5mLbb(UInt32 #=zJD2sP0g=)
at #=ztxquhYRowjnsZQxP8GiFPpTJwpx3QJg93V2QFsDU09CBI92XJxVn4ao=.#=zW_Vfbi8Nj3PByiegGg==(#=zdfPIt15GsqAtGWNwksq3JS0= #=zUw0fpKg=, String #=zPMPoiS0=, Int32 #=zxOqyBoQ=, Int32 #=zJHlnqi8=)
at #=ztxquhYRowjnsZQxP8GiFPpTJwpx3QJg93V2QFsDU09CBI92XJxVn4ao=.#=zW_Vfbi8Nj3PByiegGg==(#=zdfPIt15GsqAtGWNwksq3JS0= #=zUw0fpKg=, String #=zPMPoiS0=)
at #=ztxquhYRowjnsZQxP8GiFPpTJwpx3QJg93V2QFsDU09CBI92XJxVn4ao=.#=zISNiZ5j016Ue3U4lnq2kWet9EkvS(String #=zPMPoiS0=, Font #=zUw0fpKg=, Font& #=z$B71qFUm5SM2EGdhq6POH1s=)
at #=z$66iMVMGcSZZ6hCn5zcs8teks$4yDhsZhA==.#=z5FM23$lxFBDDOsRNCA==(Double #=z7pPY6dFIxvI$, TextState #=zMeUYjhU=, String& #=zQ6fxqec=)
at Aspose.Pdf.Text.TextFragment.#=z6z_Y4Sm0lzuS(String #=zQIf$T38=)
at Aspose.Pdf.Text.TextFragment.set_Text(String value)

Below is the code for your reference

using Aspose.Pdf;
using Aspose.Pdf.Text;

var doc = new Aspose.Pdf.Document(path);

// Regex for text enclosed in square brackets; (?s) lets '.' match line breaks.
var searchTerm = @"\[(?s)(.*?)\]";
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(searchTerm);
// true = treat the search term as a regular expression
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;

doc.Pages.Accept(textFragmentAbsorber);

TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
foreach (var textFragment in textFragmentCollection)
{
    // Assigning to Text is the call that throws (see set_Text in the stack trace).
    textFragment.Text = textFragment.Text.Replace("[", "").Replace("]", "");
    textFragment.TextState.ForegroundColor = Color.Black;
    textFragment.TextState.BackgroundColor = Color.Gray;
}
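To make the search term easier to reason about, here is a small sketch that uses only System.Text.RegularExpressions and no Aspose calls. It assumes the intended pattern is \[(?s)(.*?)\], with the brackets escaped, and shows that (?s) lets the lazy group span a line break. BracketPatternDemo and FindBracketed are hypothetical names used only for this illustration; how Aspose.PDF itself evaluates the pattern is not covered here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketPatternDemo
{
    // Assumed intended pattern: literal '[', lazily captured content, literal ']'.
    // (?s) switches on single-line mode, so '.' also matches '\n'.
    public const string Pattern = @"\[(?s)(.*?)\]";

    // Returns the text found between each pair of square brackets.
    public static List<string> FindBracketed(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

For example, BracketPatternDemo.FindBracketed("a [x\r\ny] b [z]") yields two entries, the first of which still contains the \r\n sequence. That is the kind of fragment text reported to trigger the exception.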

Note: this issue looks similar to the following ticket, which has already been raised:

New line characters in PDF in C# using Aspose.PDF for .NET - System.NullReferenceException

@nraj
Please attach the document with which this happens.

PFA the file used

excel-test-document.zip (1.9 MB)

@nraj

This code loads a PDF document, but an Excel document is attached. Or did I misunderstand something?

Actually, the Excel file was converted to PDF, but for your reference, PFA the output PDF as well.

pdf-204bc00e4855473d87e83e577cfb43f0.zip (596.7 KB)

@nraj
What is PFA?
Was the attached Excel document converted to a PDF document, and was the previously mentioned code then applied to it?

PFA - Please find attached

Was the attached Excel document converted to a PDF document, and was the previously mentioned code then applied to it? - Yes, the Excel file was converted to PDF, but with a few modifications, so you can refer directly to the PDF I have attached and ignore the Excel sheet. The attached PDF should work with the code above.

@nraj
I’m sorry, I didn’t see that you had attached a PDF document in the archive. What version of the library are you using? With library version 23.10, I applied the given code to the attached file and did not get an exception.
pdf-204-result.pdf (688.6 KB)

We are using Aspose.PDF 23.9.0. We will test with version 23.10 and see if it works.
Thanks for the updates.

@nraj
The development team is now actively working on bug fixes. Glad your issue has been resolved, thanks for the feedback.

We are still facing this issue, and it is a blocker for us at the moment. Can you please provide an update?

@susmithaputhana
Please attach the document with which this happens and the code you use.

The problem still exists. Please find attached a console application that reproduces the problem, together with the document that causes it.

FontDecode.zip (1.3 MB)

@nraj
Thank you for the data provided, the problem has been reproduced - I will create a task for the development team.

@nraj
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-57280

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Could you please provide an update on the current status of the reported issue with Issue ID(s): PDFNET-57280? Your assistance in this matter would be greatly appreciated.

@nraj
Nothing new on this task yet. Tasks are resolved in the order they are received, taking priorities into account.
The highest priority goes to tasks with paid support, followed by tasks from users who have purchased a license.
The time it takes to resolve a problem also varies, so unfortunately it is not possible to give an ETA.

@sergei.shibanov
Any update on this issue?

@shreyap
Nothing new yet, unfortunately.
I’ll write to you when new information appears.

@sergei.shibanov
As a workaround, we wrote custom code to replace the new line characters:

foreach (var textFragment in textFragmentCollection)
{
    var text = textFragment.Text;
    if (text.Contains("\r\n"))
    {
        text = text.Replace("\r\n", "").Replace("[", "").Replace("]", "");
    }
    else
    {
        text = text.Replace("[", "").Replace("]", "");
    }
    textFragment.Text = text;
    Console.WriteLine($"Text Fragment :{textFragment.Text}");
    textFragment.TextState.ForegroundColor = foregroundColor;
    textFragment.TextState.BackgroundColor = backgroundColor;
}

textFragment.Text = text;

This line throws the error; it does not allow textFragment.Text to be modified.
Please look into this as a priority and let us know as soon as a fix is available, as it is a blocker for us.
Thanks in advance.
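
For anyone reproducing this, the cleanup step of the workaround can be sketched as a standalone helper that also covers a lone \n and nbsp, which the original report mentions. FragmentTextCleaner and Clean are hypothetical names, and replacing nbsp with a regular space is an assumption rather than something stated in this thread. The helper only prepares the string. Nothing in this thread confirms that it avoids the exception, since assigning the result to textFragment.Text is the call reported to throw (tracked as PDFNET-57280).

```csharp
using System;

public static class FragmentTextCleaner
{
    // Strips square brackets and line breaks, and maps a non-breaking
    // space (U+00A0) to a regular space (assumption for illustration).
    public static string Clean(string text)
    {
        if (string.IsNullOrEmpty(text))
        {
            return string.Empty;
        }

        return text
            .Replace("[", "")
            .Replace("]", "")
            .Replace("\r\n", "")   // Windows line break first
            .Replace("\n", "")     // then any remaining LF
            .Replace("\r", "")     // and any remaining CR
            .Replace("\u00A0", " ");
    }
}
```

Inside the loop this would be used as var text = FragmentTextCleaner.Clean(textFragment.Text); which makes it easy to log the cleaned value before the assignment that fails.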