I’ve seen this reported elsewhere on the forum, but it wasn’t resolved there.
Details follow in my next comment.
I’ve read that Aspose only works with a limited number of fonts. Does this mean the problem can’t be resolved? We have no control over the PDFs users upload to our site, so this could happen at any time.
Code snippet follows. The error occurs on the line “pdfDocument.Pages.Accept(textFragmentAbsorber);”
We’re using a newly purchased version of Aspose.Pdf.dll (v23.1.1.0).
The document causing the issue is attached: Problem PDF.pdf (927.3 KB)
using (Document pdfDocument = new Document(@"C:\Users\paul.jones\Downloads\Problem PDF.pdf"))
{
    //create TextAbsorber object to find all instances of the input search phrase
    //using regex ({[^}]*}) ---- (#[^#]*#) -- #SPG#
    TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("(#[^#]*@#)");
    TextSearchOptions textSearchOptions = new TextSearchOptions(true);
    textFragmentAbsorber.TextSearchOptions = textSearchOptions;
    //accept the absorber for all the pages
    pdfDocument.Pages.Accept(textFragmentAbsorber);
    //get the extracted text fragments
    TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
    //loop through the fragments
    foreach (TextFragment textFragment in textFragmentCollection)
    {
        //update text and other properties
        textFragment.Text = "1";
        textFragment.TextState.Font = FontRepository.FindFont("Arial");
        textFragment.TextState.FontSize = 12;
    }
}
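For anyone reusing the snippet: the search pattern only matches placeholders that end in “@#”, so a token like #SPG# from the code comment would not be found by it. Here is a minimal sketch to check what each pattern picks up, assuming Aspose evaluates the pattern with ordinary .NET regex semantics when regex search is enabled (I have not confirmed that). The sample string and the FindTokens helper are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns every substring of `text` matched by `pattern`.
List<string> FindTokens(string text, string pattern)
{
    var tokens = new List<string>();
    foreach (Match match in Regex.Matches(text, pattern))
    {
        tokens.Add(match.Value);
    }
    return tokens;
}

// Made-up sample text containing both placeholder styles.
string sample = "Dear #Name@#, ref #SPG# dated #Date@#";

// Pattern used in the snippet: '#', any run of non-'#' characters, then "@#".
Console.WriteLine(string.Join(", ", FindTokens(sample, "(#[^#]*@#)")));
// prints: #Name@#, #Date@#

// Pattern from the code comment: also matches tokens without the trailing '@'.
Console.WriteLine(string.Join(", ", FindTokens(sample, "(#[^#]*#)")));
// prints: #Name@#, #SPG#, #Date@#

// Without the caret, [#]* only allows further '#' characters, so nothing here matches.
Console.WriteLine(FindTokens(sample, "(#[#]*@#)").Count);
// prints: 0
```

The caret inside the brackets is easy to lose when copying a pattern around, and without it the search silently finds nothing.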
@pjjonah66,
I was able to reproduce the error with the newest version 23.2.
I will be creating a bug report.
@pjjonah66
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-53746
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
@pjjonah66,
Sorry for the late response but I just got a response from the dev team. Here it is:
Hello.
The exception is caused by incorrect data on pages seven and nine.
The content of those pages refers to a font named “F1”, but the page does not contain a description of that font.
You can work around this with the following code.
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
// IgnoreResourceFontErrors - used to ignore missing fonts
textSearchOptions.IgnoreResourceFontErrors = true;
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("(#[^#]*@#)", textSearchOptions);
// now the Accept method works correctly
pdfDocument.Pages.Accept(textFragmentAbsorber);
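For completeness, here is a sketch of how the workaround might slot into the code from the original post. It uses the search pattern from that post. The output path and the final Save call are my own additions, since the original snippet never writes the result anywhere. I have also not checked whether IgnoreResourceFontErrors is available in v23.1.1.0 or only in later builds.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

// Input path from the original post; the output path is a hypothetical placeholder.
string inputPath = @"C:\Users\paul.jones\Downloads\Problem PDF.pdf";
string outputPath = @"C:\Users\paul.jones\Downloads\Problem PDF - updated.pdf";

using (Document pdfDocument = new Document(inputPath))
{
    // Enable regex search and skip fonts that a page references but does not describe.
    TextSearchOptions textSearchOptions = new TextSearchOptions(true);
    textSearchOptions.IgnoreResourceFontErrors = true;

    // Pass the options to the constructor so they apply before the search runs.
    TextFragmentAbsorber textFragmentAbsorber =
        new TextFragmentAbsorber("(#[^#]*@#)", textSearchOptions);

    // Search all pages for the placeholder pattern.
    pdfDocument.Pages.Accept(textFragmentAbsorber);

    // Replace each matched fragment and reset its font.
    foreach (TextFragment textFragment in textFragmentAbsorber.TextFragments)
    {
        textFragment.Text = "1";
        textFragment.TextState.Font = FontRepository.FindFont("Arial");
        textFragment.TextState.FontSize = 12;
    }

    // Assumption: the updated document should be written to a new file.
    pdfDocument.Save(outputPath);
}
```

Since the uploaded PDFs are outside your control, setting the flag on every search rather than only for this file may be the safer default, but that is a judgment call on my part rather than something the dev team stated.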