Corrupted font information after adding text to PDF file

Hello,
We are having an issue with one of the PDF files (test.original.pdf) processed by our software, which uses the Aspose.Pdf library (version 18.5.0.0). We use the library to add text to a PDF file and save it. Opening the processed document (test_corrupted.pdf) in Adobe Acrobat Reader DC (Continuous Release, version 2019.008.20081) produces two errors (1.jpg, 2.jpg), and the inserted text is displayed as a series of dots (3.jpg). The issue does not occur with other PDF files (e.g. working_sample.pdf). This is the fragment of code that inserts the text:

    var page = Document.Pages[parameters.PageNumber];
    var textBuilder = new TextBuilder(page);
    var textParagraph = new TextParagraph();
    
    foreach (var line in parameters.Lines)
    {
    	var textFragment = new TextFragment(line);
    	textFragment.TextState.Font = Aspose.Pdf.Text.FontRepository.OpenFont(parameters.FontFilePath);
    	textFragment.TextState.FontSize = parameters.FontSize;
    	textParagraph.AppendLine(textFragment);
    }
    
    textParagraph.Rectangle = Helper.GetRectangle(page.GetPageRect(false), parameters);
    textBuilder.AppendParagraph(textParagraph);

attachments.zip (157.0 KB)

Regards,
WEBCON Sp. z o.o.

@abruks

Thank you for contacting support.

PDF readers support a core set of 14 fonts so that documents are displayed the same way regardless of the platform they are viewed on. When a PDF uses a font that is not one of the 14 core fonts, you can embed the font when creating the PDF document.

If this does not resolve your problem, please share an SSCCE and the respective font as a zipped file so that we can try to reproduce and investigate the issue in our environment.

@Farhan.Raza

Hello, thanks for the answer.

I am attaching a sample .NET project. For some reason, I was not able to upload the project containing the Aspose.PDF DLL file. Perhaps the ZIP was too large. You will have to supply the project with the file yourself. The DLL we are using is Aspose.Pdf.dll, version 18.5.0.0.

Look for the files you need to reproduce the bug in the Debug folder. There is the font file we are using along with two PDF files - one working fine and one getting corrupted while being processed. In the code, we do not set the Font.IsEmbedded flag explicitly - it is already set to “true” when adding a font from file, the way we do.

When I analyzed the corrupted file, in the PDF structure I could see the embed fields corresponding to the font we are using. You can also see it in the Fonts section of document Properties when opening the corrupted file with Adobe Reader. The font appears embedded, with one difference: it has the “Actual Font: Unknown” line. This is unlike the working document scenario - there, the processed document properties lack the peculiar remark.

I gather from this that the problem lies not in font accessibility.

PDFTextInsertSSCCE.zip (601.3 KB)

Best regards,
WEBCON Sp. z o.o.

@abruks

Thank you for sharing the requested data.

We were able to observe the problem in our environment. A ticket with ID PDFNET-45702 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

Hi
Any update?

@abruks

Thank you for getting back to us.

We are afraid there are currently no updates regarding PDFNET-45702. However, we have recorded your concerns and will let you know as soon as any further update is available.

@abruks

We could reproduce the issue. The source file that becomes corrupted is PDF version 1.2, while the other file, “keeps_working”, is version 1.3.

So, the problem here is with embedded fonts, which can be solved by converting the source file to PDF version 1.3.
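To check which version a given input file declares before deciding whether to convert it, the file header can be read directly. The following is only a minimal sketch and does not use the Aspose.Pdf API: the class and method names (`PdfHeader`, `ReadVersion`) are hypothetical, and it assumes the `%PDF-x.y` marker appears within the first 1024 bytes of the file. Note that for PDF 1.4 and later a `/Version` entry in the document catalog may override the header, which this sketch ignores.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Pdf.
public static class PdfHeader
{
    // Returns the version declared in the "%PDF-x.y" header (e.g. "1.2"),
    // or null when no such header is found in the first 1024 bytes.
    public static string ReadVersion(Stream stream)
    {
        var buffer = new byte[1024];
        int read = stream.Read(buffer, 0, buffer.Length);

        // The header is plain ASCII, so decoding the leading bytes as ASCII is enough.
        var head = Encoding.ASCII.GetString(buffer, 0, read);
        var match = Regex.Match(head, @"%PDF-(\d\.\d)");
        return match.Success ? match.Groups[1].Value : null;
    }

    // Convenience overload that opens the file at the given path.
    public static string ReadVersion(string path)
    {
        using (var file = File.OpenRead(path))
        {
            return ReadVersion(file);
        }
    }
}
```

A caller could, for example, compare the result of `PdfHeader.ReadVersion(path)` with `"1.2"` and run the conversion only for such files.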

We suggest using the following code snippet to solve the issue:

    var strFileName = "becomes_corrupted";
    var document = new Document(DataDir + strFileName + ".pdf");

    // Convert document from v1.2 to v1.3 to fix embedded font issue.
    document.Convert(new PdfFormatConversionOptions(PdfFormat.v_1_3));

    var page = document.Pages[1];
    var textBuilder = new TextBuilder(page);
    var textParagraph = new TextParagraph();
    var textFragment = new TextFragment("sample text");
    textFragment.TextState.Font = FontRepository.FindFont("Tahoma");
    textFragment.TextState.FontSize = 8f;
    textParagraph.AppendLine(textFragment);

    double lowerLeftX = 15d;
    double upperRightX = 115d;
    double upperRightY = 15d;
    double lowerLeftY = 35d;
    textParagraph.Rectangle = new Rectangle(lowerLeftX, lowerLeftY, upperRightX, upperRightY);
    textFragment.TextState.Font.IsEmbedded = false;

    textBuilder.AppendParagraph(textParagraph);

    document.Save(DataDir + "Output.pdf");

Output.pdf (50.5 KB)