Document embedding non-requested fonts on save

I use Aspose.PDF to produce some documents. We’ve noticed that some of them fail because of a missing font when opened with Adobe Reader (it shows a warning saying the “Arial” font could not be found, and part of the text is not displayed). That surprised us, since we don’t use Arial for any of the text. I’ve managed to reproduce what we are experiencing with the following code:

    final var pdfFolderFontSource = new FolderFontSource("/fonts/");
    FontRepository.getSources().add(pdfFolderFontSource);

    var document = new Document();
    var page = document.getPages().add();
    var textFragment = new TextFragment();

    var textSegment = new TextSegment("Testing multiline\nhere is the new line");
    textSegment.getTextState().setFont(FontRepository.findFont("OpenSans-Regular"));
    textFragment.getSegments().add(textSegment);

    for (int i = 0; i < 100; i++) {
        textSegment = new TextSegment("\nTesting multiline\nhere is the new line");
        textSegment.getTextState().setFont(FontRepository.findFont("OpenSans-Bold"));
        textFragment.getSegments().add(textSegment);
    }

    page.getParagraphs().add(textFragment);
    document.save("test.pdf");

The produced document has 4 embedded fonts: Helvetica, Arial, OpenSansRegular and OpenSansBold.
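A quick way to double-check that font list without a PDF viewer is to scan the raw file for font name entries. The following is a minimal sketch using only the Java standard library; the class name `PdfFontNames` and its `scan` helper are made up for this illustration and are not part of Aspose.PDF. It assumes the font dictionaries are stored uncompressed; entries inside compressed object streams will be missed, and a listed name only shows that the font is referenced, not that it is fully embedded.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an Aspose.PDF API: lists font names found in raw PDF bytes.
public class PdfFontNames {
    // Matches "/BaseFont /Name" or "/FontName /Name" (whitespace before the name is optional).
    private static final Pattern FONT_NAME =
            Pattern.compile("/(?:BaseFont|FontName)\\s*/([^\\s/<>\\[\\]()]+)");

    static Set<String> scan(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary streams cannot break decoding.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Set<String> names = new TreeSet<>();
        Matcher m = FONT_NAME.matcher(raw);
        while (m.find()) {
            String name = m.group(1);
            // Subset fonts carry a six-character tag, e.g. "ABCDEF+OpenSans-Bold"; strip it.
            int plus = name.indexOf('+');
            names.add(plus == 6 ? name.substring(7) : name);
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java PdfFontNames <file.pdf>");
            return;
        }
        // Print each distinct font name once, in alphabetical order.
        scan(Files.readAllBytes(Path.of(args[0]))).forEach(System.out::println);
    }
}
```

Running it against the files produced on each machine (for example on macOS and on AWS Lambda) gives a plain list of names that can be compared side by side.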

Some important observations:

  1. If I remove the newline character (\n), the extra fonts (Arial and Helvetica) are not embedded.
  2. If the text fragment does not overflow to the next page, the Arial font is not embedded.

The produced document: test.pdf (51.3 KB)

image.jpg (792.1 KB)

The OpenSans font can be downloaded here: Open Sans - Google Fonts

@samuelmartinucci,

Is the attached document the one that shows the error? I ask because this document opens fine for me (no warning displayed).

If you do not have the Arial font installed on your system, it will probably show an error, as the Arial font is not embedded.

So let's start by clarifying this first. Thanks for all the information you can provide.

@carlos.molina the attached document is just an example of how Aspose.PDF embeds fonts that are not requested or needed into a document under certain conditions (the ones described above).

I produced the PDF by running the code above on macOS. The same code in an AWS Lambda produces a corrupted document (it includes a font called DejaVu, which is not present on macOS, instead of Helvetica). If I open the document produced by the Lambda, I get the error I mentioned previously.

The point is: why are these fonts being embedded if no paragraph uses them? Why are they included when the only difference is a line break?

@samuelmartinucci,

I do not know whether the document was originally created with those fonts and those fragments remain there with no text, but you can do the following and it will fix it:

private void Logic()
{
    var fontSource = new FolderFontSource($@"{prefixPathFonts}\Open_Sans\static\OpenSans");

    FontRepository.Sources.Add(fontSource);

    var doc = new Document($"{PartialPath}_input.pdf");

    var absorber = new TextFragmentAbsorber(new TextEditOptions(TextEditOptions.FontReplace.RemoveUnusedFonts));
    doc.Pages.Accept(absorber);

    // Perform mass operation
    absorber.ApplyForAllFragments(FontRepository.FindFont("OpenSans-Bold"));

    doc.Save($"{PartialPath}_output.pdf");
}

The input and output:
RemovingUnusedFonts_input.pdf (51.3 KB)
RemovingUnusedFonts_output.pdf (33.2 KB)

There is no input document. The code I shared creates an in-memory document from scratch and, as you can see from the code, no paragraph uses Arial or Helvetica. Why are they embedded in the produced document?

@samuelmartinucci,

After several tries I was able to identify one problem: adding \n adds Helvetica. I haven't been able to get Arial added, though.

Can you do me a favor and remove the \n, and check whether Helvetica disappears?

@carlos.molina,

Yes, removing \n removes Helvetica.

Arial only appears if your text fragment overflows the page it was added to (and the \n is present).

Like I said, the fonts being added depend on your OS. On AWS Lambda I get DejaVu instead of Helvetica.

@samuelmartinucci,

I have enough information and code to create a ticket for the dev team. Thanks for your help and patience with me.

@samuelmartinucci
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-54544

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.