Replace text ignores embedded font

Hi,

I’m trying to replace text in a PDF. The replacement needs to keep the font used by the original text, especially if it’s an embedded font.

If I simply replace only the text, it switches the font to Times New Roman. If I try forcing it by using a FontAbsorber to find the font in the document, it replaces the text with blue boxes.

Am I doing something wrong or are there issues with embedded fonts?

I’m using the latest Aspose.Pdf package for .NET in a .NET 6 project.

This is my class:

using Aspose.Pdf;
using Aspose.Pdf.Text;

namespace PDFTest
{
    internal class AsposeTest
    {
        private static Dictionary<string, string> details1 = new Dictionary<string, string>()
        {
            { "{{CONSUMER}}", "Joe Bloggs" },
            { "{{TEACHER}}", "Jane Doe"},
            { "{{DATE}}", DateTime.Now.ToString() }
        };

        public static void Main()
        {
            SetLicense();

            var inputFolder = @"C:\PDFTest\Original PDFs";
            var outputFolder = @"C:\PDFTest\Output PDFs";

            foreach (var fileInfo in new DirectoryInfo(inputFolder).GetFiles())
            {
                // Open document
                Document pdfDocument = new Document(fileInfo.FullName);

                foreach (var detail in details1)
                {
                    // Create a TextFragmentAbsorber to find all instances of the input search phrase
                    TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(detail.Key);

                    // Accept the absorber for all the pages
                    pdfDocument.Pages.Accept(textFragmentAbsorber);

                    // Get the extracted text fragments
                    TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;

                    // Loop through the fragments
                    foreach (TextFragment textFragment in textFragmentCollection)
                    {
                        var fontName = textFragment.TextState.Font.FontName;
                        var tempFontSize = textFragment.TextState.FontSize;
                        var tempForegroundColour = textFragment.TextState.ForegroundColor;
                        var tempBackgroundColour = textFragment.TextState.BackgroundColor;
                        var textState = textFragment.TextState;
                        // Update text and other properties

                        var fontabsorber = new FontAbsorber();
                        fontabsorber.Visit(pdfDocument);

                        //***Test 1, by itself, replaces text with system default Times New Roman
                        textFragment.Text = detail.Value;

                        //***Test 2, replaces the text with boxes.
                        textFragment.TextState.Font = fontabsorber.Fonts.First(x => x.FontName == fontName);
                        textFragment.TextState.FontSize = tempFontSize;
                        if (tempForegroundColour != null) textFragment.TextState.ForegroundColor = tempForegroundColour;
                        if (tempBackgroundColour != null) textFragment.TextState.BackgroundColor = tempBackgroundColour;
                    }
                }

                // Save resulting PDF document.
                pdfDocument.Save(Path.Combine(outputFolder, fileInfo.Name));
            }

        }

        public static void SetLicense()
        {
            // Initialize license object
            Aspose.Pdf.License license = new Aspose.Pdf.License();
            try
            {
                // Set license
                license.SetLicense("Aspose.Pdf.NET.lic");
            }
            catch (Exception)
            {
                // something went wrong
                throw;
            }
            Console.WriteLine("License set successfully.");
        }
    }
}

Thanks,

Mike

@mike.johnson,

It does not matter whether the font is embedded or not. You need to have the font installed on the machine running your code.

Hi,

Thanks for the reply.

Is that an actual limitation with PDFs? It seems odd considering that I’m pretty sure I can view the correctly formatted text in a PDF without the font being embedded, so what’s the point of embedding a font if it can’t be re-used?

If that is the case and it can’t be changed due to a limitation with PDFs, then that’s going to mess up my idea. I’m investigating how we might allow customers to upload a PDF that will be an educational certificate, with text markers for teacher name, student name, date, etc. The hope was that they could supply the PDF fully formed (less work for us) and we’d just swap the text out, keeping all formatting. It’s going to be a shame if they can’t use fancy fonts of their own, and we’re hardly going to allow them to upload any to the system.

I might as well promote a bit of healthy competition as well. I’ve tested, found the same problem and asked the same question with IronPDF. Their response was basically “we know what the problem is, but it’s complicated. Hoping to fix it by end of Q3”.

@mike.johnson,

The point of embedding a font is for viewing purposes. But whenever you edit the document, you will have issues if the font is not present. This cannot be solved because it was designed that way. You need to understand that every font has a usage agreement (licensing), and for some of them, you have to pay even to embed them in documents, and even more to distribute them for editing.

I hope that clarifies why it works that way.

Basically, fonts are not always free, and some of them are only free in some cases. So, in order to avoid legal issues, it is left to the owner of the server (machine) to decide which fonts to install, since it is their responsibility to accept the terms of the fonts installed.

Aha! OK, that makes sense. I appreciate the clarification - I was totally missing the licensing issue.

Thanks!
