PDF file fonts and formats not matching the word format provided after generation

@alexey.noskov I have just attached the bare minimum fonts I am copying for testing purposes.

fonts.zip (3.2 MB)

per my latest changes:

public static void printFontInfo1(Document document)
{
    log.info("Number of system font resources" + document.getFontSettings().getFontsSources().length);
    for (FontSourceBase src : document.getFontSettings().getFontsSources())
    {
        log.info("The source is::::>" + src.toString());
        log.info("Priority is:::>" + src.getPriority());
        log.info("Type of font folder is:::>" + src.getType());
        for (PhysicalFontInfo fontInfo : src.getAvailableFonts())
            log.info(fontInfo.getFullFontName());
    }
}

The corresponding log follows. Note that src.getAvailableFonts() always returns zero fonts, even though the directory contains the fonts I have attached here. The other log lines come through fine (in fact, I set the priority to 100 for debugging purposes).

    2022-08-05 08:11:20.144  INFO 22133 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : The source is::::>com.aspose.words.FolderFontSource@25b75231
    2022-08-05 08:11:24.533  INFO 22133 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Priority is:::>100
    2022-08-05 08:11:26.894  INFO 22133 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Type of font folder is:::>1
    2022-08-05 08:15:26.080 DEBUG 22133 --- [   scheduling-1] o.s.jdbc.core.JdbcTemplate               : Executing SQL query [SELECT 'Hello' from DUAL]
    2022-08-05 08:15:26.259  INFO 22133 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Times New Roman' has not been found. Using 'Fanwood' font instead. Reason: first available font.
    2022-08-05 08:15:26.364  INFO 22133 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Book Antiqua' has not been found. Using 'Fanwood' font instead. Reason: first available font.
    2022-08-05 08:15:26.595  INFO 22133 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Symbol' has not been found. Using 'Fanwood' font instead. Reason: first available font.
    2022-08-05 08:15:26.595  INFO 22133 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Courier New' has not been found. Using 'Fanwood' font instead. Reason: first available font.

and from that point onwards, the final fallback font Fanwood is used.

BTW, after trying many different options, this is what I use to set the FolderFontSource (it doesn't matter if I use StreamFontSource instead; the result is the same):

try
{
    log.info("generatePDFAspose setting fontsettings");
    log.info("Calling the warningcallback in PDFBoxPDFGenerator generatePDFAspose");
    FontSettings fontSettings = new FontSettings();

    template.setWarningCallback(new AsposeDocumentToPDDocumentConverter.FontSubstitutionWarningCollector());
    PdfSaveOptions pdfSaveOptions = new PdfSaveOptions();
    pdfSaveOptions.setFontEmbeddingMode(0);
    pdfSaveOptions.setEmbedFullFonts(true);
    pdfSaveOptions.setUseCoreFonts(false);

    fontSettings.setFontsSources(
            new FontSourceBase[] {
                    new FolderFontSource(TMP_DIR + "/fonts", true, 100),
            }
    );
    template.setFontSettings(fontSettings);

    log.info("Printing system folders...after setting template.setFontSettings....");
    printFontInfo1(template);

    template.save(pdfOutputStream, pdfSaveOptions);

    template.setWarningCallback(new AsposeDocumentToPDDocumentConverter.FontSubstitutionWarningCollector());
}
catch (Exception e)
{
    // Pass the exception as the last argument so the full stack trace is logged.
    log.error("generatePDFAspose() failed", e);
}

Please let me know if you have more questions.

@sathsy Thank you for the additional information. The problem is with your fonts: they are all broken.

It looks like the font files were broken either when they were packaged into the JAR file or when they were extracted.
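One way to narrow down where the files get damaged (build, packaging, or extraction) is to compare a checksum of the original font file with a checksum of the copy that ends up on the server. The following is a minimal sketch using only the JDK; the class and method names are hypothetical and are not part of Aspose.Words, and the two paths are whatever locations apply in your setup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of Aspose.Words.
final class FontChecksum {

    /** Returns the SHA-256 digest of the given bytes as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant JDK ships SHA-256, so this should not happen.
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    /** Returns the SHA-256 digest of a file's content as lowercase hex. */
    static String sha256Hex(Path file) {
        try {
            return sha256Hex(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when both files have byte-for-byte identical content. */
    static boolean sameContent(Path original, Path deployed) {
        return sha256Hex(original).equals(sha256Hex(deployed));
    }

    public static void main(String[] args) {
        // Usage: java FontChecksum <original font> <deployed font>
        Path original = Paths.get(args[0]);
        Path deployed = Paths.get(args[1]);
        System.out.println("original: " + sha256Hex(original));
        System.out.println("deployed: " + sha256Hex(deployed));
        System.out.println("identical: " + sameContent(original, deployed));
    }
}
```

If the digests differ, the file was altered somewhere between the source tree and the target folder; running the same comparison against the file inside the built JAR shows whether the build or the extraction step is responsible.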

@alexey.noskov, the picture is not accessible. Can you please reattach it?

OK, I am validating the font files with the font viewer on my Mac and copying them again.

@sathsy I have changed the topic ownership. Now you should be able to see the image. It is a simple Windows error message screenshot that states “The requested file is not a valid font file”.

Do you think that is the reason src.getAvailableFonts() in the code above returns no fonts, while a regular directory listing in code is able to list the font files?

@sathsy Aspose.Words simply does not recognize the listed files as valid fonts, so they are not listed. FontSourceBase.getAvailableFonts returns the list of fonts Aspose.Words is able to read as fonts, not a list of files.

@alexey.noskov thanks for persistently working with us.

So after validating the fonts on my local Mac, the fonts are listed fine. However, in PCF it is the same story.

The first log is from my local machine (a couple of fonts are expected to be missing):

    2022-08-05 09:22:04.557  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : The source is::::>com.aspose.words.FolderFontSource@58d268dc
    2022-08-05 09:22:05.815  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Priority is:::>100
    2022-08-05 09:22:06.919  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Type of font folder is:::>1
    2022-08-05 09:22:35.793  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Arial Bold Italic
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Arial Bold
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Arial Italic
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Arial
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Times New Roman Bold Italic
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Times New Roman Bold
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Times New Roman Italic
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Times New Roman
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Verdana Bold Italic
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Verdana Bold
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Verdana Italic
    2022-08-05 09:22:35.794  INFO 30982 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Verdana
    2022-08-05 09:22:38.043  INFO 30982 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Symbol' has not been found. Using 'Arial' font instead. Reason: font info substitution.
    2022-08-05 09:22:38.044  INFO 30982 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Courier New' has not been found. Using 'Arial' font instead. Reason: font info substitution.

And here is the PCF log, where the fonts are again not being read:

    2022-08-05T09:33:21.86-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:21.860  INFO 33 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : The source is::::>com.aspose.words.FolderFontSource@6a26daea
    2022-08-05T09:33:21.86-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:21.860  INFO 33 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Priority is:::>100
    2022-08-05T09:33:21.86-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:21.860  INFO 33 --- [enerContainer-1] c.s.s.o.p.c.a.DocumentResponseExtractor  : Type of font folder is:::>1
    2022-08-05T09:33:21.88-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:21.888  INFO 33 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Arial' has not been found. Using 'Fanwood' font instead. Reason: first available font.
    2022-08-05T09:33:22.09-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:22.092  INFO 33 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Times New Roman' has not been found. Using 'Fanwood' font instead. Reason: first available font.
    2022-08-05T09:33:22.21-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:22.213  INFO 33 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Symbol' has not been found. Using 'Fanwood' font instead. Reason: first available font.
    2022-08-05T09:33:22.21-0700 [APP/PROC/WEB/0] OUT 2022-08-05 16:33:22.214  INFO 33 --- [enerContainer-1] .c.c.AsposeDocumentToPDDocumentConverter : Font 'Courier New' has not been found. Using 'Fanwood' font instead. Reason: first available font.

@alexey.noskov - Good news. The culprit was Maven! When filtering resources, it was corrupting the TTF / TTC files. With this simple configuration, we are able to embed fonts and get the correct fonts in PCF.

For anyone having this issue in the future, here is what is needed if you are using Maven:

<plugin>
	<artifactId>maven-resources-plugin</artifactId>
	<version>3.2.0</version>
	<configuration>
		<encoding>UTF-8</encoding>
		<nonFilteredFileExtensions>
			<nonFilteredFileExtension>ttf</nonFilteredFileExtension>
			<nonFilteredFileExtension>ttc</nonFilteredFileExtension>
		</nonFilteredFileExtensions>
	</configuration>
</plugin>
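For anyone wondering why filtering damages font files: as far as I understand it, resource filtering treats each file as text in the configured encoding, and binary data does not survive being decoded and re-encoded as text. The snippet below is only a rough simulation of that effect with the plain JDK, not the plugin's actual code; the class name and the sample bytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; simulates a text-mode copy of binary data.
final class FilteringCorruptionDemo {

    /** Decodes the bytes as UTF-8 text and encodes the text back to bytes. */
    static byte[] textRoundTrip(byte[] original) {
        // Byte sequences that are not valid UTF-8 are replaced during decoding,
        // so the re-encoded output can differ from the input.
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up bytes standing in for font data; 0xFF and 0xFE are not valid UTF-8.
        byte[] fontLikeBytes = {0x00, 0x01, 0x00, 0x00, (byte) 0xFF, (byte) 0xFE};
        byte[] copied = textRoundTrip(fontLikeBytes);

        System.out.println("identical after round trip: "
                + Arrays.equals(fontLikeBytes, copied));
        System.out.println("length before: " + fontLikeBytes.length
                + ", after: " + copied.length);
    }
}
```

Plain ASCII content passes through such a round trip unchanged, which is why text resources are unaffected while TTF / TTC files end up unreadable. Excluding the font extensions from filtering keeps them as byte-for-byte copies.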

Other than that, use code like the following to add a FolderFontSource, and that should do the trick:

fontSettings.setFontsSources(
    new FontSourceBase[] {
            new FolderFontSource(TMP_DIR + "/fonts", true, 100),
    }
);
document.setFontSettings(fontSettings); //Aspose word document - this is needed before converting to PDF.

Thank you so much for your constant support and help @alexey.noskov. Much appreciated!

@sathsy It is great that you managed to resolve the problem, and thank you for sharing your solution. Please feel free to ask in case of any issues; we will be glad to help you.
