DocumentBuilder.insertHtml: how to keep the font family even if the font is not available to Aspose

Is there a way to insert an HTML fragment that specifies a font family into a document when that font is not available to Aspose, without the font being replaced by Arial or another substitute?

Example:

<p style="margin-top:0pt; margin-bottom:0pt; font-size:11pt">
	<span style="font-family:Calibri">Text</span>
</p>

I couldn’t paste the HTML as it was, I added “!!” in the last tag.

If Calibri is not installed on the system where Aspose runs, the font is changed to Arial.

@federico.mameli Font family is preserved:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.insertHtml("<p style=\"margin-top:0pt; margin-bottom:0pt; font-size:11pt\"><span style=\"font-family:Calibri\">Text</span></p>");
doc.save("C:\\Temp\\out.docx");

But if you save the document to a fixed-page format such as PDF, the font will be substituted if the specified font is not available. Fonts are required to build the document layout, so if Aspose.Words cannot find a font used in the document, it substitutes another one. This can lead to a font mismatch and to layout differences, because the substitute font has different metrics. You can implement IWarningCallback to get notifications when font substitution is performed.
Please see our documentation to learn where Aspose.Words looks for fonts:
https://docs.aspose.com/words/java/specifying-truetype-fonts-location/

Even if I save it to docx, the font is substituted.
In my case I do not need to save it in any case, I use a document just to re-extract the HTML.

Let me try to explain it better.
I extract the HTML for a specific named range from an xlsx document using Aspose.Cells; let’s call it ExcelHTML.
The ExcelHTML is complicated, with JavaScript and CSS I don’t need.
I noticed that if I insert that ExcelHTML into a document and then extract it again like this:

Code

NodeCollection childNodes = document.getChildNodes(NodeType.TABLE, true);
Node node = childNodes.get(0);
return node.toString(SaveFormat.HTML);

I obtain a more compact HTML (let’s call it WordHTML) that is more suitable for my purposes.

So, even if I don’t save the document, the WordHTML has its fonts substituted.

@federico.mameli Unfortunately, I cannot reproduce the problem on my side. I have used the following simple code for testing:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.insertHtml("<p style=\"margin-top:0pt; margin-bottom:0pt; font-size:11pt\"><span style=\"font-family:Calibri\">Text</span></p>");
String extractedHtml = doc.toString(SaveFormat.HTML);

Here is the extracted HTML:

<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta http-equiv="Content-Style-Type" content="text/css" />
    <meta name="generator" content="Aspose.Words for Java 24.4.0" />
    <title></title>
</head>
<body style="font-family:'Times New Roman'; font-size:12pt">
    <div><p style="margin-top:0pt; margin-bottom:0pt; font-size:11pt"><span style="font-family:Calibri">Text</span></p><p style="margin-top:0pt; margin-bottom:0pt"><span>&#xa0;</span></p></div>
</body>
</html>

As you can see, the font in the extracted HTML is still Calibri.

I don’t know if in my case the problem is the HTML (the table for example) or the Font.

Are you sure Calibri is not installed on the system where you ran the test?

Can you try with the attached test? I replaced the original font with a NotExistingFont, to be sure.

Thanks,
Federico

MiniAsposeTest.zip (4.3 KB)

@federico.mameli I have tested with non-existing font and still everything works as expected:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.insertHtml("<p style=\"margin-top:0pt; margin-bottom:0pt; font-size:11pt\"><span style=\"font-family:SomeNotExistingFont\">Text</span></p>");
String extractedHtml = doc.toString(SaveFormat.HTML);

Here is the output HTML:

<html>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta http-equiv="Content-Style-Type" content="text/css" />
    <meta name="generator" content="Aspose.Words for Java 24.4.0" />
    <title></title>
</head>
<body style="font-family:'Times New Roman'; font-size:12pt">
    <div><p style="margin-top:0pt; margin-bottom:0pt; font-size:11pt"><span style="font-family:SomeNotExistingFont">Text</span></p><p style="margin-top:0pt; margin-bottom:0pt"><span>&#xa0;</span></p></div>
</body>
</html>

I have tested with your HTML and managed to reproduce the problem with the following simplified HTML:

<html>
<head>
    <style>
        .font8 {
            font-family: NotExistingFont,sans-serif;
        }
    </style>
</head>
<body>
    <p>
        <font class="font8">Some text</font>
    </p>
</body>
</html>

If the font style is changed so that it lists only a single font family, like this:

.font8 {
    font-family: NotExistingFont
}

the problem is not reproducible.
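
Based on this observation, one possible workaround is to preprocess the HTML before passing it to insertHtml so that every font-family declaration keeps only its first entry. The following is a minimal sketch in plain Java, not part of the Aspose API: the class FontFamilyFallbackStripper and the method keepFirstFontFamily are hypothetical names introduced for illustration. It assumes font names are either unquoted or single-quoted and does not handle double-quoted names inside style blocks. Whether removing the fallback entries avoids the substitution in every case is an assumption drawn from the two tests above, not something confirmed here.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not an Aspose class): trims font-family lists
// such as "NotExistingFont,sans-serif" down to their first entry.
final class FontFamilyFallbackStripper {

    // Group 1: "font-family:" plus the first family (single-quoted or unquoted).
    // The trailing non-capturing group matches one or more ", fallback" entries.
    private static final Pattern FONT_FAMILY_LIST = Pattern.compile(
            "(font-family\\s*:\\s*(?:'[^']*'|[^,;\"'}<>]+))"
                    + "(?:\\s*,\\s*(?:'[^']*'|[^,;\"'}<>]+))+",
            Pattern.CASE_INSENSITIVE);

    private FontFamilyFallbackStripper() {
    }

    // Returns the HTML with every font-family list reduced to its first family.
    // Declarations that already name a single family are left unchanged.
    static String keepFirstFontFamily(String html) {
        return FONT_FAMILY_LIST.matcher(html).replaceAll("$1");
    }
}
```

With this helper, the call would become builder.insertHtml(FontFamilyFallbackStripper.keepFirstFontFamily(html)), where html is the fragment to insert. A regular expression is used here only to keep the sketch short; a real CSS parser would be more robust for complex stylesheets.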

Ok, it makes sense.
So the problem seems to happen only when the HTML produced by Aspose.Cells expresses font-family as a list, in the form “Font1, Font2”, and Font1 (or maybe both) is not available to Aspose.Words.

Well, I’ll try asking the Aspose.Cells team about this.
Thanks for now.
