We are using the Aspose.Words component to insert HTML into our Word template. The issue we are seeing is that when the HTML to be inserted specifies font-size in pixels, the text in the generated document is visibly smaller.
My understanding is that a 12px font size in HTML is not equal to 12pt in a Word document (it shows as around 8pt instead). What is the best way to fix this without changing the font sizes in the HTML?
However, MS Word 2019 produces similar output when the attached HTML file is saved in DOCX format, and Aspose.Words 21.4 tries to mimic the behavior of MS Word.
I am not sure whether you noticed the issue. The actual issue is related to font size; I have marked it clearly in the image attached below. I hope this clarifies the issue:
Please check whether the following workaround is acceptable for you:
// Requires: using System.IO; using Aspose.Words; using Aspose.Words.Replacing;
string htmlString = File.ReadAllText("C:\\Temp\\expected\\input_html.html");
Document oDoc = new Document("C:\\Temp\\expected\\input.docx");
// Rescale the font size of every Run inserted from this point on (class defined below).
oDoc.NodeChangingCallback = new HandleNodeChanging_FontSizePointsToPixels();
FindReplaceOptions options = new FindReplaceOptions();
options.Direction = FindReplaceDirection.Forward;
options.MatchCase = false;
options.FindWholeWordsOnly = true;
// WordDocReplaceHandler (not shown here) is expected to insert the HTML in place of each match.
options.ReplacingCallback = new WordDocReplaceHandler();
oDoc.Range.Replace("Hello", htmlString, options);
oDoc.Save("C:\\temp\\expected\\21.4.docx");
public class HandleNodeChanging_FontSizePointsToPixels : INodeChangingCallback
{
    void INodeChangingCallback.NodeInserted(NodeChangingArgs args)
    {
        // Font sizes loaded from HTML are already in points. Scale each Run back
        // by 96/72 so that N px in the HTML ends up as roughly N pt in the document.
        if (args.Node.NodeType == NodeType.Run)
        {
            Aspose.Words.Font font = ((Run)args.Node).Font;
            font.Size = ConvertUtil.PointToPixel(font.Size);
        }
    }

    void INodeChangingCallback.NodeInserting(NodeChangingArgs args)
    {
        // Do nothing
    }

    void INodeChangingCallback.NodeRemoved(NodeChangingArgs args)
    {
        // Do nothing
    }

    void INodeChangingCallback.NodeRemoving(NodeChangingArgs args)
    {
        // Do nothing
    }
}
Thank you, that workaround seems to work. But I have another sample HTML: the word “Joo” has 11px in the HTML, and the generated output has 11.5pt. Is there any other way to fix this?
We have logged your requirement in our issue tracking system. Your ticket number is WORDSNET-22220. We will further look into the details of this requirement and will keep you updated on the status of the linked issue.
Regarding WORDSNET-22220, we have completed the analysis of this issue and decided to close it with “not a bug” status. The analysis reveals the following:
Like MS Word, Aspose.Words stores font size in half points. When a font size is loaded from HTML, it is converted from pixels to points and rounded to the nearest half point. For example: 14px => 14 / 96 * 72 = 10.5pt; 11px => 11 / 96 * 72 = 8.25pt, rounded up to 8.5pt. The same rounding is performed when the font size is modified by setting the “Font.Size” value. That’s why 11px is converted and rounded up to 8.5pt on loading and then converted and rounded up to 11.5pt by the workaround code. This behavior is by design and matches what MS Word does.
Unfortunately, the only correct way to get the same font size as in the HTML is to change the units of the “font-size” values in the HTML document from “px” to “pt”, which removes all conversion and rounding. This can be done, for example, with a regex replace after the HTML is loaded into a string in memory.
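To make the arithmetic above easy to check, here is a minimal sketch of the load-time conversion in plain C#. It does not call Aspose.Words; the class and method names are made up for illustration, and the rounding rule (nearest half point, with exact midpoints going up) is an assumption chosen because it reproduces the values quoted in this thread.

```csharp
using System;

// Hypothetical helper, for illustration only (not part of Aspose.Words).
public static class HalfPointMath
{
    // CSS pixels to points at 96 DPI: 1px = 72/96 pt.
    // Multiplying before dividing keeps values such as 8.25 exact in floating point.
    public static double PixelsToPoints(double px) => px * 72.0 / 96.0;

    // Assumed rounding rule: nearest half point, exact midpoints go up.
    public static double RoundToHalfPoint(double pt) =>
        Math.Round(pt * 2.0, MidpointRounding.AwayFromZero) / 2.0;

    // Font size that ends up in the document model for a px value in the HTML.
    public static double LoadedFontSize(double px) =>
        RoundToHalfPoint(PixelsToPoints(px));
}
```

With these assumptions, HalfPointMath.LoadedFontSize(14) gives 10.5 and HalfPointMath.LoadedFontSize(11) gives 8.5, the same figures as in the analysis above.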
Thank you for the details; what you are saying makes sense. I need help with two things:
First, you mention that 11px => 11 / 96 * 72 = 8.25, rounded to 8.5pt. But when I use 11px in my HTML, it becomes 11.5 in the output document. How does this conversion happen?
Second, I would like to continue using the NodeChangingCallback on the document. But one issue I have seen is that if my HTML is already in points, as in the attached example, the code below still converts it. That is, if I have 11pt in my HTML, the output shows 14.5. How can I avoid this?
void INodeChangingCallback.NodeInserted(NodeChangingArgs args)
{
    // set back font size of every Run node from points to pixels
    if (args.Node.NodeType == NodeType.Run)
    {
        Aspose.Words.Font font = ((Run)args.Node).Font;
        font.Size = ConvertUtil.PointToPixel(font.Size);
    }
}
When HTML is loaded to the document model:
11px => 11 / 96 * 72 = 8.25pt, rounded up to 8.5pt
After that, when the INodeChangingCallback.NodeInserted processes the text, ConvertUtil.PointToPixel does this:
8.5pt => 8.5 / 72 * 96 = 11.333px, which is then treated as 11.333pt and is rounded up to 11.5pt in the setter of Font.Size.
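The two steps can be replayed outside Aspose.Words with a small sketch. The names below are hypothetical; the 96/72 factor mirrors the ConvertUtil.PointToPixel arithmetic shown above, and the half-point rounding (nearest half point, midpoints up) is an assumption that matches the numbers reported in this thread.

```csharp
using System;

// Hypothetical model of the round trip, for illustration only.
public static class RoundTripSketch
{
    // Assumed rounding applied whenever a font size is stored.
    private static double RoundToHalfPoint(double value) =>
        Math.Round(value * 2.0, MidpointRounding.AwayFromZero) / 2.0;

    // Step 1: size stored when a px value is loaded from HTML.
    public static double OnLoad(double px) =>
        RoundToHalfPoint(px * 72.0 / 96.0);

    // Step 2: size stored after the workaround scales the points by 96/72
    // and assigns the result back to the font size.
    public static double AfterWorkaround(double storedPt) =>
        RoundToHalfPoint(storedPt * 96.0 / 72.0);
}
```

Under these assumptions, 11px goes 8.5 and then 11.5, while 12px goes 9 and then 12, so only sizes that do not land on a half point drift. A size that was already 11pt in the HTML is stored as 11 and then becomes 14.5, which is the value you observed.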
The only way is to somehow pass the information about the units of “font-size” values from HTML to INodeChangingCallback.NodeInserted and conditionally disable the conversion. When HTML is loaded by Aspose.Words, all font sizes are converted to points, so that information cannot be retrieved from the document model.
On the second point, I tried to disable the conversion under a certain condition, but I cannot find the source unit type (px/pt) anywhere to base the condition on. Is there any workaround to disable the conversion based on the unit type?
I am afraid the whole problem we are trying to solve (treating “px” in HTML as “pt”) looks like a hack, and there is no simple and logical solution. If you like the approach that uses INodeChangingCallback.NodeInserted, we would recommend parsing the source HTML using regular expressions, extracting the “font-size” values and checking whether they are in “px” or “pt”. This information can then be passed to INodeChangingCallback.NodeInserted in order to disable the ConvertUtil.PointToPixel conversion when font sizes are specified in “pt”.
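A rough sketch of the detection step might look like this. The class name, method name and pattern are assumptions for illustration; the pattern only recognises plain “font-size: <number>px|pt” declarations and treats the whole HTML fragment as using one unit, so a fragment that mixes “px” and “pt” cannot be handled this way.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class FontSizeUnits
{
    // Matches e.g. "font-size: 11px" or "font-size:8.5pt" and captures the unit.
    private static readonly Regex FontSizeDeclaration = new Regex(
        @"font-size\s*:\s*\d*\.?\d+(px|pt)\b", RegexOptions.IgnoreCase);

    // True if at least one font-size declaration in the HTML uses pixels.
    public static bool UsesPixelFontSizes(string html)
    {
        foreach (Match match in FontSizeDeclaration.Matches(html))
        {
            if (string.Equals(match.Groups[1].Value, "px", StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

The result could then be handed to the callback class, for example through a constructor parameter that you add yourself, and NodeInserted would skip the ConvertUtil.PointToPixel call when the value is false.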
If we were writing this code, however, we wouldn’t use INodeChangingCallback.NodeInserted. Instead, we would pre-process source HTML and replace “px” in “font-size” declarations with “pt” using regular expressions. Then we would let Aspose.Words load modified HTML normally.
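As a starting point, the pre-processing step could be sketched as follows. The helper name and the regular expression are our assumptions, not an Aspose.Words API; the pattern only rewrites plain “font-size: <number>px” declarations and leaves other pixel values (widths, margins, the “font” shorthand) untouched.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class HtmlFontSizePreprocessor
{
    // Captures "font-size: <number>" so that only the trailing "px" is replaced.
    private static readonly Regex PixelFontSize = new Regex(
        @"(font-size\s*:\s*\d*\.?\d+)px\b", RegexOptions.IgnoreCase);

    // Returns the HTML with every matched "px" font size relabelled as "pt".
    public static string PixelsToPointUnits(string html) =>
        PixelFontSize.Replace(html, "${1}pt");
}
```

The string returned by HtmlFontSizePreprocessor.PixelsToPointUnits would be passed to the replace call instead of the raw file contents, and the NodeChangingCallback assignment would be removed so that no second conversion takes place.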