Free Support Forum - aspose.com

Insert HTML in Word Document using C# .NET | Transform Font Size Styles from Pixel to Points

Hello,

We are using the Aspose.Words component to insert HTML into our Word template. The issue is that when the HTML specifies font-size in pixels, the generated document shows a visibly smaller font size.

As I understand it, a 12px font-size in HTML is not equal to 12pt in a Word document (it shows as around 8pt instead). What is the best way to fix that without changing the HTML font sizes?

@srinudhulipalla,

Please ZIP and attach the following resources here for testing:

  • Your simplified source Word document
  • The HTML string that you want to insert into the above Word template
  • Aspose.Words for .NET 21.4 generated output DOCX file showing the undesired behavior
  • Your expected DOCX file showing the desired output. You can create this document manually by using MS Word.

As soon as you have these pieces of information ready, we will investigate your scenario further and provide you with more information.

Here are the supporting files to replicate the problem from your side.

Attached files:
Program.zip (1.0 KB)
Expected.zip (37.1 KB)

I have also created the Expected.docx file manually. As you can see, Output.docx is created with 10.5pt Arial, but I am expecting 14pt Arial.

How do we fix this in Aspose without altering our HTML code?

@srinudhulipalla,

MS Word 2019 also produces similar output when saving the attached HTML file in DOCX format, and Aspose.Words 21.4 tries to mimic the behavior of MS Word.

So, this is expected behavior in Aspose.Words.

Hi,

I am not sure whether you noticed the issue. The actual issue is related to font-size, as illustrated in the image attached below. I hope this clarifies the issue:

image.png (39.3 KB)

Is there any way to fix it in Aspose.Words?

@srinudhulipalla,

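Please check whether the following workaround is acceptable for you:

using System;
using System.IO;
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

string htmlString = File.ReadAllText("C:\\Temp\\expected\\input_html.html");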

Document oDoc = new Document("C:\\Temp\\expected\\input.docx");

oDoc.NodeChangingCallback = new HandleNodeChanging_FontSizePointsToPixels();

FindReplaceOptions options = new FindReplaceOptions();
options.Direction = FindReplaceDirection.Forward;
options.MatchCase = false;
options.FindWholeWordsOnly = true;
options.ReplacingCallback = new WordDocReplaceHandler();

oDoc.Range.Replace("Hello", htmlString, options);

oDoc.Save("C:\\temp\\expected\\21.4.docx");

public class HandleNodeChanging_FontSizePointsToPixels : INodeChangingCallback
{
    void INodeChangingCallback.NodeInserted(NodeChangingArgs args)
    {
        // Scale the font size of every inserted Run by the point-to-pixel factor,
        // so the numeric size in points matches the original px value in the HTML
        if (args.Node.NodeType == NodeType.Run)
        {
            Aspose.Words.Font font = ((Run)args.Node).Font;
            font.Size = ConvertUtil.PointToPixel(font.Size);
        }
    }

    void INodeChangingCallback.NodeInserting(NodeChangingArgs args)
    {
        // Do Nothing
    }

    void INodeChangingCallback.NodeRemoved(NodeChangingArgs args)
    {
        // Do Nothing
    }

    void INodeChangingCallback.NodeRemoving(NodeChangingArgs args)
    {
        // Do Nothing
    }
}

internal class WordDocReplaceHandler : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        try
        {
            Regex regHTML = new Regex(@"<\s*([^ >]+)[^>]*>.*?<\s*/\s*\1\s*>");
            bool isHTML = regHTML.IsMatch(e.Replacement);

            DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);

            if (e.MatchNode.GetText().ToLower().Contains("mergefield"))
            {
                if (isHTML)
                {
                    builder.MoveToMergeField(e.Match.Value);
                    builder.InsertHtml(e.Replacement, true);
                }

                return ReplaceAction.Skip;
            }
            else
            {
                if (isHTML)
                {
                    builder.MoveTo(e.MatchNode);
                    builder.InsertHtml(e.Replacement);
                    e.Replacement = string.Empty;
                }

                return ReplaceAction.Replace;
            }
        }
        catch (Exception)
        {
            return ReplaceAction.Replace;
        }
    }
}

Thank you, that workaround seems to work. But I have another sample HTML in which the word “Joo” has an 11px font size, and the generated output has 11.5pt. Is there any alternative fix?

input_html.zip (4.5 KB)

@srinudhulipalla,

We have logged your requirement in our issue tracking system. Your ticket number is WORDSNET-22220. We will further look into the details of this requirement and will keep you updated on the status of the linked issue.

@srinudhulipalla,

Regarding WORDSNET-22220, we have completed the analysis of this issue and decided to close it with “not a bug” status. The analysis reveals the following:

Like MS Word, Aspose.Words stores font sizes in half points. When a font size is loaded from HTML, it is converted from pixels to points and rounded up to the nearest half point. For example: 14px => 14 / 96 * 72 = 10.5pt; 11px => 11 / 96 * 72 = 8.25pt, rounded up to 8.5pt. The same rounding is performed when the font size is modified by setting the “Font.Size” value. That is why 11px is converted and rounded up to 8.5pt on loading, and then converted back (8.5 / 72 * 96 ≈ 11.33) and rounded up to 11.5pt by the workaround code. This behavior is by design and is the same as what MS Word does.
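
The arithmetic above can be reproduced with a small sketch. It assumes 96 pixels per inch and models the rounding as "round up to the next half point", as described in the analysis. The helper names (RoundUpToHalfPoint, PxToStoredPoints, WorkaroundSize) are hypothetical and are not part of the Aspose.Words API.

```csharp
using System;

// Assumed rounding model: sizes are stored in half points, rounded up.
static double RoundUpToHalfPoint(double points) => Math.Ceiling(points * 2.0) / 2.0;

// px -> pt at an assumed 96 DPI, then rounded as on HTML load.
static double PxToStoredPoints(double px) => RoundUpToHalfPoint(px * 72.0 / 96.0);

// What the workaround does numerically: scale the stored pt value by 96/72,
// then the result is rounded again when it is assigned as a font size.
static double WorkaroundSize(double storedPoints) => RoundUpToHalfPoint(storedPoints * 96.0 / 72.0);

foreach (double px in new[] { 14.0, 11.0 })
{
    double loaded = PxToStoredPoints(px);
    double afterWorkaround = WorkaroundSize(loaded);
    Console.WriteLine($"{px}px -> loaded as {loaded}pt -> workaround gives {afterWorkaround}pt");
}
```

Under this model 14px survives the round trip because 10.5pt is already an exact half point, while 11px does not: 8.25pt is rounded to 8.5pt first, and scaling 8.5pt back gives about 11.33, which rounds to 11.5pt.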

Unfortunately, the only correct way to get the same font size as in the HTML is to change the units of the “font-size” values in the HTML document from “px” to “pt”, which avoids all conversion and rounding. This can be done, for example, with a regex replace after the HTML has been loaded into a string in memory.
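
A minimal sketch of such a regex replace is shown here. The helper name FontSizePxToPt and the pattern are assumptions made for illustration. The pattern only handles numeric “font-size: Npx” declarations and leaves other pixel values (such as widths) untouched. It does not cover the CSS “font” shorthand or sizes defined in other units.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: rewrites "font-size: <number>px" to "font-size: <number>pt".
// Group 1 captures the property name and the number; only the unit is swapped.
static string FontSizePxToPt(string html) =>
    Regex.Replace(
        html,
        @"(font-size\s*:\s*\d+(?:\.\d+)?)\s*px\b",
        "${1}pt",
        RegexOptions.IgnoreCase);

string sample = "<span style=\"font-family:Arial; font-size: 11px\">Joo</span>";
Console.WriteLine(FontSizePxToPt(sample));
```

The converted string could then be passed to InsertHtml in place of the original HTML, presumably without the NodeChangingCallback workaround, since the sizes are already expressed in points.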