Converting HTML to Docx

I can’t get ASPOSE to pick up any HTML formatting when creating the document.

I am trying very basic html styles. I am using the document builder, InsertHtml method.

DocumentBuilder builder = new DocumentBuilder(doc);
builder.InsertHtml("<h1>This Is A Header</h1>");

I’ve tried putting the styles in a class within a style tag embedded in the html as well. I can’t get a single style to show up within the docx?

Hi Coty,

Thanks for your inquiry. You may be using an older version of Aspose.Words; with Aspose.Words v15.5.0, I am unable to reproduce this problem on my side. Please upgrade to the latest version, v15.5.0, and let us know how it goes on your side.

Moreover, we introduced a new overload of the DocumentBuilder.InsertHtml method that lets you choose which formatting is used as the base for inserted HTML fragments.

The new overload has a useBuilderFormatting argument. When it is false, formatting specified in DocumentBuilder is ignored, and the formatting of inserted text is based on default HTML formatting. In this case, inserted text looks as it does in browsers.

When useBuilderFormatting is true, the formatting of inserted text is based on the formatting specified in DocumentBuilder. Note that useBuilderFormatting selects only the base formatting of inserted text and does not affect formatting specified directly in the HTML fragment.

The following example illustrates the difference between the two modes:

DocumentBuilder builder = new DocumentBuilder();
builder.ParagraphFormat.LeftIndent = 72;
builder.Font.Name = "Arial";
builder.Font.Size = 24;
bool useBuilderFormatting = …
builder.InsertHtml("<span style='color: red'><b>Text</b></span>", useBuilderFormatting);

In this example, if useBuilderFormatting is false, the inserted paragraph will have no left indent and will use the ‘Times New Roman’ 12pt font, which is the default HTML font and indent. If useBuilderFormatting is true, the inserted paragraph will be indented by 1 inch (72 points) and will use the ‘Arial’ 24pt font, as specified in DocumentBuilder. However, in both cases the inserted text will be bold and red, as specified in the HTML fragment.
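To compare the two modes directly, here is a minimal sketch that writes one document per mode. It assumes Aspose.Words for .NET is referenced by the project. The helper name SaveWithMode, the output file names, and the exact HTML fragment are illustrative assumptions, not part of the original reply.

```csharp
using Aspose.Words;

class InsertHtmlModes
{
    // Hypothetical helper: builds a fresh document so that the two modes
    // cannot influence each other through shared builder state.
    static void SaveWithMode(bool useBuilderFormatting, string fileName)
    {
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Builder formatting that serves as the base only when
        // useBuilderFormatting is true.
        builder.ParagraphFormat.LeftIndent = 72;
        builder.Font.Name = "Arial";
        builder.Font.Size = 24;

        // Illustrative fragment: bold and red are set in the HTML itself.
        string html = "<span style='color: red'><b>Text</b></span>";
        builder.InsertHtml(html, useBuilderFormatting);

        doc.Save(fileName);
    }

    static void Main()
    {
        // Assumed output file names, written to the working directory.
        SaveWithMode(false, "HtmlDefaults.docx");
        SaveWithMode(true, "BuilderFormatting.docx");
    }
}
```

Opening the two files next to each other should make the difference in indent and font easy to check against the description above.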

Please see the following code example for reference. I hope this helps.

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
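// Assumed markup: the HTML tags inside these strings did not survive in the post.
string csshtml = "<html><head><style>.MyStyle{color: red;}</style></head><body></body></html>";
builder.InsertHtml(csshtml);
string divhtml = "<div class='MyStyle'>Div Text</div>";
builder.InsertHtml(divhtml, true);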
doc.Save(MyDir + "Out.docx");

I updated to the latest version and used all inline styles for my Word doc. Works really well! Sometimes it doesn’t seem to pick up margins very well when they are placed on divs. I haven’t nailed down the exact scenarios, but using HTML tables fixed my margin issues. Thank you!
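For anyone landing here later, this is roughly what the inline-style approach looks like. It is only a sketch: the markup, the class name InlineStylesExample, and the output file name are made up for illustration, and how closely spacing is reproduced may depend on the Aspose.Words version.

```csharp
using Aspose.Words;

class InlineStylesExample
{
    static void Main()
    {
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Every style is set directly on its element through the style
        // attribute, so no <style> block or CSS class is needed.
        string html =
            "<h1 style='color: navy;'>This Is A Header</h1>" +
            "<p style='color: red; font-weight: bold;'>Some text</p>" +
            // A one-cell table used for spacing instead of a margin on a div.
            "<table><tr><td style='padding-left: 36pt;'>Indented text</td></tr></table>";

        builder.InsertHtml(html);

        // Assumed output file name, written to the working directory.
        doc.Save("InlineStyles.docx");
    }
}
```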

Hi Coty,

Thanks for your inquiry. Please insert the HTML fragment together with its CSS into the document, as shown below.

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
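// Assumed markup: the HTML tags inside this string did not survive in the post.
string csshtml = "<html><head><style>.MyStyle{color: red;}</style></head><body><div class='MyStyle'>Some text</div></body></html>";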
builder.InsertHtml(csshtml);
doc.Save(MyDir + "Out.docx");

If you still face problems, please share the following details for investigation purposes.

  • Please attach your input Word and HTML documents.

  • Please create a simple standalone, runnable application (for example, a Console Application project) that demonstrates the Aspose.Words code you used to generate your output document.

  • Please attach the output Word file that shows the undesired behavior.

  • Please attach your target Word document showing the desired behavior. You can use Microsoft Word to create it. I will then investigate how to generate the document you expect.