Retain Table Column Widths during Word DOT Document to HTML Conversion using C# .NET

Hi Hafeez

I have one more issue with tables.
After conversion, the overall table width is maintained correctly, but the widths of the columns inside the table are not, and this is causing problems for us.
Could you please check why the columns inside the table are not sized properly?

with regards
Manjunath

Issue.png (42.7 KB)
SampleTable.zip (64.1 KB)

@manjunath.patil,

After an initial test with the latest licensed version (20.10) of Aspose.Words for .NET, we were unable to reproduce this issue on our end. We used the following simple code to convert the “SampleTable.dot” file to HTML format:

Document doc = new Document("C:\\SampleTable\\SampleTable.dot");
doc.Save("C:\\SampleTable\\20.10.html");

See output HTML: 20.10 html.zip (1.1 KB)

We suggest that you upgrade to the latest version of Aspose.Words. If the problem persists, please provide the exact code snippet that we can use to reproduce the problem on our end.

Dear Hafeez

I also tried the same code, but afterwards I post-process the generated HTML slightly so that the converted HTML looks similar to the Word document.
For example, for the attached document I do the following:
viewerwidth = ConvertUtil.PointToPixel(Pagewidth-leftmargin-rightmargin)
Then I set viewerwidth as the width of the div block in the body of the HTML.
Example: style="width:629px" in the attached HTML.
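
The width calculation described above can be sketched without the library as follows. This is only an illustration: the class and method names are hypothetical, and it assumes 72 points per inch and a 96 dpi target, which is the resolution the single-argument ConvertUtil.PointToPixel is documented to use. In Aspose.Words the three inputs would typically come from a section's PageSetup (PageWidth, LeftMargin, RightMargin).

```csharp
using System;

// Hypothetical helper, not part of Aspose.Words.
public static class ViewerWidth
{
    // Assumption: 96 pixels per inch on screen, 72 points per inch in the document.
    private const double PixelsPerInch = 96.0;
    private const double PointsPerInch = 72.0;

    // Converts a length in points to pixels at 96 dpi.
    public static double PointToPixel(double points)
    {
        return points * PixelsPerInch / PointsPerInch;
    }

    // Width of the text area (page width minus left and right margins), rounded to whole pixels.
    public static int ContentWidthInPixels(double pageWidthPt, double leftMarginPt, double rightMarginPt)
    {
        double contentWidthPt = pageWidthPt - leftMarginPt - rightMarginPt;
        return (int)Math.Round(PointToPixel(contentWidthPt));
    }
}
```

For instance, a 612 pt wide page with 72 pt margins on both sides has a 468 pt text area, which corresponds to 624 px at 96 dpi.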

If what I am doing is wrong, please suggest the proper way to make the converted HTML look similar to the Word document (in particular, similar in table width, as in the HTML page I attached).

with regards
Manjunath

@manjunath.patil,

Aspose.Words tries to mimic the behavior of MS Word. Alternatively, you can try saving to HTML FIXED format by using the following code:

Document doc = new Document("E:\\temp\\Input File.docx");
HtmlFixedSaveOptions htmlFixedSaveOptions = new HtmlFixedSaveOptions();
htmlFixedSaveOptions.PrettyFormat = true;
htmlFixedSaveOptions.ExportEmbeddedImages = true;
htmlFixedSaveOptions.ExportEmbeddedCss = true;
htmlFixedSaveOptions.ExportEmbeddedFonts = true;
doc.Save(@"E:\Temp\output-HtmlFixed.html", htmlFixedSaveOptions);

Dear Hafeez

Thanks for the code. It worked well for the conversion: with this code the table is converted exactly. However, it introduced another issue, and I am not sure whether it can be fixed.
We have tags inside the table (for example “<%_abcde.pc9081%>”) which are placeholders and will be replaced with actual data at runtime. With this code, our tags are broken into two spans in the converted HTML.

For example, the converted HTML contains:
<span class="awspan awtext002" style="font-size:9pt; left:0pt; top:10.64pt;"><%_abcde.pc</span>
<span class="awspan awtext002" style="font-size:9pt; left:0pt; top:20.99pt;">9081%></span>

I tried to merge the two spans as part of post-processing, but then I cannot get the text to word-wrap: after merging, the text extends beyond the table cell.

I merged them as:
<span class="awspan awtext002" style="font-size:9pt; left:0pt; top:10.64pt;"><%_abcde.pc9081%></span>
Could you please advise on this issue?

with regards
Manjunath

@manjunath.patil,

Please try calling the Document.JoinRunsWithSameFormatting method and then converting to the HTML Fixed file format. If the problem persists, please ZIP and upload here your simplified Word document and the HTML file generated by Aspose.Words that shows the undesired behavior. We will then investigate the issue on our end and provide you with more information.

Dear Hafeez

Could you please give code snippet of how and when to call this Document.JoinRunsWithSameFormatting?

with regards
Manjunath

@manjunath.patil,

You can simply call this method after creating Document instance:

Document doc = new Document("E:\\temp\\Input File.docx");
doc.JoinRunsWithSameFormatting();
HtmlFixedSaveOptions htmlFixedSaveOptions = new HtmlFixedSaveOptions();
htmlFixedSaveOptions.PrettyFormat = true;
htmlFixedSaveOptions.ExportEmbeddedImages = true;
htmlFixedSaveOptions.ExportEmbeddedCss = true;
htmlFixedSaveOptions.ExportEmbeddedFonts = true;
doc.Save(@"E:\Temp\output-HtmlFixed.html", htmlFixedSaveOptions);

Hi Hafeez

I don't see any change after adding doc.JoinRunsWithSameFormatting();
Please try with the already uploaded SampleTable.dot file.

with regards
Manjunath

@manjunath.patil,

So that this can be corrected in the Aspose.Words API, we have logged this problem in our issue tracking system with ID WORDSNET-21309. We will look further into the details of this problem and keep you updated on the status of the linked issue. We apologize for the inconvenience.

@manjunath.patil,

The best approach is to avoid line breaks inside the macro words (the placeholder tags). In the “HtmlFixed” format, joining spans that have different properties is not possible. We suggest that you try the following code:

var doc = new Document(@"in.dot");
doc.FirstSection.Body.Tables[0].AllowAutoFit = true;
doc.FirstSection.Body.Tables[0].LeftIndent = -30;
doc.UpdateTableLayout();
doc.Save(@"out.html", SaveFormat.HtmlFixed);

Dear Hafeez

On what basis did you choose -30 as the indent?
Also, if I save as HtmlFixed, the output HTML has a few issues:

  1. The HTML is not saved with indentation, so it is not easy to read.
  2. The HTML page is shown in a page-like view with a border.

I have attached both the expected and the erroneous HTML files. In the expected output, the body, header and footer are extracted into separate files and a proper width is set on the body of each HTML page.

Sample.zip (41.1 KB)

Also, Aspose adds the meta tag "<meta name="generator" content="Aspose.Words for .NET 20.10.0" />". Can we save without this tag?

with regards
Manjunath

@manjunath.patil,

Please take a look at HtmlFixedSaveOptions Class to specify additional options when saving a document into the HtmlFixed format. For example, to remove page borders, please use the following code:

Document doc = new Document("C:\\temp\\SampleTable\\SampleTable.dot");

doc.FirstSection.Body.Tables[0].AllowAutoFit = true;
doc.FirstSection.Body.Tables[0].LeftIndent = -30;
doc.UpdateTableLayout();

HtmlFixedSaveOptions htmlFixedSaveOptions = new HtmlFixedSaveOptions();
htmlFixedSaveOptions.ShowPageBorder = false;

doc.Save(@"C:\Temp\SampleTable\20.10.html", htmlFixedSaveOptions);

I am afraid, there is no way to remove this ‘generator name’ from output HTML by using Aspose.Words API.
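
If the tag must not appear in the delivered file, one possible workaround is to strip it from the saved HTML in a post-processing step. The following is a minimal sketch using only standard .NET. The class name is hypothetical, and it assumes the tag has the form quoted earlier in this thread (a meta element whose first attribute is name="generator").

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical post-processing helper, not part of Aspose.Words.
public static class GeneratorTagStripper
{
    // Matches <meta name="generator" ... > and any whitespace that follows it.
    // Assumption: name="generator" is the first attribute, as in the tag quoted above.
    private static readonly Regex GeneratorMeta = new Regex(
        @"<meta\s+name=""generator""[^>]*>\s*",
        RegexOptions.IgnoreCase);

    // Returns the HTML with every generator meta tag removed.
    public static string Strip(string html)
    {
        return GeneratorMeta.Replace(html, string.Empty);
    }
}
```

After doc.Save has written the file, its text could be read with File.ReadAllText, passed through GeneratorTagStripper.Strip, and written back with File.WriteAllText.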

Dear Hafeez

With HtmlFixedSaveOptions I am unable to save Header, Footer and body into separate HTML files.

with regards
Manjunath

@manjunath.patil,

For example, please see these Word and HTML Fixed format files (extract headers.zip (25.1 KB)) and try running the following code to extract contents of headers from DOCX and put them in separate HTML file:

Document doc = new Document("C:\\Temp\\header footer.docx");

Document header_Footer_Document = (Document)doc.Clone(false);
header_Footer_Document.RemoveAllChildren();
header_Footer_Document.EnsureMinimum();
DocumentBuilder documentBuilder = new DocumentBuilder(header_Footer_Document);

foreach (Section sec in doc.Sections)
{
    foreach (HeaderFooter headerFooter in sec.HeadersFooters)
    {
        foreach (Node node in headerFooter.ChildNodes)
            header_Footer_Document.LastSection.Body.AppendChild(header_Footer_Document.ImportNode(node, true));
    }

    documentBuilder.MoveToDocumentEnd();
    documentBuilder.InsertBreak(BreakType.SectionBreakNewPage);
}

HtmlFixedSaveOptions htmlFixedSaveOptions = new HtmlFixedSaveOptions();
htmlFixedSaveOptions.PrettyFormat = true;
htmlFixedSaveOptions.ShowPageBorder = false;
htmlFixedSaveOptions.ExportEmbeddedImages = true;
htmlFixedSaveOptions.ExportEmbeddedCss = true;
htmlFixedSaveOptions.ExportEmbeddedFonts = true;

header_Footer_Document.Save(@"C:\temp\html fixed output.html", htmlFixedSaveOptions);

Hi

Yes, the header and footer can be extracted this way. However, when I save the body part after removing the header and footer data with the code below:
foreach (Section sec in aDocument.Sections)
{
    sec.HeadersFooters.Clear();
}

and then save using HtmlFixedSaveOptions, I still get the header and footer in the output.

@manjunath.patil,

Please ZIP and upload your simplified Word document (you are getting this problem with) here for testing. We will then investigate the issue on our end and provide you more information.

Dear Hafeez

I am facing issues with table column widths.
I am attaching the input file, the output HTML file and a screenshot showing the differences.
The code I use to convert is simple, as shown below:
Document myDocument = new Document(inputFile);
HtmlSaveOptions options = new HtmlSaveOptions
{
    PrettyFormat = true,
    HtmlVersion = HtmlVersion.Html5,
    SaveFormat = SaveFormat.Html
};
myDocument.Save(outPutFile, options);

Test Sample.zip (156.0 KB)

Please check and fix the issue as soon as possible. We have evaluated your library and purchased two licenses, but we are facing this issue with column widths. Please let us know how we can fix it at the earliest.

@manjunath.patil,

We have logged this problem in our issue tracking system with ID WORDSNET-21480. We will look further into the details of this problem and keep you updated on the status of the correction. We apologize for the inconvenience.