Free Support Forum - aspose.com

Saving Word document as HTML: header and footer class attributes are not consistent

Aspose Words version info from manifest.mf file:

Manifest-Version: 1.0
Specification-Title: Aspose.Words for Java
Implementation-Title: Aspose.Words for Java
Specification-Version: 2.6.1
Implementation-Version: 2.6.1
Specification-Vendor: Aspose Pty Ltd
Implementation-Vendor: Aspose Pty Ltd
Copyright: Copyright 2003-2008 Aspose Pty Ltd

Given a document whose page header uses the ‘normal’ style (see the attached “normal header.do” file), saving it as HTML produces this snippet:

<p class="Header"><span style="color:#000000;">HEADER</span></p>


Now if you apply a style to the header text, let’s just say Title from the Style bar, the HTML output changes to:

<p class="Title"><span>HEADER</span></p>


Notice that the “class” attribute on <p> has changed from Header to Title.

Is this the desired behavior? If so, what would be a way to tell a header <p> from the other <p> elements in the document?

We have an XSLT process that expects header <p>s to have a class attribute with the value “Header”. That, of course, fails with this change of style.
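To make the failure concrete, here is a minimal sketch of that kind of XSLT step, run from Java with the JDK's built-in transformer. The stylesheet, the class name HeaderClassXslt and the extractHeaders helper are assumptions made up for illustration, not our real process, and the input is assumed to have no default XHTML namespace.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class HeaderClassXslt {
    // Hypothetical stylesheet: outputs the text of every <p class="Header">.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select=\"//p[@class='Header']\">"
      + "<xsl:value-of select='.'/>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over the given markup and returns the extracted header text.
    static String extractHeaders(String xhtml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String normal = "<body><p class=\"Header\"><span style=\"color:#000000;\">HEADER</span></p></body>";
        String titled = "<body><p class=\"Title\"><span>HEADER</span></p></body>";
        System.out.println("[" + extractHeaders(normal) + "]"); // header paragraph is found
        System.out.println("[" + extractHeaders(titled) + "]"); // nothing matches once the style is Title
    }
}
```

With the ‘normal’ header the paragraph is selected; once the Title style is applied the same select matches nothing, so the header is silently dropped.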

If this is indeed the desired behavior, we suggest introducing a new attribute, or adding another class to the class attribute, to distinguish <p>s that are either header or footer. Just like:

.
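If a second class were added next to the style name, a consumer could match on one token of the class list instead of the whole attribute value. The sketch below assumes a hypothetical value such as class="Header Title"; that value, the ClassTokenMatch class and the countHeaderParagraphs helper are illustrations only, not something Aspose.Words is known to produce.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class ClassTokenMatch {
    // XPath 1.0 idiom: pad the normalized class list with spaces, then look for " Header ".
    static final String HEADER_P_COUNT =
        "count(//p[contains(concat(' ', normalize-space(@class), ' '), ' Header ')])";

    // Counts <p> elements whose space-separated class list contains the token "Header".
    static int countHeaderParagraphs(String xhtml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            Double n = (Double) xpath.evaluate(
                HEADER_P_COUNT, new InputSource(new StringReader(xhtml)), XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical markup: the header keeps a "Header" token next to its style name.
        String doc = "<body><p class=\"Header Title\">HEADER</p>"
                   + "<p class=\"Title\">Body title</p>"
                   + "<p class=\"SubHeader\">Not a page header</p></body>";
        System.out.println(countHeaderParagraphs(doc));
    }
}
```

The padding with spaces keeps a class such as SubHeader from matching by accident.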

Thanks.

ATTA

Hi


Thanks for your request. The class name in the output HTML is the same as the style name in the MS Word document. Please see the attached screenshot. So if you need all paragraphs in the header to have the “Header” class, you should specify this style in the source document.

Best regards.

Thanks, Alexey. That makes sense.

I’ll take this opportunity to log an enhancement request:

In the MS Word to HTML export, provide a way to preserve the semantics of the page header and footer. That is, a consistent way to identify header and footer elements in the exported (X)HTML document.

This could be as simple as wrapping the current header/footer <p>s in a <div> with an ID or class attribute that denotes the contents as header/footer.

<div class="pageheader" id="header_1">
<p class="Header"><span style="color:#000000;">HEADER</span></p>
</div>
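With a wrapper like that, a post-processing step could select header paragraphs by their container and no longer care which Word style they carry. A minimal sketch using the JDK's XPath API follows; the pageheader class comes from the suggestion above, while the PageHeaderWrapper class and headerTexts helper are hypothetical names, and the markup is assumed to have no default XHTML namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PageHeaderWrapper {
    // Returns the text of every <p> that is a direct child of <div class="pageheader">.
    static List<String> headerTexts(String xhtml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "//div[@class='pageheader']/p",
                new InputSource(new StringReader(xhtml)),
                XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // The header paragraph uses the Title style, yet is still found via its wrapper.
        String doc = "<body><div class=\"pageheader\" id=\"header_1\">"
                   + "<p class=\"Title\"><span>HEADER</span></p></div>"
                   + "<p class=\"Title\">Body title</p></body>";
        System.out.println(headerTexts(doc));
    }
}
```

The body paragraph with the same Title style is left out because it is not inside the wrapper.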


I hope this makes sense. I am sure it would be very useful for people who want to run the produced XHTML document through some XSLT post-processing.

ATTA

Hi


Thank you for your suggestion. We will consider adding such an option in a future version of Aspose.Words. Your request has been linked to the appropriate issue. You will be notified as soon as it is resolved.

Best regards.