Outline number and heading

We use Aspose.Words for Java (3.3) to convert a Word document to an HTML document, and then extract items from the HTML document by looking for headings and their contents.

We want to tell whether a piece of text in a heading is an outline number or part of the heading text, but haven’t found a reliable way to do so. The outline number could be something like “1.1” or “III”.

For example:

1.1.1 Heading text

is converted to

<h3 style="margin:16pt 0pt 10pt; page-break-after:avoid; page-break-inside:avoid"> <span style="font-family:'Times New Roman'; font-size:12pt; font-weight:bold">1.1.1  </span> <span style="font-family:'Times New Roman'; font-size:12pt; font-weight:bold">Heading text</span></h3>

Is there any way to get an indication of whether a piece of text in a heading is an outline number or part of the heading text when saving a document as an HTML file?

Thanks.

Hello
Thanks for your request. I’m afraid there is no way to mark the outline numbers of heading paragraphs in the output HTML. However, you can try retrieving the first <span> inside the <h1>…<h6> elements to get the numbers.
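
In the sample output above, the number sits in the first <span> of the heading, so one possible workaround is to read that span and check whether it looks like an outline number. The following is only a rough sketch using plain java.util.regex; it does not use any Aspose.Words API. The class name HeadingNumberSplitter and its methods are hypothetical, and the patterns assume the number is always emitted as the first span, which may not hold for every document. A real HTML parser would be more robust than regular expressions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words.
final class HeadingNumberSplitter {

    // First <span> inside an <h1>..<h6> element, then the rest of the heading.
    private static final Pattern HEADING = Pattern.compile(
            "<h([1-6])\\b[^>]*>\\s*<span\\b[^>]*>(.*?)</span>(.*?)</h\\1>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Decimal outline numbers such as "1", "1.1" or "1.1.1.".
    private static final Pattern DECIMAL = Pattern.compile("\\d+(\\.\\d+)*\\.?");

    // Well-formed Roman numerals such as "III" or "iv." (heuristic only:
    // a real word like "mix" is also a valid numeral and would match).
    private static final Pattern ROMAN = Pattern.compile(
            "(?=[IVXLCDM])M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\\.?",
            Pattern.CASE_INSENSITIVE);

    static boolean looksLikeOutlineNumber(String text) {
        String t = text.trim();
        return !t.isEmpty()
                && (DECIMAL.matcher(t).matches() || ROMAN.matcher(t).matches());
    }

    // Returns {number, headingText}; number is "" when the first span
    // does not look like an outline number.
    static String[] split(String headingHtml) {
        Matcher m = HEADING.matcher(headingHtml);
        if (!m.find()) {
            return new String[] {"", stripTags(headingHtml)};
        }
        String first = stripTags(m.group(2));
        String rest = stripTags(m.group(3));
        if (looksLikeOutlineNumber(first)) {
            return new String[] {first, rest};
        }
        return new String[] {"", (first + " " + rest).trim()};
    }

    // Removes tags and normalizes whitespace, including non-breaking spaces.
    private static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", "")
                .replace("&#xa0;", " ")
                .replace("&nbsp;", " ")
                .replace('\u00A0', ' ')
                .replaceAll("\\s+", " ")
                .trim();
    }
}
```

Calling HeadingNumberSplitter.split(...) on the <h3> element from the question should give "1.1.1" as the number and "Heading text" as the text. Because the Roman numeral check is only a heuristic, a heading that consists of a single short word in its first span could be misclassified.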
Also, please follow the link below for details about how Aspose.Words saves documents in the HTML, XHTML, and MHTML formats:
https://docs.aspose.com/words/java/convert-a-document-to-html-mhtml-or-epub/
Best regards,

Hi

Thanks for your request. I think it would be better to extract this information directly from the MS Word document before converting it to HTML.
Best regards,