Footnotes conversion from HTML to DOCX

Hello, I am trying to convert HTML to DOCX with Aspose.Words for Java, and the HTML contains footnotes with references. I want to convert them into real footnotes in the DOCX without losing the references. Is this possible?
Thank you in advance!

@Thomai_Chelidoni

Converting HTML footnotes to real Word footnotes with Aspose.Words for Java

Yes – Aspose.Words can import footnotes that are marked up in the HTML source and turn them into native Word footnotes when you save the document as DOCX.
The key is to use the HTML pattern that Aspose.Words recognises.


1. Required HTML structure

HTML part | What Aspose.Words expects
Footnote reference | <sup><a href="#_ftn1">1</a></sup> – the reference must be inside a <sup> element and the <a> href must point to the footnote body’s id.
Footnote body | <div class="footnote" id="_ftn1">…footnote text…</div> – a container (<div> or <section>) with the same id as the reference and any class (commonly footnote). The inner markup can be normal HTML (paragraphs, bold, etc.).

Example

<p>This is some text with a footnote reference<sup><a href="#_ftn1">1</a></sup>.</p>

<div class="footnote" id="_ftn1">
    <p>This is the footnote text that will become a real Word footnote.</p>
</div>

When the above pattern is present, Aspose.Words automatically creates a Footnote node in the Word document.


2. Java code – load HTML and save as DOCX

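import com.aspose.words.*;

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;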

public class HtmlToDocxWithFootnotes {
    public static void main(String[] args) throws Exception {

        // 1️⃣  Load the HTML (the HTML string may be read from a file, DB, etc.)
        String html = new String(Files.readAllBytes(Paths.get("input.html")), StandardCharsets.UTF_8);

        // 2️⃣  Create load options (no special options are required for footnotes)
        HtmlLoadOptions loadOptions = new HtmlLoadOptions();

        // 3️⃣  Load the document
        Document doc = new Document(new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8)), loadOptions);

        // 4️⃣  Save as DOCX – footnotes are now real Word footnotes
        doc.save("output.docx", SaveFormat.DOCX);
    }
}

What happens under the hood

  • The <sup><a …></a></sup> element becomes a FootnoteReference field.
  • The <div id="_ftn1">…</div> content becomes the Footnote node.
  • The reference and the footnote are linked automatically, so the numbering stays consistent.

3. Important notes

  • Only the pattern shown above is recognised. If your HTML uses a different markup (e.g., plain superscript numbers without an <a> link), Aspose.Words will treat them as regular text.
  • Multiple footnotes – just repeat the pattern with unique ids (_ftn2, _ftn3, …).
  • Styling – any CSS applied inside the footnote container is imported as Word formatting (fonts, bold, italics, etc.).
  • Complex footnote structures (tables, images) are also supported as long as they are placed inside the footnote container.
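
Because references and footnote bodies are matched by id, a quick check before conversion can catch links whose target is missing. The sketch below uses only the JDK. The class name FootnoteLinkCheck and the method danglingTargets are made up for illustration, and the regex scan is a rough approximation, not a real HTML parser.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: lists "#fragment" link targets that no id/name attribute defines.
final class FootnoteLinkCheck {
    // href="#something" on any element
    private static final Pattern HREF = Pattern.compile("href\\s*=\\s*\"#([^\"]+)\"");
    // id="something" or name="something" (double-quoted attributes only)
    private static final Pattern ID_OR_NAME = Pattern.compile("\\b(?:id|name)\\s*=\\s*\"([^\"]+)\"");

    static Set<String> danglingTargets(String html) {
        // Collect every id/name value defined in the document
        Set<String> defined = new LinkedHashSet<>();
        Matcher m = ID_OR_NAME.matcher(html);
        while (m.find()) {
            defined.add(m.group(1));
        }
        // Collect every fragment link whose target was not defined
        Set<String> missing = new LinkedHashSet<>();
        m = HREF.matcher(html);
        while (m.find()) {
            if (!defined.contains(m.group(1))) {
                missing.add(m.group(1));
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String html = "<p>Text<sup><a href=\"#_ftn1\">1</a></sup> more"
                + "<sup><a href=\"#_ftn2\">2</a></sup></p>"
                + "<div class=\"footnote\" id=\"_ftn1\"><p>Note</p></div>";
        // _ftn2 is referenced but never defined
        System.out.println(danglingTargets(html)); // prints [_ftn2]
    }
}
```

An empty result only means every link has a target; it does not by itself guarantee that the importer treats the target as a footnote.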

4. If the HTML cannot be changed

When you cannot modify the source HTML to match the required pattern, you can:

  1. Pre‑process the HTML (e.g., with Jsoup) to wrap each footnote reference in <sup><a …></a></sup> and create the corresponding footnote containers.
  2. Add footnotes programmatically after loading the HTML:
Document doc = new Document("input.html");

// Example: insert a footnote at the end of the first paragraph
Paragraph para = doc.getFirstSection().getBody().getFirstParagraph();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.moveTo(para); // moving to a paragraph places the cursor at its end
Footnote footnote = builder.insertFootnote(FootnoteType.FOOTNOTE, "Added footnote text");

TL;DR

Structure your HTML footnotes as <sup><a href="#_ftnX">N</a></sup> for the reference and <div id="_ftnX">…</div> for the footnote text.
Aspose.Words will import them automatically as native Word footnotes when you load the HTML and save it as DOCX.

Feel free to share a snippet of your current HTML if you need help adapting it to the required format!

Thank you for your quick answer. I have the HTML below that I need to handle:

<div class="footnotes">
    <p id="fn-d15116e167">
        <sup>
            <a href="#fnbck-d15116e167">1</a>
        </sup>
        <sup>
        </sup>  Consiglio federale (2022): <i>Rafforzamento della piazza economica svizzera – </i>
        <i>Analisi globale del</i>
        <i> Consiglio federale del 16&nbsp;febbraio&nbsp;2022, </i>disponibile su www.seco.admin.ch/it &gt; Documentazione &gt; Comunicati stampa &gt; Il Consiglio federale si adopera per rafforzare la piazza economica svizzera.
    </p>
</div>

Can you adapt it, please?
Also, in the DOCX I would like the footnote to appear at the bottom of the page, outside the main text. In the tests I have tried, it ends up in the middle of the page, between paragraphs.

@Thomai_Chelidoni Aspose.Words preserves footnotes across a DOCX->HTML->DOCX roundtrip, so you can mark up your footnotes the same way Aspose.Words does when converting DOCX to HTML:
in.docx (17.1 KB)
out.zip (601 Bytes)
out.docx (8.6 KB)

[Test]
public void Test001()
{
    Document doc = new Document(@"C:\Temp\in.docx");
    doc.Save(@"C:\Temp\out.html", new HtmlSaveOptions() { PrettyFormat = true });
}

[Test]
public void Test002()
{
    Document doc = new Document(@"C:\Temp\out.html");
    doc.Save(@"C:\Temp\out.docx");
}

I have tried, but unfortunately I still have the same issue: the footnote is in the middle of the page when I convert the HTML to DOCX.


Maybe I need to handle it in Java code? The footnote HTML that I am trying to convert is this one:

<div class="footnotes">
    <p id="fn-d15116e167">
        <sup>
            <a href="#fnbck-d15116e167">1</a>
        </sup>
        <sup>
        </sup>  Consiglio federale (2022): <i>Rafforzamento della piazza economica svizzera – </i>
        <i>Analisi globale del</i>
        <i> Consiglio federale del 16&nbsp;febbraio&nbsp;2022, </i>disponibile su www.seco.admin.ch/it &gt; Documentazione &gt; Comunicati stampa &gt; Il Consiglio federale si adopera per rafforzare la piazza economica svizzera.
    </p>
</div>

Thank you again for your reply!

@Thomai_Chelidoni The HTML structure that is recognized as a footnote looks like this:

<div>
	<p style="margin-top:0pt; margin-bottom:8pt">
		<span>Test footnote</span><a name="_ftnref1"></a><a href="#_ftn1" style="text-decoration:none"><span style="font-size:8pt; vertical-align:super; color:#000000">[1]</span></a>
	</p>
</div>
<hr style="width:33%; height:1px; text-align:left; -aw-footnote-numberstyle:0; -aw-footnote-startnumber:1; -aw-footnote-type:0" />
<div id="_ftn1" style="-aw-footnote-isauto:1">
	<p style="margin-top:0pt; margin-bottom:0pt; line-height:normal; font-size:10pt">
		<a href="#_ftnref1" style="text-decoration:none"><span style="font-size:6.67pt; vertical-align:super; color:#000000">[1]</span></a><span> footnote</span>
	</p>
</div>

As you can see, Aspose.Words uses special roundtrip style attributes (-aw-...). These attributes allow Aspose.Words to properly recognize the content as a footnote.
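
If the source HTML is produced by your own code, one option is to emit this markup directly. The sketch below builds the three fragments from the sample above using plain string concatenation. The class AwFootnoteMarkup and its method names are invented for illustration. Which of the attributes are strictly required is not stated in this thread, so they are copied from the sample as-is.

```java
// Hypothetical helper that emits footnote markup modelled on the sample above.
final class AwFootnoteMarkup {
    private AwFootnoteMarkup() {}

    // Marker placed in the body text where footnote n is cited
    static String reference(int n) {
        return "<a name=\"_ftnref" + n + "\"></a>"
                + "<a href=\"#_ftn" + n + "\" style=\"text-decoration:none\">"
                + "<span style=\"font-size:8pt; vertical-align:super\">[" + n + "]</span></a>";
    }

    // Separator rule carrying the -aw-footnote-* attributes from the sample
    static String separator() {
        return "<hr style=\"width:33%; height:1px; text-align:left; "
                + "-aw-footnote-numberstyle:0; -aw-footnote-startnumber:1; -aw-footnote-type:0\" />";
    }

    // Footnote container; innerHtml is the inline HTML of the footnote text
    static String body(int n, String innerHtml) {
        return "<div id=\"_ftn" + n + "\" style=\"-aw-footnote-isauto:1\"><p>"
                + "<a href=\"#_ftnref" + n + "\" style=\"text-decoration:none\">"
                + "<span style=\"font-size:6.67pt; vertical-align:super\">[" + n + "]</span></a>"
                + "<span> " + innerHtml + "</span></p></div>";
    }

    public static void main(String[] args) {
        String html = "<div><p><span>Main text</span>" + reference(1) + "</p></div>\n"
                + separator() + "\n"
                + body(1, "Consiglio federale (2022): <i>Rafforzamento della piazza economica svizzera</i>");
        System.out.println(html);
    }
}
```

The generated string can then be loaded with new Document(...) in the same way as the HTML file in the earlier Java example.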

In the example that you shared, we lose the link between the footnote and the main text. Is there any way to solve this?
Thank you in advance!

@Thomai_Chelidoni As far as I can see, the link works fine in the output document attached above.