DOCX to HTML Conversion Issue with List Item Indentation using Java

I have a Word document containing a list with three levels, and I have applied a right indentation to this list.
The Word document: Bug.zip (6.5 KB)

Once the document is converted to HTML, the right indentation is applied once at the root level, twice at the second level and three times at the third level, so the list ends up with a pyramid-like shape.

Java code that converts the DOCX file to HTML:

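    // INPUT and HTML are format-string constants defined elsewhere in the class (not shown).
    String dataDir = Utils.getDataDir(ConvertDocumentToHtmlWithRoundtrip.class);
    String name = "Bug";

    // Load the document.
    Document doc = new Document(dataDir + String.format(INPUT, name));

    HtmlSaveOptions options = new HtmlSaveOptions();
    // Export lists as HTML list tags rather than as inline text.
    options.setExportListLabels(ExportListLabels.BY_HTML_TAGS);
    // Keep the page margins and page setup in the output.
    options.setExportPageMargins(true);
    options.setExportPageSetup(true);
    // Save the document as HTML.
    doc.save(dataDir + String.format(HTML, name), options);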

I did some analysis that I hope will help:

The problem comes from a difference between the OpenXML format and the HTML format:

  • In OpenXML, a list is not a nested structure: a sub-list is not a child of a list. Instead, both reference the same list definition.
    Each item is a paragraph (<w:p>) that references a list definition and has a level attribute giving the item's level in the list:
    A root item is a paragraph referencing a list definition, at level 1 (the file stores this as w:ilvl 0, because the stored value is zero-based).
    A sub-item is a paragraph referencing the same list definition, at level 2, and so on.
    So nothing is inherited between items at different levels, and applying a right indentation to each of them has no cumulative effect.

  • In HTML it is different: a list is a nested structure, and each sub-list is a child of its parent list and is laid out inside it. If, for example, we define a right indentation on the root and a right indentation on the child, the final indentation of the child is the sum of the two.
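
The difference described above can be illustrated with a small, self-contained sketch. It is not Aspose code and makes no claim about how Aspose.Words exports lists internally; the class `ListIndentSketch`, its methods and the 36 pt indent are all hypothetical, chosen only for illustration. It shows how the same per-level value accumulates once the levels are nested, and which margins a nested structure would need so that each level ends up with the indent it had in the flat model.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Aspose.Words.
class ListIndentSketch {

    // Nested model (HTML): each level is laid out inside its parent,
    // so the effective indent of a level is the running sum of the margins.
    static double[] effectiveNested(double[] marginPerLevel) {
        double[] effective = new double[marginPerLevel.length];
        double sum = 0;
        for (int i = 0; i < marginPerLevel.length; i++) {
            sum += marginPerLevel[i];
            effective[i] = sum;
        }
        return effective;
    }

    // Margins to emit on each nested level so that the effective indent
    // matches the desired (flat, per-level) indent: the difference between
    // a level's desired indent and its parent's desired indent.
    static double[] compensatedMargins(double[] desiredPerLevel) {
        double[] margins = new double[desiredPerLevel.length];
        double parent = 0;
        for (int i = 0; i < desiredPerLevel.length; i++) {
            margins[i] = desiredPerLevel[i] - parent;
            parent = desiredPerLevel[i];
        }
        return margins;
    }

    public static void main(String[] args) {
        // Assumed value: the same 36 pt right indent on all three levels.
        double[] flatIndents = {36, 36, 36};

        // Copying the flat values onto nested lists gives the pyramid.
        System.out.println(Arrays.toString(effectiveNested(flatIndents)));
        // prints [36.0, 72.0, 108.0]

        // Compensated margins restore the flat indents.
        double[] margins = compensatedMargins(flatIndents);
        System.out.println(Arrays.toString(margins));
        // prints [36.0, 0.0, 0.0]
        System.out.println(Arrays.toString(effectiveNested(margins)));
        // prints [36.0, 36.0, 36.0]
    }
}
```

With equal indents on every level, only the root list would need the right margin in a nested structure; the sub-lists would need none.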

@mohamed_hamdi

We have tested the scenario and reproduced the same issue on our side. We have logged this problem in our issue tracking system as WORDSNET-21369. You will be notified via this forum thread once the issue is resolved. We apologize for the inconvenience.

To make sure that we are on the same page, please ZIP and attach both your problematic HTML output and your expected HTML output here for our reference. We will fix this issue according to your requirements.

@tahir.manzoor Thanks for your quick answer.
Attached is a ZIP with two HTML files:
1- current.html --> the one currently generated by Aspose
2- expected.html --> the one I expect

HTML.zip (1.8 KB)

@mohamed_hamdi

We could not see the right indentation issue in either document. Could you please share screenshots of the problematic sections of the output document? Please also tell us which browser you are using.

I have attached a ZIP with two screenshots:
screenshot.zip (132.4 KB)

  • current.png is a screenshot of the current behavior (the right indentation is applied multiple times)
  • fixed.png is a screenshot of what I expect

The browser is Firefox, but the issue is cross-browser and not specific to Firefox.

Hope that helps.

@mohamed_hamdi

Thanks for sharing the details. We have logged them in our issue tracking system. We will inform you via this forum thread once there is an update on this issue.