DOCX to HTML to Word Conversion again Creates Left Margin Indentations (C# .NET) | Export Round Trip Information

concord_tech · September 18, 2020, 7:26pm

Hello,
I have a DOCX document (just.zip, 19.5 KB). When I convert it to HTML and then convert that HTML back to DOCX, I get a lot of unwanted left indentation near the list items, which is not present in the original document.
Here is the code used to do the double conversion (docx -> html, html -> docx):

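import com.aspose.words.Document;
import com.aspose.words.ExportListLabels;
import com.aspose.words.HtmlSaveOptions;
import com.aspose.words.SaveFormat;
// Utils is the helper class from the Aspose.Words for Java examples project.
import com.aspose.words.examples.Utils;

public class ConvertDocumentToHtmlWithRoundtrip {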
    private static final String LICENSE = "Aspose.Words.lic";
    public static final String INPUT = "%s.docx";
    public static final String OUTPUT = "%s-out.docx";
    public static final String HTML = "%s.html";

    public static void main(String[] args) throws Exception {
        //ExStart:ConvertDocumentToHtmlWithRoundtrip
        // The path to the documents directory.
        String dataDir = Utils.getDataDir(ConvertDocumentToHtmlWithRoundtrip.class);
        String name = "just";

        // Load the document.
        Document doc = new Document(dataDir + String.format(INPUT, name));

        HtmlSaveOptions options = new HtmlSaveOptions();
        options.setExportRoundtripInformation(true);
        options.setExportListLabels(ExportListLabels.BY_HTML_TAGS);

        doc.save(dataDir + String.format(HTML, name), options);

        doc = new Document(dataDir + String.format(HTML, name));

        // Reload the HTML and save it back in DOCX format.
        doc.save(dataDir + String.format(OUTPUT, name), SaveFormat.DOCX);
        //ExEnd:ConvertDocumentToHtmlWithRoundtrip
        System.out.println("Document converted to HTML with round-trip information successfully.");
    }
}

awais.hafeez · September 19, 2020, 8:40am

@mohamed_hamdi,

We have logged this problem in our issue tracking system with ID WORDSNET-21111 so that any necessary corrections can be made in the Aspose.Words for Java API. We will look further into the details of this problem and will keep you updated on the status of the linked issue. We apologize for any inconvenience.

concord_tech · November 6, 2020, 10:24am

The problem is solved in version 20.9. You can close the issue if you want.

awais.hafeez · November 6, 2020, 3:08pm

@mohamed_hamdi,

Unfortunately, WORDSNET-21111 is not resolved yet. We have completed the analysis of this issue and identified the root cause. We will keep you posted here on any further updates and notify you when the issue is resolved. We apologize for any inconvenience.

Could you please also create a comparison screenshot that highlights the problematic areas in the DOCX generated by Aspose.Words (compared with the original document) and attach it here for our reference? We will then investigate the issue further on our end and provide you with more information.
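
Alongside a screenshot, a plain list of the paragraphs whose left indent changed may help pinpoint the problematic areas. The following is only a minimal sketch using the Java standard library, not an Aspose.Words feature: the class name IndentDiff and its methods are hypothetical. It assumes both files are regular DOCX packages, that indents are written as integer twips directly on the paragraph (w:ind with w:left or w:start), and that the paragraph count and order are the same in both files. Indentation that comes from the list definitions in word/numbering.xml or from styles is not detected.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class IndentDiff {
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the direct left indent (in twips) of every w:p element, or 0 when none is set. */
    public static List<Integer> leftIndents(InputStream documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document xml = factory.newDocumentBuilder().parse(documentXml);

        List<Integer> indents = new ArrayList<>();
        NodeList paragraphs = xml.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element paragraph = (Element) paragraphs.item(i);
            NodeList ind = paragraph.getElementsByTagNameNS(W_NS, "ind");
            int left = 0;
            if (ind.getLength() > 0) {
                Element e = (Element) ind.item(0);
                // w:left is the usual attribute; w:start is the newer equivalent.
                String value = e.getAttributeNS(W_NS, "left");
                if (value.isEmpty()) {
                    value = e.getAttributeNS(W_NS, "start");
                }
                if (!value.isEmpty()) {
                    left = Integer.parseInt(value);
                }
            }
            indents.add(left);
        }
        return indents;
    }

    /** Reads word/document.xml out of a DOCX package and extracts its indents. */
    public static List<Integer> leftIndentsOfDocx(String docxPath) throws Exception {
        try (ZipFile zip = new ZipFile(docxPath)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IllegalArgumentException("Not a DOCX package: " + docxPath);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return leftIndents(in);
            }
        }
    }

    /** Lists the paragraph positions whose indent differs, pairing paragraphs by index. */
    public static List<String> diff(List<Integer> original, List<Integer> roundTripped) {
        List<String> report = new ArrayList<>();
        int n = Math.min(original.size(), roundTripped.size());
        for (int i = 0; i < n; i++) {
            if (!original.get(i).equals(roundTripped.get(i))) {
                report.add("paragraph " + i + ": "
                        + original.get(i) + " -> " + roundTripped.get(i));
            }
        }
        return report;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java IndentDiff original.docx roundtripped.docx");
            return;
        }
        diff(leftIndentsOfDocx(args[0]), leftIndentsOfDocx(args[1]))
                .forEach(System.out::println);
    }
}
```

With the files from this thread it would be run as `java IndentDiff just.docx just-out.docx`. Each output line names a zero-based paragraph index and its left indent before and after the round trip, in twips (1/20 of a point).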