Auto spacing for div and list elements seems incorrect

I am evaluating Aspose.Words for HTML to Word conversion, and some of the paragraph settings don’t seem correct.
If you import the following HTML, you get extra space between each of the lines, except between Item 1, Item 2 and Item 3.

Line 1

Line 2

Line 3

  • Item 1

  • Item 2

  • Item 3

Line 4

Line 5

Line 6

If you examine the paragraph formatting, each of the paragraphs representing the div elements has the “Space After Auto” attribute set to true, which inserts whitespace after the paragraph. I believe this is incorrect (it would be correct if these were p elements): if you open the HTML in a web browser, you won’t see any extra whitespace between the divs. The last element of the list also has “Space After Auto” set to true, which does match the behavior in a browser. However, I believe the first element of the list should have “Space Before Auto” set to true, rather than the preceding element (in this case a div) having “Space After Auto” set.
After the import, I’m not sure how to correct this. I can set the after auto values to false for all of the paragraphs, but then that will eliminate the spacing that should appear before and after the list. Any suggestions?
I’m using Aspose.Words for Java 2.4.2.

Hi
Thanks for your inquiry. You can try the following code to work around this.

// Open Html document
Document doc = new Document("test.html");
// Get collection of paragraphs
NodeCollection paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
// Loop through all paragraphs
for (int parIndex = 0; parIndex < paragraphs.getCount(); parIndex++)
{
    Paragraph par = (Paragraph)paragraphs.get(parIndex);
    // List items get automatic spacing both before and after
    if (par.isListItem())
    {
        par.getParagraphFormat().setSpaceAfterAuto(true);
        par.getParagraphFormat().setSpaceBeforeAuto(true);
    }
    // All other paragraphs: turn auto spacing off and zero the explicit spacing
    else
    {
        par.getParagraphFormat().setSpaceAfterAuto(false);
        par.getParagraphFormat().setSpaceBeforeAuto(false);
        par.getParagraphFormat().setSpaceAfter(0);
        par.getParagraphFormat().setSpaceBefore(0);
    }
}
// Save output document
doc.save("out.doc");
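
If the goal is the browser-like layout described in the question (space only before the first list item and after the last one, none between divs), the flags could instead be chosen by looking at each paragraph's neighbours. The following is only a minimal sketch of that rule: it models paragraphs as plain booleans rather than Aspose objects, and SpacingPlan, Flags and plan are hypothetical names, not part of Aspose.Words. It also assumes that adjacent list items always belong to the same list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Aspose.Words): picks auto-spacing flags per paragraph. */
final class SpacingPlan {

    /** Auto-spacing flags for a single paragraph. */
    static final class Flags {
        final boolean spaceBeforeAuto;
        final boolean spaceAfterAuto;

        Flags(boolean spaceBeforeAuto, boolean spaceAfterAuto) {
            this.spaceBeforeAuto = spaceBeforeAuto;
            this.spaceAfterAuto = spaceAfterAuto;
        }
    }

    /**
     * Computes the flags for every paragraph.
     *
     * @param isListItem one entry per paragraph, in document order; true for list items
     * @return one Flags object per paragraph, in the same order
     */
    static List<Flags> plan(boolean[] isListItem) {
        List<Flags> result = new ArrayList<>();
        for (int i = 0; i < isListItem.length; i++) {
            boolean current = isListItem[i];
            boolean previousIsItem = i > 0 && isListItem[i - 1];
            boolean nextIsItem = i + 1 < isListItem.length && isListItem[i + 1];

            // Only the first item of a run of list items gets space before it.
            boolean before = current && !previousIsItem;
            // Only the last item of a run of list items gets space after it.
            boolean after = current && !nextIsItem;

            // Non-list paragraphs (the divs) end up with both flags false.
            result.add(new Flags(before, after));
        }
        return result;
    }
}
```

With Aspose.Words, the input array could be filled from par.isListItem() while looping over the paragraphs as above, and each result applied through getParagraphFormat().setSpaceBeforeAuto(...) and setSpaceAfterAuto(...).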

Hope this helps.
Best regards.

Thank you for your reply. I did manage to figure that code out on my own, but I would still recommend that you look at changing the behavior if you want a higher fidelity conversion from HTML.

Hi
Thanks for your recommendation. We are currently working on HTML import/export improvements in the .NET code base. These improvements will then be ported to the Java version.
Best regards.