Shape Alignment not working when converting HTML to DOCX format

Hi Team,

We have a use case where the HTML contains an SVG image followed by a text DIV element. When we load this HTML with Aspose.Words and save the output as DOCX, the SVG image and the text are supposed to be vertically middle-aligned, but when the document is opened they appear aligned to the top and centered horizontally. We also noticed that when we try to change the image's Layout Position properties from within Word, the fields are disabled.

We use the CSS properties vertical-align: middle and display: table-cell in the HTML to vertically center the contents.

Sample Aspose.Words conversion logic (assume htmlBytes holds the content of the attached file “Sample Image 1.html”):

HtmlLoadOptions options = new HtmlLoadOptions();
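// Assuming htmlBytes is a byte[]: Document takes an InputStream, so wrap it (java.io.ByteArrayInputStream).
Document doc = new Document(new ByteArrayInputStream(htmlBytes), options);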
DocumentBuilder builder = new DocumentBuilder(doc);
PageSetup pageSetup = builder.getPageSetup();

double margin = 0;
pageSetup.setLeftMargin(margin);
pageSetup.setRightMargin(margin);
pageSetup.setTopMargin(margin);
pageSetup.setBottomMargin(margin);

Section section = doc.getFirstSection();
Body body = section.getBody();
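// Iterate over the direct children of the body ("children" was undefined in the original snippet).
for (Node node : (Iterable<Node>) body.getChildNodes(NodeType.ANY, false)) {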
    int nodeType = node.getNodeType();
    // Body will have 1st child as Paragraph or Table when inserting
    // HTML content into Document.                
    switch (nodeType) {
        case NodeType.PARAGRAPH:
            Paragraph para = Paragraph.class.cast(node);
            NodeCollection shapes = para.getChildNodes(NodeType.SHAPE, true);
            if (shapes.getCount() == 1) {
                Shape shape = Shape.class.cast(shapes.get(0));
                // Expecting Shape to be an Image
                if (shape.hasImage()) {
                    shape.setWrapType(WrapType.INLINE);
                    shape.isLayoutInCell(false);
                    shape.setRelativeHorizontalPosition(RelativeHorizontalPosition.MARGIN);
                    shape.setHorizontalAlignment(HorizontalAlignment.CENTER);
                    shape.setRelativeVerticalPosition(RelativeVerticalPosition.MARGIN);
                    shape.setVerticalAlignment(VerticalAlignment.CENTER);
                }
            }
            break;
        default:
            break;
    }
}
OoxmlSaveOptions ooxmlSaveOptions = new OoxmlSaveOptions();
doc.save("Sample Image 1.docx", ooxmlSaveOptions);

I have attached the input and output files used in this conversion and also attached the screenshot of the layout position disabled properties for your reference.

Could you suggest the right way to update the shape alignment?

File Attachments: Sample Image 1.zip (52.7 KB)

@oraspose This is the expected behavior of Aspose.Words. You should note that Aspose.Words is designed to work with MS Word documents. There is no analog of DIV elements in MS Word documents, so the DIVs are converted to paragraphs in Aspose.Words DOM. In this case Aspose.Words behaves the same way as MS Word does.
ms_word.docx (12.0 KB)
aspose.docx (9.0 KB)

Is there a way to center the shape both vertically and horizontally, given that the shape alignment logic above did not work?

@oraspose Sure, you can specify vertical and horizontal alignment of the shape:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Insert some shape
Shape shape = builder.insertShape(ShapeType.RECTANGLE, 100, 100);
shape.setWrapType(WrapType.NONE);
shape.setRelativeVerticalPosition(RelativeVerticalPosition.PAGE);
shape.setRelativeHorizontalPosition(RelativeHorizontalPosition.PAGE);
shape.setVerticalAlignment(VerticalAlignment.CENTER);
shape.setHorizontalAlignment(HorizontalAlignment.CENTER);

doc.save("C:\\Temp\\out_centered_shape.docx");

out_centered_shape.docx (8.4 KB)

But in your case the original document contains both a shape and a paragraph. If you need to vertically center the whole content on the page, you can specify the page vertical alignment:

Document doc = new Document("C:\\Temp\\in.html");
doc.getFirstSection().getPageSetup().setVerticalAlignment(PageVerticalAlignment.CENTER);
doc.save("C:\\Temp\\out.docx");

out.docx (9.1 KB)

Hi,

Thanks for the alignment solution. It worked for my use-case above.

Hi,

I added a border and a background color to the image in the HTML, but it seems Aspose.Words did not pick them up when converting the HTML to DOCX.

What am I missing in the logic below?

HtmlLoadOptions options = new HtmlLoadOptions();
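// Assuming htmlBytes is a byte[]: Document takes an InputStream, so wrap it (java.io.ByteArrayInputStream).
Document doc = new Document(new ByteArrayInputStream(htmlBytes), options);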
DocumentBuilder builder = new DocumentBuilder(doc);
PageSetup pageSetup = builder.getPageSetup();

double margin = 0;
pageSetup.setLeftMargin(margin);
pageSetup.setRightMargin(margin);
pageSetup.setTopMargin(margin);
pageSetup.setBottomMargin(margin);
pageSetup.setVerticalAlignment(PageVerticalAlignment.CENTER);

OoxmlSaveOptions ooxmlSaveOptions = new OoxmlSaveOptions();
doc.save("Horizontal Image.docx", ooxmlSaveOptions);

Horizontal Image.zip (8.9 KB)

@oraspose As I mentioned, there is no direct analog of DIV in the MS Word document object model, so support for such elements is limited. You can partially resolve this by specifying BlockImportMode.PRESERVE:

HtmlLoadOptions options = new HtmlLoadOptions();
options.setBlockImportMode(BlockImportMode.PRESERVE);

However, the background is still not preserved.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-25135

Thanks for the information as it is helpful in understanding the Word behavior.

@alexey.noskov As you mentioned earlier, there is no direct analog of DIV in the MS Word DOM. Does that mean Aspose.Words can behave differently depending on the HTML input fed into the Document object?

I have attached a file containing 3 test cases: “Circular”, “Horizontal” and “Vertical”. The HTML in each of these files contains a DIV with a border and a background color applied. When these files are passed as input to the Aspose.Words Document object and saved as DOCX, the only test case that fails is “Horizontal”; the other 2 produce correct output.

I am curious why only the Horizontal test case fails. Any information or explanation would be helpful. Thanks.

Sample Test Cases.zip (27.6 KB)

@oraspose In your Horizontal Image with Text.html the root DIV has display:table-cell. If you change it to display:block, the background color is preserved properly:

<div style="display:block;vertical-align:middle;text-align:center;width:474.0px;height:282.0px;border:3px dotted #000000 !important;background-color:#ccffff;">

out.docx (10.0 KB)
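
If the HTML is generated and cannot be edited by hand, the same substitution could be applied in code before the HTML is handed to Aspose.Words. Below is a minimal sketch that uses only the Java standard library. The class TableCellDisplayRewriter and the method toBlockDisplay are hypothetical names chosen for illustration and are not part of Aspose.Words. It is a blunt text replacement, so it would also change any matching text that appears outside style attributes.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Aspose.Words): rewrites inline CSS in the
// HTML text before the HTML is loaded into a Document.
final class TableCellDisplayRewriter {
    // Matches "display:table-cell" with optional whitespace around the colon,
    // ignoring case.
    private static final Pattern TABLE_CELL =
            Pattern.compile("display\\s*:\\s*table-cell", Pattern.CASE_INSENSITIVE);

    private TableCellDisplayRewriter() {
    }

    // Returns the HTML with every "display:table-cell" replaced by "display:block".
    static String toBlockDisplay(String html) {
        return TABLE_CELL.matcher(html).replaceAll("display:block");
    }
}
```

The returned string can then be converted to bytes and loaded in the same way as in the earlier snippets.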

Hi @alexey.noskov, the background color is applied only within the border, but I was expecting the border and background color to be applied to the whole document page, matching the way the sample HTML file is displayed. I am using the logic described in Center Align Page Content.

Sample Test Case 2.zip (11.0 KB)

@oraspose As I mentioned earlier, Aspose.Words is designed to work with MS Word documents. The HTML and MS Word document object models are quite different, and it is not always possible to achieve 100% fidelity when converting from one format to the other.
You can apply a background to the whole document by specifying the background color on the body element:

<html>
<body style="background-color: #ccffff;">
    <div>
        <p>
            <span>Test document</span>
        </p>
    </div>
</body>
</html>

Unfortunately, specifying a border this way will not work.
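
When the HTML is generated in code, the body-level background shown above could be added at generation time. The following is a rough sketch using only the Java standard library. BodyBackgroundWrapper and wrap are hypothetical names, not Aspose.Words API, and the check that the color is a 6-digit hex value is an assumption made to keep the example simple.

```java
// Hypothetical helper (not part of Aspose.Words): wraps an HTML fragment in a
// document whose body element carries the background color.
final class BodyBackgroundWrapper {
    private BodyBackgroundWrapper() {
    }

    // cssColor is assumed to be a 6-digit hex color such as "#ccffff".
    static String wrap(String fragmentHtml, String cssColor) {
        if (!cssColor.matches("#[0-9a-fA-F]{6}")) {
            throw new IllegalArgumentException("Expected a color like #ccffff: " + cssColor);
        }
        return "<html>\n"
                + "<body style=\"background-color: " + cssColor + ";\">\n"
                + fragmentHtml + "\n"
                + "</body>\n"
                + "</html>";
    }
}
```

For example, wrap("<div><p><span>Test document</span></p></div>", "#ccffff") yields a document with the same body style as the markup above.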

I was hoping to resolve the border and the background color together, but if the border is going to be a challenge then I will have to rethink my HTML generation. Thanks for the heads up.

@oraspose If you have control over the content generation process, I would suggest avoiding HTML: either generate the content directly in an MS Word document or use SVG instead of HTML.