Text is covered by an image after converting to HTML

Hi there

I am using Aspose.Words 16.6.0 to convert Word files to HTML files.

I found that some of the text in the resulting HTML is covered by an image.
In MS Word, however, no matter how you adjust the margins or the page size, the image never covers the text.

Here is my code to test:

Document doc = new Document("custom/input/docx/20160725_2.docx");
Document pageDoc;
LayoutCollector layoutCollector;
DocumentPageSplitter splitter;
System.out.println(doc.getPageCount());

layoutCollector = new LayoutCollector(doc);
doc.updatePageLayout();
splitter = new DocumentPageSplitter(layoutCollector);

HtmlSaveOptions saveOp = new HtmlSaveOptions();
saveOp.setExportTextInputFormFieldAsText(false);
saveOp.setExportTocPageNumbers(true);
saveOp.setExportPageSetup(true);
saveOp.setExportDocumentProperties(true);
saveOp.setExportRelativeFontSize(false);

for (int p = 1; p <= doc.getPageCount(); p++)
{
    pageDoc = splitter.GetDocumentOfPage(p);
    pageDoc.save("custom/input/docx/20160725_2.docx." + p + ".html", saveOp);
}

Please see the attachment. Thanks.

Hi Craig,

Thanks for your inquiry. Please note that the MS Word and HTML formats are quite different, so it is sometimes hard to achieve 100% fidelity. When processing HTML, some features might be lost. You can find the list of limitations for HTML export and import here:

Load in the HTML (.HTML, .XHTML, .MHTML) Format
Save in the HTML (.HTML, .XHTML, .MHTML) Format

Aspose.Words mimics the behavior of MS Word. If you convert your document to HTML using MS Word, you will get the same output.

Could you please share more detail about your query? We will then provide more information.

Hi

Please see the attached image for comparison.
If you narrow the HTML page to roughly the width of the original document, you can see that the image covers the text.

Hi Craig,

Thanks for sharing the detail. Please note that Aspose.Words mimics the behavior of MS Word: if you convert your document to HTML using MS Word, you will get the same output. We suggest saving your document in the HtmlFixed file format, as shown in the following code snippet. Hope this helps.

HtmlFixedSaveOptions saveOp = new HtmlFixedSaveOptions();

for (int p = 1; p <= doc.getPageCount(); p++)
{
    pageDoc = splitter.GetDocumentOfPage(p);
    pageDoc.save("custom/input/docx/20160725_2.docx." + p + ".html", saveOp);
}

Hi Tahir Manzoor

Thanks for your advice! This actually works: the image no longer covers the text.

However, our client has an additional requirement.

We use HtmlSaveOptions instead of HtmlFixedSaveOptions because the text then collapses or expands automatically when the window is resized.
That way our client does not need to scroll left and right, only up and down, which is more convenient on mobile devices.

Is there any way for HtmlFixedSaveOptions to achieve this and also solve the image-covering problem?

Hi Craig,

Thanks for sharing the detail. This is the expected behavior of Aspose.Words. Your document contains a GroupShape with an absolute position, and it is the first node of the document. To get the desired output, please insert the GroupShape after the first paragraph with the inline wrap type, and remove the empty paragraphs before the text “圖4-1:研究架構圖” (Figure 4-1: Research framework diagram).

Hope this helps you.

Hi Tahir Manzoor

With the following code inside the per-page for loop, we get the desired output:

// Take a snapshot first: getChildNodes() returns a live collection, so removing
// and appending nodes while indexing into it could skip a shape.
Node[] groupShapes = pageDoc.getChildNodes(NodeType.GROUP_SHAPE, true).toArray();
for (Node node : groupShapes)
{
    GroupShape s = (GroupShape)node;

    // Only shapes without text wrapping can overlap the text.
    if (s.getWrapType() == WrapType.NONE)
    {
        GroupShape newS = (GroupShape)s.deepClone(true);
        CompositeNode parent = s.getParentNode();
        newS.setWrapType(WrapType.TOP_BOTTOM);

        // Replace the original shape with the re-wrapped clone at the end of its parent.
        parent.removeChild(s);
        parent.appendChild(newS);
    }
}

Furthermore, our QA team still has a question.
If the author intends the image to cover the text, as in the document newly uploaded in the attachment, can this be determined programmatically?

Hi Craig,

Thanks for your inquiry. Shape.BehindText specifies whether the shape is below or above text.

You can check the shape’s position by using the Shape.RelativeHorizontalPosition, Shape.RelativeVerticalPosition, Shape.HorizontalAlignment, Shape.VerticalAlignment, Shape.Left and Shape.Top properties. You can find the members of the Shape class here:
https://reference.aspose.com/words/java/com.aspose.words/shape/
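
As a rough illustration of how the wrap type and the BehindText flag might be combined, here is a minimal sketch in plain Java. The class OverlapCheck, the enum Overlap and the method classify are hypothetical names and not part of Aspose.Words. The two booleans are assumed to be read from the shape, for example via shape.getWrapType() == WrapType.NONE and shape.getBehindText(). The classification rule itself is an assumption about how one might interpret these values, not documented library behavior.

```java
// Hypothetical helper, not part of Aspose.Words: classifies how a floating
// shape relates to the surrounding text from two values read off the shape.
class OverlapCheck {
    enum Overlap { IN_FRONT_OF_TEXT, BEHIND_TEXT, TEXT_WRAPS_AROUND }

    // wrapNone:   assumed to come from shape.getWrapType() == WrapType.NONE
    // behindText: assumed to come from shape.getBehindText()
    static Overlap classify(boolean wrapNone, boolean behindText) {
        if (!wrapNone) {
            // With any other wrap type the text is laid out around
            // (or in line with) the shape, so the two do not overlap.
            return Overlap.TEXT_WRAPS_AROUND;
        }
        // Without wrapping, the shape overlaps the text, either above or below it.
        return behindText ? Overlap.BEHIND_TEXT : Overlap.IN_FRONT_OF_TEXT;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, false));  // IN_FRONT_OF_TEXT
        System.out.println(classify(true, true));   // BEHIND_TEXT
        System.out.println(classify(false, false)); // TEXT_WRAPS_AROUND
    }
}
```

A result of IN_FRONT_OF_TEXT only says that the shape is allowed to overlap the text. Whether the author actually intended the overlap cannot be read from these properties alone, so the result is best treated as a hint for manual review.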