HTML import detects target theme but does not use theme style

Hello,

I am trying to import some HTML code into a docx-document. No CSS is used (yet).
As can be seen in the attached samples, the themes of the docx are matched (e.g. heading 1 => "heading 1") but the style information is ignored.
By clicking on the theme in Word the heading/paragraph is formatted as expected.
How can I achieve this automatically?
Beware: Clicking the theme while selecting a list item will remove the list. This should be avoided in an automated solution.

When I use "appendDocument(...html)" instead of insertHtml, the result looks a bit better. However, there is still some unexpected formatting. Furthermore, some themes are changed to reflect the current html (instead of applying the theme to the html).
doc.appendDocument(docHtml, ImportFormatMode.USE_DESTINATION_STYLES);

- Aspose-Version: 15.6.0 / 15.7.0 (evaluation)
- Samples attached
- Documentation does not seem to reflect this scenario (e.g. http://www.aspose.com/docs/display/wordsjava/Text+Features+Supported+on+HTML+Import)

public static void main(String[] args) throws Exception {
    // HTML fragment without any CSS or inline style information.
    String html = "" +
        "<h1>heading no 1 with some text, no style info</h1>\n" +
        "\n" +
        "<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Maecenas condimentum tortor quis tortor tincidunt, a luctus orci consequat. Pellentesque cursus justo in ullamcorper congue. Sed auctor, est non mattis maximus, sem felis aliquet justo, tincidunt efficitur tellus mi quis purus. Mauris facilisis rhoncus sapien, sit amet molestie mi dignissim sed. Integer fringilla lectus nisl. Sed faucibus molestie nibh ut blandit. Integer tempus lorem at nisl laoreet pretium eget ut sapien. Sed porta neque eu magna suscipit ullamcorper. Sed non nulla et arcu blandit interdum et a justo. Proin eu lectus tortor. Duis molestie, massa eu euismod tincidunt, felis ex convallis erat, semper sollicitudin sem massa ac libero. Phasellus viverra enim ac velit pulvinar, eget eleifend felis ultricies. Sed in lacinia metus. Nullam laoreet dui sit amet pharetra condimentum. Integer eget urna ex. Praesent a consequat diam, non blandit ipsum.</p>\n" +
        "\n" +
        "<p>Some other Text in new Line</p>\n" +
        "\n" +
        "<h2>Subheading no (heading 2)</h2>\n" +
        "\n" +
        "<ol>\n" +
        "\t<li>List opt #1</li>\n" +
        "\t<li>List opt #2</li>\n" +
        "\t<li>List opt #3\n" +
        "\t<ol>\n" +
        "\t\t<li>List opt #3.1</li>\n" +
        "\t\t<li>List opt #3.2</li>\n" +
        "\t</ol>\n" +
        "\t</li>\n" +
        "</ol>\n" + "";

    // Document and DocumentBuilder come from com.aspose.words.
    Document doc = new Document("Lorem Ipsum - Zweite Seite.docx");
    DocumentBuilder builder = new DocumentBuilder(doc);

    // Insert the HTML at the end of the existing document.
    builder.moveToDocumentEnd();
    builder.insertHtml(html);

    doc.save("CkSample4Aspose.docx");
}



Hi Tahir,


thank you for your extensive reply. I now understand how HTML insertion works in Aspose.

insertHtml (useBuilderFormatting: false):
“Element Style” > “CSS Class + Global CSS” > “Html Defaults”
(DocX-Themes etc. are identified, but largely ignored)

insertHtml (useBuilderFormatting: true):
“Element Style” > “CSS Class + Global CSS” > “Builder Formatting” > "Docx Themes"

appendDocument(new Document("htmlcode.html"))
Causes a mixed result, since "new Document" imports the HTML into an empty document. The result seems to be a mix of both insert modes, but this is probably not necessary in my use case.

It is important to note that any matching CSS always overrides the builder formatting. E.g. the following CSS rule will always cause any inserted HTML to be in Arial.
body
{
font-family: Arial;
}
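
The precedence described above can be modelled as a simple first-match lookup. The following is only an illustrative sketch in plain Java, not Aspose.Words code: the class FormattingPrecedence, the method resolveFont and all font names are hypothetical, and the model only restates the order observed in this thread under the assumption that each level either supplies a font or supplies nothing (null).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/**
 * Toy model of the formatting precedence observed for insertHtml.
 * Hypothetical helper for illustration only; it does not call Aspose.Words.
 */
final class FormattingPrecedence {

    private FormattingPrecedence() {
    }

    /**
     * Returns the first font that is set (non-null) in precedence order.
     *
     * @param useBuilderFormatting mirrors the second argument of insertHtml
     * @param elementStyleFont     font from an inline element style, or null
     * @param cssFont              font from a CSS class or global CSS, or null
     * @param builderFont          font currently set on the builder, or null
     * @param docxStyleFont        font of the matched docx style, or null
     * @param htmlDefaultFont      font taken from the HTML defaults, or null
     */
    static String resolveFont(boolean useBuilderFormatting,
                              String elementStyleFont,
                              String cssFont,
                              String builderFont,
                              String docxStyleFont,
                              String htmlDefaultFont) {
        List<String> chain = new ArrayList<>();
        // Element style and CSS come first in both modes.
        chain.add(elementStyleFont);
        chain.add(cssFont);
        if (useBuilderFormatting) {
            // Builder formatting, then the styles of the target docx.
            chain.add(builderFont);
            chain.add(docxStyleFont);
        } else {
            // Docx styles are ignored; HTML defaults are the fallback.
            chain.add(htmlDefaultFont);
        }
        return chain.stream().filter(Objects::nonNull).findFirst().orElse(null);
    }
}
```

In this model, a call such as resolveFont(true, null, "Arial", "Calibri", "Cambria", "Times New Roman") returns "Arial", and so does the same call with false as the first argument, which matches the observation that a global body rule wins in both modes. Only when no CSS applies does the builder formatting (or, with false, the HTML default) take effect.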

Thanks,
Alexander

Hi Alexander,

Thanks for your inquiry. Please note that Aspose.Words mimics the same behavior as MS Word does.
a.bender:

appendDocument(new Document("htmlcode.html"))
Causes a mixed result, since "new Document" imports the HTML into an empty document. The result seems to be a mix of both insert modes, but this is probably not necessary in my use case.

Please read the following documentation about how appendDocument works and the differences between the ImportFormat modes.
http://www.aspose.com/docs/display/wordsjava/How+the+AppendDocument+Method+Works
http://www.aspose.com/docs/display/wordsjava/Differences+between+ImportFormat+Modes
a.bender:

It is important to note that any matching CSS always overrides the builder formatting. E.g. the following CSS rule will always cause any inserted HTML to be in Arial.
body { font-family: Arial; }

In both cases, whether useBuilderFormatting is false or true, the inserted text will be Arial. This is explained in my previous post.

Hi Tahir,


thank you for your information.

I have one more question on this topic. If I import the following HTML, everything is imported correctly except for the <h1> inside the table.


You can see the difference if you compare it to the other headline. The headline should be red like the other headline. All other formatting is correctly applied.

Best regards,
Alexander

@Test
public void shouldImportNoCssTableIntoDocxTemplate() throws Exception {
    Path tempDirPath = Files.createTempDirectory("shouldImportNoCssTableIntoDocxTemplate");
    String tempDir = tempDirPath + File.separator;
    System.out.println(tempDirPath);

    // Load the Word template that supplies the target styles.
    Document doc = new Document(getResourceAsStream(RESOURCE_DOTM_NORMALDOT));
    DocumentBuilder builder = new DocumentBuilder(doc);

    String html = IOUtils.toString(getResourceAsStream(RESOURCE_HTML_NOCSS_TABLE));

    builder.insertHtml(html, /* useBuilderFormatting */ true);

    // Save in both formats to compare the output.
    doc.save(tempDir + "out.docx");
    doc.save(tempDir + "out.pdf");
}

<html>
<head>
<title>Tabelle</title>
<style>
table, th, td {
border: 1px solid black;
}
</style>
</head>
<body>
<p><h1>Einfache Tabelle in HTML</h1></p>
<table>
<tr>
<th>
Ueberschrift 1
</th>
<th>
Ueberschrift 2
</th>
</tr>
<tr>
<td>
Zelle 1!
</td>
<td>
<h1>Zelle 2 als Ueberschrift h1 </h1>
</td>
</tr>
</table>
</body>
</html>

Hi Alexander,

Thanks for your inquiry. I have tested the scenario and managed to reproduce the issue on my side. I have logged this problem in our issue tracking system as WORDSNET-12418 and linked this forum thread to it. You will be notified via this forum thread once the issue is resolved.

We apologize for the inconvenience.

The issues you have found earlier (filed as WORDSNET-12418) have been fixed in this .NET update and this Java update.

