Aspose.Words does not honor <base> tag when importing HTML to a Word document

Hello,

We have an issue when creating a Word document from an HTML stream. The issue concerns the HTML <a> tag: the resulting link in the Word document is broken because its relative href is not resolved against the value from the <base> tag.

We are using Aspose Words version 15.6.0.

Here is the Java code snippet that reproduces this problem:

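import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;

import com.aspose.words.Document;

public void toAsposeWord() throws Exception {
    // Minimal HTML page: a <base> tag in the head and a relative link in the body.
    String html = "<html>" +
        "<head>" +
        "<base href=\"http://localhost:8080/application/\">" +
        "</head>" +
        "<body>" +
        "We are expecting a full url for the link!<br>" +
        "<a href=\"summary?fromPage=1&toPage=77\">Page summary</a>" +
        "</body>" +
        "</html>";

    InputStream contentStream = new ByteArrayInputStream(html.getBytes(Charset.forName("UTF-8")));

    // Load the HTML stream into an Aspose.Words document and save it as .doc.
    Document doc = new Document(contentStream);
    doc.save("/word.doc");
}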
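
For reference, this is the link target we expect, i.e. what a browser produces when it applies the base URL to the relative href. The sketch below uses only the standard java.net.URI class; the class and method names (BaseHrefExpectation, resolve) are made up for illustration and are not part of Aspose.Words.

```java
import java.net.URI;

class BaseHrefExpectation {
    // Resolves a relative href against a base URL using standard URI resolution,
    // which is how we expect the <base href> value to be applied.
    static String resolve(String baseHref, String relativeHref) {
        return URI.create(baseHref).resolve(relativeHref).toString();
    }

    public static void main(String[] args) {
        // Values taken from the HTML in the snippet above.
        System.out.println(resolve("http://localhost:8080/application/",
                                   "summary?fromPage=1&toPage=77"));
    }
}
```

With the values above, the resolved link is http://localhost:8080/application/summary?fromPage=1&toPage=77, whereas the hyperlink in the generated document does not include the base part.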

And here is the html again so it is easier to read:


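<html>
<head>
<base href="http://localhost:8080/application/">
</head>
<body>
We are expecting a full url for the link!<br>
<a href="summary?fromPage=1&toPage=77">Page summary</a>
</body>
</html>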

Any feedback on how to resolve this issue is appreciated.

Thank you for your help.
-M


Hi Miguel,

Thanks for your inquiry. If you save this HTML to .docx format using Microsoft Word 2013, you will observe the same behavior. Please see the attached .docx document generated by Microsoft Word 2013. So this seems to be expected behavior, as Aspose.Words mimics Microsoft Word in this case. If we can help you with anything else, please feel free to ask.

Best regards,

Hello Awais,


I was under the impression that loading HTML into a Word document using Aspose already does a lot of extra work to get the resulting Word document to match the HTML rendering, is that correct? In that case doesn’t it seem appropriate that the links would take into account the BASE tag when importing into a Word document the same as the HTML document does?

-Dylan Gulick
Jama Software

Hi Dylan,


Thanks for the additional information. We have logged your requirement in our issue tracking system as WORDSNET-12286. Our product team will further look into the details of this problem and we will keep you updated on the status of this issue. We apologize for any inconvenience.
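
Until a fix is available, one possible workaround is to resolve the relative links yourself before passing the HTML to Aspose.Words. The following is only a minimal sketch under some assumptions: it uses a regular expression rather than a real HTML parser, it only handles double-quoted href attributes on <a> tags, and the helper name absolutizeHrefs is hypothetical (it is not an Aspose.Words API). The base URL has to be supplied by the caller.

```java
import java.net.URI;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HrefAbsolutizer {
    // Matches the double-quoted href value of <a ...> tags only,
    // so the <base href="..."> tag itself is left untouched.
    private static final Pattern A_HREF = Pattern.compile(
        "(<a\\b[^>]*?\\bhref\\s*=\\s*\")([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: rewrites each <a href> to an absolute URL using baseHref.
    static String absolutizeHrefs(String html, String baseHref) {
        URI base = URI.create(baseHref);
        Matcher m = A_HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String resolved;
            try {
                resolved = base.resolve(m.group(2)).toString();
            } catch (IllegalArgumentException e) {
                resolved = m.group(2); // keep hrefs that are not valid URIs as they are
            }
            m.appendReplacement(out,
                Matcher.quoteReplacement(m.group(1) + resolved + m.group(3)));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The returned string can then be converted to a stream and loaded with new Document(stream), exactly as in your original snippet.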

Best regards,

The issues you have found earlier (filed as WORDSNET-12286) have been fixed in this .NET update and this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.
