I tried converting an HTML document to DOCX using Aspose.Words for .NET. It works fine, but if the HTML contains images, they do not show up in the converted DOCX file.
I'm trying to use this inside SharePoint 2010. I will be creating the HTML by joining the list items, so the image URL might be like “libraryname/imagename.png” or an HTTP URL.
Thanks for your request. The problem might occur because Aspose.Words cannot find the images at the specified paths. Have you tried specifying the full path or full URL of each image? I think, in this case, the images will be correctly imported into the document.
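To illustrate the suggestion above, here is a minimal sketch of turning a relative image path into a full URL with System.Uri before the HTML is handed to Aspose.Words. The site address http://server/sites/demo/ and the names ImageUrlHelper and ToAbsoluteUrl are made up for this example; they are not part of Aspose.Words or SharePoint.

```csharp
using System;

public static class ImageUrlHelper
{
    // Resolves an image path against the site's base URL.
    // If imagePath is already an absolute URL, it is returned unchanged.
    // Note: the base URL should end with "/" so the last segment is kept.
    public static string ToAbsoluteUrl(string siteBaseUrl, string imagePath)
    {
        Uri baseUri = new Uri(siteBaseUrl, UriKind.Absolute);
        Uri resolved = new Uri(baseUri, imagePath);
        return resolved.AbsoluteUri;
    }

    public static void Main()
    {
        // Hypothetical site address, used only for illustration.
        string site = "http://server/sites/demo/";

        Console.WriteLine(ToAbsoluteUrl(site, "libraryname/imagename.png"));
        Console.WriteLine(ToAbsoluteUrl(site, "http://www.example.com/logo.png"));
    }
}
```

The resolved URL would then replace the original src value in the HTML. Whether the image actually loads still depends on the URL being reachable from the machine that runs the conversion.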
I tried giving the full URL; that also doesn't seem to work.
Thanks for your inquiry. Could you please attach the input HTML document (with images) that you are having the problem with? I will check it on my side and provide you with more information.
Please find the sample HTML code…
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <html> <head> <title></title> </head> <body> Looks how cool is <font size="x-large"><b>Open Xml</b></font>. Now with <font color="red"><u>HtmlToOpenXml</u></font>, it nevers been so easy to convert html. <p> If you like it, add me a rating on <a href="http://notesforhtml2openxml.codeplex.com">codeplex</a> </p> <a href="http://www.wikipedia.org"> <img alt="Wikipedia, the Free Encyclopedia" src="_layouts/TestProject/Wikipedia-logo.png" /> </a> <table width="50%" align="center" border="1"> <tr> <td rowspan="2">Anime Studio</td> <td>Pixar</td> </tr> <tr> <td>Studio Ghibli</td> </tr> </table> <table width="100%" border="1"> <tr style="font-weight: bold"> <td>Studio</td> <td colspan="2">Animes</td> </tr> <tr> <td>Pixar</td> <td>The incredibles</td> <td>Ratatouille</td> </tr> <tr> <td>Studio Ghibli</td> <td>Grave of the Fireflies</td> <td>Spirited Away</td> </tr> </table> The <abbr title="World Health Organization">WHO</abbr> was founded in 1948. </body> </html>
Thank you for the additional information. I cannot reproduce the problem on my side using the latest version of Aspose.Words (10.4.0). Here are my steps:
- I created a virtual directory on my machine.
- Then I added the images to this directory.
- And then I converted the HTML to DOC.
Moreover, I tried the following code, and it works fine too:
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// Insert the HTML at the builder's current position.
builder.InsertHtml("");
doc.Save("C:\\Temp\\out.doc");
I think in your case you can try using MHTML, which is actually HTML with all resources embedded. Alternatively, you can try storing your images as base64 in your HTML documents. Using Aspose.Words, you can import and export HTML with such images:
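As a rough illustration of the base64 approach, the sketch below builds a data URI from raw image bytes using only the .NET base class library. The names DataUriHelper, ToDataUri and ToImgTag are hypothetical, and the four bytes are just the start of a PNG signature standing in for a real file, which would normally be read with File.ReadAllBytes.

```csharp
using System;

public static class DataUriHelper
{
    // Builds a data URI such as "data:image/png;base64,...." from raw bytes.
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }

    // Wraps the data URI in an <img> tag so the image travels inside the HTML.
    public static string ToImgTag(byte[] imageBytes, string mimeType)
    {
        return "<img src=\"" + ToDataUri(imageBytes, mimeType) + "\" />";
    }

    public static void Main()
    {
        // Placeholder bytes (first four bytes of the PNG signature),
        // used here instead of a real image file.
        byte[] bytes = { 0x89, 0x50, 0x4E, 0x47 };

        Console.WriteLine(ToImgTag(bytes, "image/png"));
    }
}
```

The resulting tag can be placed in the HTML string that is passed to builder.InsertHtml, so that no separate image request is needed during conversion.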