HTML to Word in SharePoint 2013

Hello,

We are evaluating Aspose for converting HTML entered in a rich text editor to Word. We use TinyMCE as the rich text editor. Please note that all actions are performed in memory, with no access to the file system. Here are our questions:

  1. We would like to base the Word document on a template. How can we have Aspose write to an existing Word document?
  2. How do we attach custom CSS to define the styles used when creating the Word document?
  3. Does Aspose support both inline and external CSS files?
  4. Some of the issues we notice are around alignment and spacing. See the attached image and the HTML used for an example.

Do you have any best practices for converting from HTML to Word with Aspose, or any suggestions on how to resolve our issues?

Thanks

Hi Rajesh,

Thanks for your interest in the Aspose.Words for .NET API. Yes, you can easily load documents from memory and, after processing, save them back to memory using Aspose.Words.

1 - After loading the document from memory, you can move the cursor to any desired place in the document and then insert content there, for example HTML, using the DocumentBuilder.InsertHtml("html string") method.

2 - You can pre-process your HTML string as follows before passing it to Aspose.Words:

<html>
<head>
    <title>Aspose.Words</title>
    <style type="text/css">
        .style1 {
            color: red;
            font-size: larger;
        }
    </style>
</head>
<body>
    your html string goes here
</body>
</html>

Alternatively, use the DOM classes of Aspose.Words, such as the Style class, to build new styles and apply them to the newly inserted HTML content.
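The pre-processing in point 2 is plain string work and can be done before the HTML ever reaches Aspose.Words. The following is only a minimal sketch in Python, using the standard library; the `wrap_fragment` helper and the `PAGE` template are hypothetical names for illustration and are not part of Aspose.Words. It wraps the fragment coming out of the editor in the page shell shown above, with the style sheet embedded in the head.

```python
from string import Template

# Page shell modelled on the HTML sample above. The template itself is
# an illustration, not something Aspose.Words provides.
PAGE = Template("""<html>
<head>
    <title>$title</title>
    <style type="text/css">
$css
    </style>
</head>
<body>
$body
</body>
</html>""")


def wrap_fragment(fragment: str, css: str, title: str = "Aspose.Words") -> str:
    """Wrap an editor fragment in a full HTML document with embedded CSS.

    The fragment is inserted verbatim (it is already HTML), so it is not escaped.
    """
    return PAGE.substitute(title=title, css=css, body=fragment)


# Example: the .style1 rule from the sample above applied to one paragraph.
css = ".style1 { color: red; font-size: larger; }"
html = wrap_fragment('<p class="style1">Hello</p>', css)
```

The resulting string is what would then be passed on to the conversion step.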

3 - Yes. The following three ways are supported:

  • When CSS styles are written inline (as a value of the style attribute on every element).
  • When CSS styles are written separately from the content in a style sheet embedded in the HTML file.
  • When CSS styles are written separately from the content in a style sheet in an external file such that the HTML file links the style sheet.
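To see which of these three forms a given piece of editor output actually uses, a quick check can help before conversion. This is only a diagnostic sketch in Python using the standard-library `html.parser` module; the `CssSourceCounter` class is a hypothetical helper, not part of Aspose.Words, and it says nothing about how Aspose.Words itself parses CSS.

```python
from html.parser import HTMLParser


class CssSourceCounter(HTMLParser):
    """Count the three CSS forms listed above in one HTML document."""

    def __init__(self):
        super().__init__()
        self.inline = 0      # elements carrying a style="..." attribute
        self.embedded = 0    # <style> blocks inside the document
        self.external = []   # href values of <link rel="stylesheet">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "style" in attrs:
            self.inline += 1
        if tag == "style":
            self.embedded += 1
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "stylesheet" in rel and attrs.get("href"):
            self.external.append(attrs["href"])


# Made-up sample that uses all three forms once.
sample = """<html><head>
<link rel="stylesheet" href="site.css">
<style type="text/css">.style1 { color: red; }</style>
</head><body><p style="margin-left: 10px">text</p></body></html>"""

counter = CssSourceCounter()
counter.feed(sample)
print(counter.inline, counter.embedded, counter.external)
```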

Rajesh:
Some of the issues we notice are around alignment and spacing. See the attached image and the HTML used for an example. Do you have any best practices for converting from HTML to Word with Aspose, or any suggestions on how to resolve our issues?

Aspose.Words generally tries to mimic the behavior of MS Word. We have converted your HTML to DOCX format using MS Word 2016 and attached the result here for your reference (see attached msw-2016.docx). However, so that any needed corrections can be considered in Aspose.Words, we have logged this problem in our issue tracking system as WORDSNET-12808. Our product team will look further into the details of this problem, and we will keep you updated on the status of the correction. We apologize for any inconvenience.

Best regards,

Hi,

Regarding WORDSNET-12808, our product team has completed work on your issue and concluded that the behaviour you’re observing is actually not a bug. So, we will close this issue as ‘Not a Bug’.

Aspose.Words tries to mimic the behavior of MS Word. In this case, however, Aspose.Words intentionally imports list item markers differently than MS Word in order to make the result look closer to the source HTML document. For example, when the attached HTML document is viewed in a browser, the bullets of the first list are in fact not aligned with the numbers of the second list, and there is a large space between the numbers and the text in the third list. Aspose.Words preserves those formatting aspects, while MS Word fails to do so.

We believe the formatting of list items in the document produced by Aspose.Words looks better than in the one produced by MS Word, so we do not plan to change the current behavior of Aspose.Words. If we can help you with anything else, please feel free to ask.

Best regards,

Thanks for the quick response. Does Aspose support loading background images defined in CSS when converting from HTML to Word? If so, can you provide an example?

Thanks

Hi,

Thanks for your inquiry. It should work. However, please zip and attach your sample input HTML/CSS files here for testing. We will investigate the scenario on our end and provide you with more information.

Best regards,

Thanks, Awais. I attached the HTML file, the CSS, and an image of how the link should render in Word. Please note that the server requires authentication to download any items. I implemented the IResourceLoadingCallback interface to allow downloads from the server. I see requests for images in the body and for stylesheets, but not for the images referenced in the CSS. Example:

background-image: url("/_layouts/15/xxx/common/img/info.png");

I included the BaseUri attribute when instantiating the Document class.
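To make explicit which requests would be expected for CSS like the example above, here is a rough sketch in Python (standard library only) that extracts `url(...)` references from a style sheet and resolves them against a base URI. The `css_resource_urls` helper and the host name `sharepoint.example.com` are placeholders for illustration, and this is not how Aspose.Words resolves resources internally. Note also that browsers resolve relative CSS URLs against the style sheet's own location; for a root-relative path like the one above, the result is the same as long as the host is the same.

```python
import re
from urllib.parse import urljoin

# Matches url(...) with optional single or double quotes around the target.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""", re.IGNORECASE)


def css_resource_urls(css: str, base_uri: str) -> list:
    """Return every url(...) target in the CSS, resolved against base_uri."""
    return [urljoin(base_uri, m.group(2)) for m in CSS_URL.finditer(css)]


# The declaration from the post; the base URI is a made-up placeholder.
css = 'background-image: url("/_layouts/15/xxx/common/img/info.png");'
urls = css_resource_urls(css, "https://sharepoint.example.com/sites/team/page.aspx")
print(urls)
```

Comparing this list with the URIs that actually reach the resource loading callback shows which references are never requested.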

Thanks

Hi,

Thanks for your inquiry. When you save this .html file to .docx format using Microsoft Word 2016, you’ll observe the same behavior. Please see the .docx document generated by Microsoft Word 2016, attached to this post. So, this seems to be expected behavior, as Aspose.Words mimics Microsoft Word in this case. If we can help you with anything else, please feel free to ask.

Best regards,