Free Support Forum - aspose.com

Importing from HTML

Hello,

We are currently evaluating using Aspose to convert a specially formatted HTML page into a Word document, as an alternative download option for our website. I understand that CSS support for HTML import is limited. Do you have any documentation on which CSS properties you do support, and on which nodes? Finding it out through testing is quite difficult.

In addition, I have two questions. First, is it possible to pass some information about the CSS classes in the HTML through to the imported document? Our current setup uses a template Word document, which contains static boilerplate fluff as well as certain dummy paragraphs that are replaced by the document converted from HTML (which is dynamically generated). If the CSS classes can somehow be mapped to a Word style in the template document (through a post-processor), it would make this perhaps sad design slightly more bearable.

The second is whether I can tell the importer to create more than one section. We might need this to give different page styles to parts of the content, for example making some pages landscape and others portrait. It would be great if the sections could then have associated classes.

Thanks!


Rick

Hi

Thank you for your interest in Aspose.Words.

1. Unfortunately, no such list is publicly available. Currently we only have a list of the MS Word features that are supported when exporting to HTML. You can find this list here:

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/what-document-features-are-supported-1.html

In the future we will add a similar list for HTML import.

2. Sure, it is possible. For instance, suppose you have HTML like the following:

<html>
<head>
  <style type="text/css">
    .myParagraphStyle { color:Red; }
  </style>
</head>
<body>
  <p class="myParagraphStyle">This is paragraph text.</p>
</body>
</html>

When you open such HTML using Aspose.Words, the paragraph will have the “myParagraphStyle” style applied. So you can create a style with the same name in your destination document and append the HTML document using the ImportFormatMode.UseDestinationStyles option. For example, see the following code:

// Open source HTML.
Document src = new Document(@"Test001\Test.html");

// Open destination document. (The document contains a predefined style named "myParagraphStyle".)
Document dst = new Document(@"Test001\dst.doc");

// Append source document to the destination with ImportFormatMode.UseDestinationStyles option.
dst.AppendDocument(src, ImportFormatMode.UseDestinationStyles);

// Save output.
dst.Save(@"Test001\out.doc");
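
If the CSS class names cannot be made to match the style names in the template, a post-processing pass over the paragraphs after appending is one possible approach. This is only a sketch under assumptions: the class-to-style mapping and the style name "Body Highlight" are hypothetical, and it relies on imported paragraphs carrying the CSS class name as their style name, as described above.

```csharp
using System.Collections.Generic;
using Aspose.Words;

class StyleRemapSketch
{
    static void Main()
    {
        // Hypothetical mapping from CSS class names to styles defined in the template.
        // "Body Highlight" is an assumed style name, not one from the original post.
        var classToStyle = new Dictionary<string, string>
        {
            { "myParagraphStyle", "Body Highlight" }
        };

        Document src = new Document(@"Test001\Test.html");
        Document dst = new Document(@"Test001\dst.doc");

        dst.AppendDocument(src, ImportFormatMode.UseDestinationStyles);

        // Walk every paragraph in the combined document and swap the style name
        // when a mapping exists and the target style is present in the template.
        foreach (Paragraph para in dst.GetChildNodes(NodeType.Paragraph, true))
        {
            string target;
            if (classToStyle.TryGetValue(para.ParagraphFormat.StyleName, out target)
                && dst.Styles[target] != null)
            {
                para.ParagraphFormat.StyleName = target;
            }
        }

        dst.Save(@"Test001\out.doc");
    }
}
```

Note that the loop also touches paragraphs that were already in the template if they use a mapped style name, so the mapping keys should be names that only the generated HTML uses.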

3. Sure, you can insert section breaks in your HTML. You should use BR tags like the following to achieve this:

<br style="page-break-before:always; clear:both; mso-break-type:section-break" />

Also, you can specify page setup in HTML. Please see the following HTML:

<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Style-Type" content="text/css" />
  <title></title>
  <style type="text/css">
    @page Section1
    {
      margin: 56.7pt 42.5pt 56.7pt 85.05pt;
      size: 612pt 792pt;
    }
    div.Section1
    {
      page: Section1;
    }
    @page Section2
    {
      margin: 85.05pt 56.7pt 42.5pt;
      size: 792pt 612pt;
    }
    div.Section2
    {
      page: Section2;
    }
  </style>
</head>
<body>
  <div class="Section1">
    <p style="font-size: 11pt; line-height: 115%; margin: 0pt 0pt 10pt">
      <span style="font-family: Calibri; font-size: 11pt"> </span></p>
  </div>
  <br style="clear: both; mso-break-type: section-break; page-break-before: always" />
  <div class="Section2">
    <p style="font-size: 11pt; line-height: 115%; margin: 0pt 0pt 10pt">
      <span style="font-family: Calibri; font-size: 11pt"> </span></p>
  </div>
</body>
</html>
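
To check how such HTML was imported, you could load it and print the page setup of each resulting section. This is a minimal sketch, assuming the HTML above has been saved under the hypothetical path Test001\Sections.html; what it prints depends on how your version of Aspose.Words interprets the @page rules.

```csharp
using System;
using Aspose.Words;

class SectionCheckSketch
{
    static void Main()
    {
        // Hypothetical path: the two-section HTML from above saved to disk.
        Document doc = new Document(@"Test001\Sections.html");

        Console.WriteLine("Sections: {0}", doc.Sections.Count);

        // Print page size (in points) and orientation for every imported section.
        foreach (Section section in doc.Sections)
        {
            PageSetup ps = section.PageSetup;
            Console.WriteLine("{0} x {1} pt, {2}", ps.PageWidth, ps.PageHeight, ps.Orientation);
        }
    }
}
```

If the second section does not come out as landscape, the same PageSetup object can be adjusted in code after import.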

Hope this helps.

Best regards,

Thanks for the reply. I found that the reason CSS classes didn’t work for us was because we were using an old version (2.6.0)…

Hi!
Why is the first page of the created document empty when using this approach?

Best regards,
Vitaly Vasilega

Hi Vitaly,

Thanks for your request. Could you please clarify which approach you mean? It would also be great if you could provide a sample document and code that will allow us to reproduce your issue. We will check the problem and provide you with more information.

Best regards,

Hi Alexey,

I will prepare the sample in the next few days.


Best regards,
Vitaly Vasilega

Hi Vitaly,

Sure, I will wait for your inputs.

Best regards,

Hi Alexey,

As promised, I am sending you an example (ASP.NET MVC 3).

Best regards,
Vitaly Vasilega

Hi

Thank you for the additional information. The empty page at the beginning is an empty section. When you append one document to another, Aspose.Words copies all sections from the source document to the destination. In your case, the destination document already contains one empty section. If you do not need the appended document to start on a new page, you can change your code as shown below:

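public void FillingAdverseEvents(string value)
{
    // Load the HTML string into a document via an in-memory stream.
    var s = new MemoryStream(Encoding.UTF8.GetBytes(value));
    var src = new Document(s);

    // Make the first appended section continue on the current page
    // instead of starting on a new one.
    src.FirstSection.PageSetup.SectionStart = SectionStart.Continuous;

    _document.AppendDocument(src, ImportFormatMode.UseDestinationStyles);
}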

Hope this helps.

Best regards,

Hi Alexey,
Thank you. It works.

Best regards,
Vitaly Vasilega

The issues you have found earlier (filed as WORDSNET-5557) have been fixed in this .NET update and this Java update.

