We are currently using Converter.ConvertHTML to create a DOCX file from HTML content.
// Use a MemoryStreamProvider to handle the in-memory conversion
var streamProvider = new MemoryStreamProvider();
// Initialize an HTML document
var document = new HTMLDocument(html.Model, ".");
// Save options for the DOCX output
var docSaveOptions = new DocSaveOptions
{
    PageSetup =
    {
        AdjustToWidestPage = true,
        PageLayoutOptions = Aspose.Html.Rendering.PageLayoutOptions.ScaleToPageWidth,
        AnyPage = new Page(new Size(request.Width, request.Height))
    }
};
// Convert HTML to DOCX using the MemoryStreamProvider
Converter.ConvertHTML(document, docSaveOptions, streamProvider);
When a DOCX file is created and opened, it opens in compatibility mode.
When the file is converted (File > Info > Convert), all text content is lost.
Similarly, when the file is opened on a Mac, no text content is present in the DOCX file.
We have tried the following to diagnose / resolve this issue:
- Explicitly setting FontEmbeddingRule = Aspose.Html.Rendering.Doc.FontEmbeddingRule.Full
- Using a primary font and fallback fonts in the CSS of the HTML:
body {
    font-family: Helvetica, sans-serif;
    font-size: 12px;
    margin: 0;
    padding: 0;
    padding-bottom: 50px;
    page-break-before: always;
    box-sizing: border-box;
}
- Removing all CSS altogether.
- Simplifying the HTML content to contain only one element.
- Explicitly setting DocumentFormat = Aspose.Html.Rendering.Doc.DocumentFormat.DOCX
This is now having a business impact on one of our clients.
Any advice or assistance in this matter will be greatly appreciated.
Kind regards
Scotty
@Scott.Robson
It seems you are experiencing issues with the conversion of HTML to DOCX using Aspose.HTML, particularly with the resulting DOCX file opening in compatibility mode and losing text content. Here are some suggestions to help you troubleshoot and resolve this issue:
- Ensure Proper Document Format: Make sure you explicitly set the document format to DOCX in your DocSaveOptions, so that the output is written as a DOCX file. You can do this by setting options.DocumentFormat = Aspose.Html.Rendering.Doc.DocumentFormat.DOCX;
- Check HTML Content: Simplifying your HTML content is a good approach. Ensure that the HTML is well-formed and does not contain any unsupported tags or attributes that might cause issues during conversion. You mentioned reducing the HTML to a single element, such as a <p>; ensure that this element contains valid text.
- Font Embedding: Since you are already setting the FontEmbeddingRule to Full, ensure that the fonts specified in your CSS are available on the system where the DOCX is being opened. If the fonts are not available, it may lead to rendering issues.
- Test with Different CSS: While you have tried removing CSS, consider testing with minimal CSS that only includes essential styles. Sometimes, complex styles can interfere with the conversion process.
- Use a Different Output Method: Instead of using a MemoryStreamProvider, try saving the DOCX file directly to a file path. This can sometimes yield different results and help identify whether the issue is related to the in-memory handling.
Here is a simplified example of how you might structure your code:
using Aspose.Html;
using Aspose.Html.Converters;
using Aspose.Html.Saving;

// Initialize an HTML document
using var htmlDocument = new HTMLDocument("path/to/your/input.html");

// Set up save options: request DOCX output with fully embedded fonts
var docSaveOptions = new DocSaveOptions
{
    DocumentFormat = Aspose.Html.Rendering.Doc.DocumentFormat.DOCX,
    FontEmbeddingRule = Aspose.Html.Rendering.Doc.FontEmbeddingRule.Full
};

// Convert HTML to DOCX, writing directly to a file path
Converter.ConvertHTML(htmlDocument, docSaveOptions, "path/to/your/output.docx");
- Check Compatibility Settings: If the DOCX file opens in compatibility mode, it may be due to the way the document is structured. Ensure that the conversion process does not introduce any legacy elements that could trigger compatibility mode.
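To narrow down both symptoms (compatibility mode and missing text), it may help to look inside the generated file, since a DOCX is a ZIP package of XML parts. The following is a minimal diagnostic sketch, not part of Aspose.HTML: DocxInspector and its methods are hypothetical names, and it uses only the .NET base class library. It rests on two assumptions: that body text is stored in <w:t> runs of word/document.xml, and that Word decides on compatibility mode from the compatibilityMode setting in word/settings.xml (as far as I know, a missing setting or a value below 15 makes recent versions of Word open the file in compatibility mode).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not an Aspose.HTML API): inspects the XML parts of a DOCX package.
public static class DocxInspector
{
    // WordprocessingML main namespace used by document.xml and settings.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the total number of characters stored in <w:t> text runs
    // of word/document.xml, or 0 if that part is missing.
    public static int CountTextCharacters(Stream docx)
    {
        using var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true);
        var entry = zip.GetEntry("word/document.xml");
        if (entry == null) return 0;

        using var part = entry.Open();
        return XDocument.Load(part)
            .Descendants(W + "t")
            .Sum(t => t.Value.Length);
    }

    // Returns the value of the compatibilityMode setting from word/settings.xml,
    // or null if the part or the setting is absent.
    public static string GetCompatibilityMode(Stream docx)
    {
        using var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true);
        var entry = zip.GetEntry("word/settings.xml");
        if (entry == null) return null;

        using var part = entry.Open();
        return XDocument.Load(part)
            .Descendants(W + "compatSetting")
            .Where(e => (string)e.Attribute(W + "name") == "compatibilityMode")
            .Select(e => (string)e.Attribute(W + "val"))
            .FirstOrDefault();
    }
}
```

For example, open the output with File.OpenRead("path/to/your/output.docx") and pass the stream to both methods. A non-zero character count would suggest the text is present in the package but stored in a structure that Word's Convert step or Word for Mac does not preserve; a count of zero would suggest the text is not being written as text runs at all. Either result, together with the compatibilityMode value, is useful detail to include when contacting Aspose support.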
If these suggestions do not resolve the issue, consider reaching out to Aspose support with specific details about your HTML content and the environment in which you are testing. This will help them provide more targeted assistance.
If you need further help, feel free to ask!