We are using Aspose.Words for Java to convert HTML to DOCX format, but we are facing a few issues:
- Alignment and styles are not rendered properly in the converted DOCX output file
- The page break tag is not supported
- The font is not the same as in the HTML
Could you help with this issue?
Input Html fileinput.zip (22.0 KB)
Output File Output Docx - Aspose - AN.docx (30.6 KB)
@srinivasr I have checked your HTML and output document generated by the latest 22.2 version of Aspose.Words for Java and cannot reproduce the problems on my side:
In the HTML document, alignment is specified as
text-align: justify; in the output document, paragraph alignment is properly specified as
Your HTML document does not contain a page break. For a break to be imported as a page break, it must carry special styles, as in the following example:
<br style="page-break-before:always; clear:both" />
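If the HTML is assembled in code before conversion, one way to get this break into the markup is to join the per-page fragments with that tag. Below is a minimal sketch in plain Java. The class and method names (PageBreakHtml, joinWithPageBreaks) are hypothetical and not part of Aspose.Words, and the conversion step itself is not shown.

```java
import java.util.List;

// Hypothetical helper, not part of Aspose.Words: it only prepares the HTML string.
final class PageBreakHtml {
    // The styled <br> quoted in the reply above.
    static final String PAGE_BREAK =
            "<br style=\"page-break-before:always; clear:both\" />";

    // Joins HTML fragments so that each fragment after the first starts on a new page.
    static String joinWithPageBreaks(List<String> sections) {
        return String.join(PAGE_BREAK, sections);
    }

    public static void main(String[] args) {
        String body = joinWithPageBreaks(
                List.of("<p>First page</p>", "<p>Second page</p>"));
        // The resulting markup is what would be handed to the HTML-to-DOCX conversion.
        System.out.println("<html><body>" + body + "</body></html>");
    }
}
```

The tag is kept in a single constant so the exact style string from the reply is not retyped in several places.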
I see that in your document the font is set to
sans-serif, but with the latest version the font is properly set to
Please find the document generated on my side: out.docx (30.6 KB)
Thanks for the quick reply @alexey.noskov
I still have a few doubts:
- If the font specified in the HTML is not present on the system, is there any way we could load the font file? I guess this is probably the issue in my case.
- In the output document you attached, I can see that the table and the text are not aligned with the body. Is there any way to format it?
Image for reference: image.png (17.9 KB)
It is not required to physically have the fonts installed when converting from HTML to DOCX. Have you tried converting your document on your side using the latest version of Aspose.Words?
The problem is caused by the paddings set in your HTML. If you remove the paddings, the content is imported properly.
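If editing the HTML by hand is not practical, the paddings could be stripped in code before conversion. The following is a rough sketch in plain Java that uses a regular expression rather than a real CSS parser, so it is only an assumption about how one might do it. PaddingStripper and stripPadding are hypothetical names, not Aspose.Words API.

```java
import java.util.regex.Pattern;

// Hypothetical pre-processing helper, not part of Aspose.Words.
final class PaddingStripper {
    // Matches "padding" and "padding-top/right/bottom/left" declarations,
    // including the value, an optional trailing semicolon and trailing whitespace.
    // The lookbehind keeps it from matching inside longer names such as "cellpadding".
    private static final Pattern PADDING = Pattern.compile(
            "(?i)(?<![\\w-])padding(?:-(?:top|right|bottom|left))?\\s*:[^;\"'}]*;?\\s*");

    // Returns the HTML with all padding declarations removed.
    static String stripPadding(String html) {
        return PADDING.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<td style=\"padding: 20px; text-align: justify\">cell</td>";
        // Print the cleaned markup that would then be passed to the conversion.
        System.out.println(stripPadding(html));
    }
}
```

A regex is enough for simple inline styles and style blocks, but it will also rewrite any matching text outside CSS, so it is worth checking the result on your own HTML first.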