In my project I have Word-formatted HTML documents stored in a database; the content was written with Free Text Box. I have moved from using Word Automation to insert the HTML into a document template to using Aspose.Words to create the document on the server side.
I am experiencing a formatting issue when reading the Word-formatted HTML into Aspose.Words: wherever there should be a tab, it is replaced by about 8 spaces.
Hello!
Thank you for your request.
Tabulation and tab positions are always a problem in HTML. Standard HTML 4.01 and CSS 2.1 do not define them. Microsoft Word maintains such features with its own specific attributes, such as mso-tab-count. They are not proprietary, but other applications may not support them, and many people don’t like to deal with Microsoft Office “magic” in HTML. Our main goal is working with “pure” HTML without such “magic”. We do round-trip some features as exceptions to this rule. For instance, we export and import section breaks via mso-break-type, because there is no other means to do that. Some other “magic” features are optional, such as exporting document properties.
A possible workaround is pre-processing or post-processing the documents. If these tabs are produced by an application or control that you cannot configure, this would be the only option.
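As a rough illustration of the pre-processing idea, here is a minimal Python sketch. It assumes Word writes each tab as a span whose style contains mso-tab-count (for example `<span style='mso-tab-count:1'>&nbsp;&nbsp;&nbsp; </span>`) and that those spans contain no nested spans. The `[[TAB]]` placeholder and both function names are hypothetical, chosen only for this example.

```python
import re

# Hypothetical marker; pick any token that cannot occur in real content.
TAB_PLACEHOLDER = "[[TAB]]"

# Assumed shape of Word's tab markup (no nested spans inside it):
#   <span style='mso-tab-count:2'>&nbsp;&nbsp;&nbsp; </span>
TAB_SPAN = re.compile(
    r"<span\b[^>]*\bmso-tab-count\s*:\s*(\d+)[^>]*>.*?</span>",
    re.IGNORECASE | re.DOTALL,
)


def mark_word_tabs(html):
    """Replace each Word tab span with one placeholder per counted tab."""
    return TAB_SPAN.sub(lambda m: TAB_PLACEHOLDER * int(m.group(1)), html)


def restore_tabs(text):
    """Swap placeholders for real tab characters in plain text."""
    return text.replace(TAB_PLACEHOLDER, "\t")
```

The placeholder survives HTML import because it is ordinary text. After the import, it would still have to be swapped for a real tab inside the generated document with whatever find-and-replace facility is available there; `restore_tabs` only shows that step on a plain string.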
I have linked your request to the appropriate known issue. We’ll notify you of any progress. Please feel free to share any feedback; it’s much appreciated.
Regards,
Is there any way I can send a tab (\t) to Aspose.Words by modifying Word’s “magic” HTML string?
I have also tried converting the document with Aspose.Words via the SaveType HTML function, but this has even worse formatting. It appears that the SaveType uses FreeTextBox as well, but the formatting comes out differently than when I pass the rich text directly to FTB and convert it to HTML.
Thanks for the prompt response.
There is no way to force a tabulation by outputting a ‘\t’ character directly to HTML. You can output it, but any number of tabs and spaces between words will be replaced by exactly one space when the document layout is built. Besides writing its own attributes, Microsoft Word simulates tabs with padding sequences. This is needed so that documents can be viewed in any browser that does not understand the special “magic” attributes. Each padding sequence consists of non-breaking spaces plus one ordinary space. The number of characters is calculated according to rules that are, in general, unknown. I tried simulating tabs the same way in the HTML export module, but this is a very complex task: we would have to calculate all tab positions in a paragraph and measure the width of all its parts between tabs.
Hopefully you can somehow configure the Free Text Box control or post-process its output to produce a similar layout but without tabs. This doesn’t cover all cases, but if you show me a sample of more than one line, I will try to come up with a programmatic workaround. Maybe a table-driven approach could be applied here, or you could try spans with the “display:inline-block” CSS style.
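To make the two behaviours above concrete, here is a small Python sketch. The first function imitates how ordinary HTML whitespace is collapsed, and the second finds padding sequences of the shape described (non-breaking spaces followed by one ordinary space). The function names are made up for this illustration, and the collapsing rule is a simplification of what a real layout engine does.

```python
import html
import re

# A padding sequence as described above: one or more non-breaking
# spaces (U+00A0) followed by one ordinary space.
PADDING = re.compile("\u00a0+ ")


def collapse_whitespace(text):
    """Collapse runs of tabs, spaces and line breaks into one space.

    Non-breaking spaces are left alone, which is why padding survives.
    """
    return re.sub(r"[ \t\r\n\f]+", " ", text)


def padding_lengths(fragment):
    """Return the character length of each padding sequence found."""
    text = html.unescape(fragment)  # turns &nbsp; into U+00A0
    return [len(m.group(0)) for m in PADDING.finditer(text)]
```

A literal tab between two words ends up as a single space after `collapse_whitespace`, while a fragment such as `Name:&nbsp;&nbsp;&nbsp; Value` keeps its padding. The lengths reported by `padding_lengths` could serve as a starting point for deciding where a tab or a table cell boundary was intended.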
Regards,
Previously I was using Word VBA to insert the HTML data into the bookmarks. When the HTML file is opened with Word and converted from HTML, the tabs are present in the document. The mso-tab-count is stripped out by Aspose.Words.
Here is the HTML segment we have for the tab
Hello!
Thank you for sharing your experience.
This workaround will work until we change the way we output tabs. So you should check the results next time you upgrade Aspose.Words.
Regards,