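I have to use POI.
I want to convert a Word document generated by Aspose to HTML using POI. However, when POI reads it, the TOC entry in the Aspose document ends with "\u0015\u0015", so the generated HTML gets only one end instead of two.
In a document saved by Microsoft Word, the entry ends with two runs: "\u0015" and "\u0015". What can I do?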
aspose:
8.18一 一一一一一一一一 一一一:一一一一一一一一一一(苹果) 1
HTML generated by POI from the Aspose document: <p class="p10"> <span> HYPERLINK \l "_Toc256000002" </span><span class="s4">8.18一 一一一一一一一一 一一一:一一一一一一一一一一(苹果)</span><a href="#_Toc256000002"><span> 1</span></a> </p>
end: image.png (12.2 KB)
word:
HTML generated by POI from the Word document: <p class="p6"> <a href="#_Toc65677669"><span class="s4">8.18</span><span class="s4">一 一一一一一一一一 一一一:一一一一一一一一一一(苹果)</span><span> </span></a><a href="#_Toc65677669"><span>1</span></a> </p>
the end is: image.png (5.6 KB) image.png (7.0 KB)
I need two ends, but the Aspose document has only one.
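To make the difference concrete, here is a minimal sketch in plain Java (no POI or Aspose calls) that splits a run's text so that every field mark stands alone, which is the layout described above for the document saved by Microsoft Word. The class name FieldMarkSplitter and the method splitAtFieldMarks are hypothetical names chosen for this illustration. The control characters 0x13, 0x14 and 0x15 are the field begin, separator and end marks of the Word binary format. How to feed the split pieces back into POI is not shown and would depend on the converter.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldMarkSplitter {
    // Control characters that delimit a field in the Word binary format.
    static final char FIELD_BEGIN = (char) 0x13;
    static final char FIELD_SEPARATOR = (char) 0x14;
    static final char FIELD_END = (char) 0x15;

    static boolean isFieldMark(char c) {
        return c == FIELD_BEGIN || c == FIELD_SEPARATOR || c == FIELD_END;
    }

    /** Splits one run's text so that every field mark becomes a piece of its own. */
    static List<String> splitAtFieldMarks(String runText) {
        List<String> pieces = new ArrayList<>();
        StringBuilder plain = new StringBuilder();
        for (char c : runText.toCharArray()) {
            if (isFieldMark(c)) {
                // Flush any ordinary text collected before this mark.
                if (plain.length() > 0) {
                    pieces.add(plain.toString());
                    plain.setLength(0);
                }
                pieces.add(String.valueOf(c));
            } else {
                plain.append(c);
            }
        }
        if (plain.length() > 0) {
            pieces.add(plain.toString());
        }
        return pieces;
    }
}
```

Calling FieldMarkSplitter.splitAtFieldMarks("\u0015\u0015") returns two pieces, each holding a single end mark, while ordinary text between marks is kept together.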
Your expected HTML file showing the desired output. You can create this document manually by using MS Word.
A standalone simple Java application (source code without compilation errors) that helps us to reproduce your current problem on our end and attach it here for testing. Please do not include Aspose.Words JAR files in it to reduce the file size.
As soon as you get these pieces of information ready, we will start investigation into your scenario/issue and provide you more information.
I generated the Word document with Aspose: aspose generate.zip (59.0 KB)
poigenerate/.html: the Word document converted to HTML by POI (changepoi2html)
openoffice.html: the Word document converted to HTML by OpenOffice 4.1.3
The Word document after updating the TOC in MS Office, which is my expected document: aspose generate and MS word update toc.zip (108.0 KB)
The POI tool: it converts the Aspose document to HTML with POI and removes the page numbers in the TOC.
The problem with POI is as described above.
The problem with OpenOffice is: how do I set the language of a Run?
The HTML from the Aspose document has one <span></span>, and the HTML from the MS Word document has two.
I need the Word document generated by Aspose to work well in both applications.
I want to convert the Aspose document to HTML and remove the page numbers.
I don’t know the difference between the two Word documents. What can I do?
Thank you very much!
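For the page-number removal mentioned above, one possible approach is to post-process the HTML DOM before it is serialized. The following is only a sketch using the standard org.w3c.dom API. The class name TocPageNumberStripper and the method stripPageNumberLinks are hypothetical, and the sketch assumes that a page number always appears as an <a href="#_Toc..."> link whose text consists of digits only, as in the HTML fragments quoted earlier in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class TocPageNumberStripper {
    /**
     * Removes every TOC link whose visible text is only a number.
     * Returns how many links were removed.
     */
    static int stripPageNumberLinks(Document html) {
        NodeList anchors = html.getElementsByTagName("a");
        // Collect first: the NodeList is live and changes while nodes are removed.
        List<Element> toRemove = new ArrayList<>();
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            String text = a.getTextContent().trim();
            // Assumption: a page-number link targets a _Toc bookmark and holds digits only.
            if (a.getAttribute("href").startsWith("#_Toc") && text.matches("\\d+")) {
                toRemove.add(a);
            }
        }
        for (Element a : toRemove) {
            a.getParentNode().removeChild(a);
        }
        return toRemove.size();
    }
}
```

This works on any org.w3c.dom.Document, so it could be applied to the document returned by the converter before it is written out. A TOC entry whose title itself is purely numeric would also be removed, so the digits-only check may need tightening for real documents.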
You shared a DOC file inside “aspose generate.zip” (see source.zip (42.6 KB)), which produced undesired output when I converted it to HTML with the following simple Java code using POI 5.0.0:
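import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.w3c.dom.Document;

// Load the binary .doc file with POI's HWPF module.
HWPFDocument wordDocument = new HWPFDocument(new FileInputStream("C:\\Temp\\226429\\in.doc"));
// Convert it into an in-memory HTML DOM.
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(
    DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument());
wordToHtmlConverter.processDocument(wordDocument);
Document htmlDocument = wordToHtmlConverter.getDocument();
// Serialize the DOM to an HTML file.
OutputStream outStream = new FileOutputStream("C:\\Temp\\226429\\poi out.html");
DOMSource domSource = new DOMSource(htmlDocument);
StreamResult streamResult = new StreamResult(outStream);
TransformerFactory factory = TransformerFactory.newInstance();
Transformer serializer = factory.newTransformer();
serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8"); // cmsConfig.getEncoding()
serializer.setOutputProperty(OutputKeys.INDENT, "yes");
serializer.setOutputProperty(OutputKeys.METHOD, "html");
serializer.transform(domSource, streamResult);
outStream.close();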
The problematic HTML produced by the above code is as follows: