Hi,
I am using Aspose PDF 17.8 to convert a PDF file to HTML.
Here is my test code:
String fileName = "Dropbox 新手指南.pdf";
Document pdf = new Document("custom/input/pdf/" + fileName);
new File("custom/output/pdf/" + fileName + "/").mkdirs();

for (int p = 1; p <= pdf.getPages().size(); p++) {
    // Copy the current page into its own single-page document
    Document pageDoc = new Document();
    pageDoc.getPages().add(pdf.getPages().get_Item(p));
    pageDoc.getPageInfo().setMargin(new MarginInfo(0, 0, 0, 0));

    HtmlSaveOptions htmlSaveOps = new HtmlSaveOptions();
    htmlSaveOps.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    htmlSaveOps.FontSavingMode = HtmlSaveOptions.FontSavingModes.AlwaysSaveAsWOFF;
    htmlSaveOps.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
    htmlSaveOps.LettersPositioningMethod = LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
    htmlSaveOps.setSplitIntoPages(false);
    htmlSaveOps.setPreventGlyphsGrouping(true);

    // Collect the generated markup in memory through a custom saving strategy
    final StringBuilder htmlBuffer = new StringBuilder();
    htmlSaveOps.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy() {
        @Override
        public void invoke(HtmlPageMarkupSavingInfo htmlSavingInfo) {
            try {
                htmlBuffer.append(IOUtils.toString(htmlSavingInfo.ContentStream, "UTF-8"));
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                IOUtils.closeQuietly(htmlSavingInfo.ContentStream);
            }
        }
    };

    String outHtmlFile = "SomeUnexistingFile.html";
    pageDoc.save(outHtmlFile, htmlSaveOps);

    // Write the collected markup to <page number>.html, closing the stream afterwards
    try (FileOutputStream out = new FileOutputStream("custom/output/pdf/" + fileName + "/" + p + ".html")) {
        IOUtils.write(htmlBuffer.toString().getBytes("UTF-8"), out);
    }
}
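In case it helps to rule out the buffering and writing step as the cause, that part does not depend on Aspose at all. Below is a minimal sketch of the same step using only the JDK. The names PageHtmlWriter, readUtf8 and writePage are hypothetical helpers made up for this illustration; they are not part of Aspose or Commons IO.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
final class PageHtmlWriter {

    // Reads the whole stream into memory, then decodes it as UTF-8 in one step.
    // The caller remains responsible for closing the stream.
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            bytes.write(chunk, 0, n);
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    // Writes the markup to <outputDir>/<pageNumber>.html as UTF-8 and returns the path.
    static Path writePage(Path outputDir, int pageNumber, String html) throws IOException {
        Files.createDirectories(outputDir);
        Path target = outputDir.resolve(pageNumber + ".html");
        Files.write(target, html.getBytes(StandardCharsets.UTF_8));
        return target;
    }
}
```

Decoding only after all bytes have been read means a multi-byte character (such as the Chinese text in this PDF) is never split across two reads, so if the characters are present in the written file, the problem is more likely on the rendering side.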
Issue:
The issue occurs only in Safari.
In the generated HTML, some characters are missing.
We suspect it is related to how Safari renders the generated CSS,
but we don’t know the exact cause.
Dropbox 新手指南.pdf (1.1 MB)
result-and-showing-css.zip (404.9 KB)
I have uploaded the PDF file and one page of the results.
Please check the attachments and look into this issue. Thank you,
Craig