Hi there
I am using Aspose PDF to convert a PDF file into HTML format, with font substitution.
Here is the code I used for testing:
Test Case:
@Test
public void asposeConvert() throws FileNotFoundException, IOException {
    // create font sub rule and
    TestFontSubRule subst = new TestFontSubRule();
    FontRepository.getSubstitutions().add(subst);

    String fileName = "10mincsiegraduate-160608073616.pdf";
    Document pdf = new Document("custom/input/pdf/" + fileName);
    File dir = new File("custom/output/pdf/" + fileName + "/");
    dir.mkdirs();

    HtmlSaveOptions htmlSaveOps = new HtmlSaveOptions();
    htmlSaveOps.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    htmlSaveOps.FontSavingMode = HtmlSaveOptions.FontSavingModes.AlwaysSaveAsWOFF;
    htmlSaveOps.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
    htmlSaveOps.LettersPositioningMethod = LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
    htmlSaveOps.setSplitIntoPages(false);
    htmlSaveOps.setPreventGlyphsGrouping(true);

    for (int p = 1; p <= pdf.getPages().size(); p++) {
        Document pageDoc = new Document();
        pageDoc.getPages().add(pdf.getPages().get_Item(p));

        final StringBuilder htmlBuffer = new StringBuilder();
        htmlSaveOps.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy() {
            @Override
            public void invoke(com.aspose.pdf.HtmlSaveOptions.HtmlPageMarkupSavingInfo htmlSavingInfo) {
                try {
                    htmlBuffer.append(IOUtils.toString(htmlSavingInfo.ContentStream, "utf8"));
                } catch (FileNotFoundException e) {
                } catch (IOException e) {
                } finally {
                    IOUtils.closeQuietly(htmlSavingInfo.ContentStream);
                }
            }
        };

        String outHtmlFile = "SomeUnexistingFile.html";
        pageDoc.save(outHtmlFile, htmlSaveOps);

        String html = htmlBuffer.toString();
        IOUtils.write(html.getBytes("utf8"),
                new FileOutputStream("custom/output/pdf/" + fileName + "/" + p + ".html"));
    }
}
TestFontSubRule class
public class TestFontSubRule extends CustomFontSubstitutionBase {
    public boolean trySubstitute(
            CustomFontSubstitutionBase.OriginalFontSpecification originalFontSpecification,
            /* out */ com.aspose.pdf.Font[] substitutionFont) {
        System.out.println(originalFontSpecification.getOriginalFontName());
        if (originalFontSpecification.getOriginalFontName().contains("DFKaiShu")) {
            substitutionFont[0] = FontRepository.findFont("HanWangMingLight");
            return true;
        } else {
            return false;
        }
    }
}
With one PDF file, the converted HTML contains some wrong characters.
I have uploaded the result, the PDF file, and a comparison image.
Please check the attachments and look into this issue. Thank you.
10mincsiegraduate-160608073616.pdf (207.0 KB)
comparison_page#2.JPG (50.0 KB)
(Result page files. Rename them to the form “*.zip.001” before unzipping them.)
10mincsiegraduate-160608073616.pdf.001.zip (3 MB)
10mincsiegraduate-160608073616.pdf.002.zip (3 MB)
10mincsiegraduate-160608073616.pdf.003.zip (3 MB)
10mincsiegraduate-160608073616.pdf.004.zip (3 MB)
10mincsiegraduate-160608073616.pdf.005.zip (532.6 KB)
Craig