Free Support Forum - aspose.com

Text overlaps in the output when saving PDF to HTML format

Hi there


We are currently testing saving PDF files to HTML format with Aspose.PDF 17.1.0.
Here is our code:

@Test
public void asposeConvert() throws Exception {

    String fileName = "sdjpstbst_p13.pdf";
    Document pdf = new Document("custom/input/pdf/" + fileName);

    HtmlSaveOptions htmlSaveOps = new HtmlSaveOptions();
    htmlSaveOps.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    htmlSaveOps.FontSavingMode = HtmlSaveOptions.FontSavingModes.AlwaysSaveAsWOFF;
    htmlSaveOps.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
    htmlSaveOps.LettersPositioningMethod = LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
    htmlSaveOps.setSplitIntoPages(false);

    // Output directory named after the source file; one HTML file per page goes here.
    File file = new File("custom/output/pdf/" + fileName);
    file.mkdirs();

    // Convert each page separately (page indexes are 1-based).
    for (int p = 1; p <= pdf.getPages().size(); p++) {
        Document pageDoc = new Document();
        pageDoc.getPages().add(pdf.getPages().get_Item(p));

        // Capture the generated markup in memory instead of letting it go to a file.
        final ByteArrayOutputStream stream = new ByteArrayOutputStream();
        htmlSaveOps.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy() {
            @Override
            public void invoke(com.aspose.pdf.HtmlSaveOptions.HtmlPageMarkupSavingInfo htmlSavingInfo) {
                try {
                    byte[] resultHtmlAsBytes = new byte[(int) htmlSavingInfo.ContentStream.available()];
                    htmlSavingInfo.ContentStream.read(resultHtmlAsBytes, 0, resultHtmlAsBytes.length);
                    stream.write(resultHtmlAsBytes);
                    stream.close();
                } catch (FileNotFoundException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        };

        // Placeholder name; the markup is collected by the custom strategy above.
        String outHtmlFile = "SomeUnexistingFile.html";
        pageDoc.save(outHtmlFile, htmlSaveOps);
        IOUtils.write(stream.toByteArray(),
                new FileOutputStream("custom/output/pdf/" + fileName + "/" + p + ".html"));
    }
}

In the resulting HTML, some characters are pushed together and overlap each other.
I have also uploaded the PDF file and the result.
Please check the attachment and help us solve this issue, thank you~


Craig

Hi Craig,


Thanks for your inquiry. Please use the PreventGlyphsGrouping property of HtmlSaveOptions to keep maximum precision during the conversion; it should resolve the issue. This property turns on a mode in which text glyphs are not grouped into words and strings.

htmlSaveOps.setPreventGlyphsGrouping(true);
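
To give a feel for why ungrouped glyphs can keep more precision, here is a toy model in plain Java. It is only a sketch and does not use or reproduce Aspose.PDF internals: the class name GlyphGroupingSketch, the coordinates, and the advance widths are all made-up values for illustration. It assumes that a grouped run is positioned only at its first glyph and then follows the renderer's own advance widths, while ungrouped glyphs are each placed at the x position recorded in the PDF.

```java
// Toy model only: hypothetical names and numbers, not Aspose.PDF code.
class GlyphGroupingSketch {

    // Grouped run: only the first glyph is pinned to startX; every later
    // glyph lands wherever the renderer's advance widths put it.
    static double[] groupedPositions(double startX, double[] rendererAdvances) {
        double[] xs = new double[rendererAdvances.length];
        double x = startX;
        for (int i = 0; i < xs.length; i++) {
            xs[i] = x;
            x += rendererAdvances[i];
        }
        return xs;
    }

    // Ungrouped glyphs: each glyph is placed at its own recorded x position.
    static double[] ungroupedPositions(double[] pdfXs) {
        return pdfXs.clone();
    }

    // Largest horizontal distance between rendered and intended positions.
    static double maxDrift(double[] actual, double[] expected) {
        double max = 0.0;
        for (int i = 0; i < actual.length; i++) {
            max = Math.max(max, Math.abs(actual[i] - expected[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        // Assumed glyph positions from the PDF (6.2 units apart).
        double[] pdfXs = {10.0, 16.2, 22.4, 28.6, 34.8};
        // Assumed advance widths the renderer uses for the same glyphs.
        double[] rendererAdvances = {6.0, 6.0, 6.0, 6.0, 6.0};

        double[] grouped = groupedPositions(pdfXs[0], rendererAdvances);
        double[] ungrouped = ungroupedPositions(pdfXs);

        System.out.printf("grouped drift:   %.2f%n", maxDrift(grouped, pdfXs));
        System.out.printf("ungrouped drift: %.2f%n", maxDrift(ungrouped, pdfXs));
    }
}
```

In this model the grouped run drifts by about 0.8 units by its last glyph, while the ungrouped glyphs do not drift at all. When the drift is large enough, neighbouring runs can collide, which is the kind of overlap described in the question. The likely trade-off is larger HTML output, since every glyph is positioned on its own.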

Please feel free to contact us if you need any further assistance.

Best Regards,