Text shifted in the result of saving a PDF file into HTML format

Hi there


I am using Aspose.PDF for Java 17.2.0 to convert PDF files to HTML.
Here is my test code:

String fileName = "Dropbox 新手指南.pdf";

Document pdf = new Document("custom/input/pdf/" + fileName);

HtmlSaveOptions htmlSaveOps = new HtmlSaveOptions();
htmlSaveOps.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
htmlSaveOps.FontSavingMode = HtmlSaveOptions.FontSavingModes.AlwaysSaveAsWOFF;
htmlSaveOps.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
htmlSaveOps.LettersPositioningMethod = LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
htmlSaveOps.setSplitIntoPages(false);

File f = new File("custom/output/pdf/" + fileName + "/");
f.mkdirs();

// Convert each page separately so that every page ends up in its own HTML file.
for (int p = 1; p <= pdf.getPages().size(); p++) {
    Document pageDoc = new Document();
    pageDoc.getPages().add(pdf.getPages().get_Item(p));

    // Capture the generated HTML in memory through a custom saving strategy.
    final ByteArrayOutputStream stream = new ByteArrayOutputStream();
    htmlSaveOps.CustomHtmlSavingStrategy = new HtmlSaveOptions.HtmlPageMarkupSavingStrategy() {
        @Override
        public void invoke(com.aspose.pdf.HtmlSaveOptions.HtmlPageMarkupSavingInfo htmlSavingInfo) {
            try {
                // toByteArray already drains the stream, so no second read is needed.
                byte[] resultHtmlAsBytes = IOUtils.toByteArray(htmlSavingInfo.ContentStream);
                stream.write(resultHtmlAsBytes);
            } catch (IOException e) {
                // Report the failure instead of silently swallowing it.
                e.printStackTrace();
            } finally {
                IOUtils.closeQuietly(htmlSavingInfo.ContentStream);
            }
        }
    };

    // Placeholder name; the markup itself is captured by the custom strategy above.
    String outHtmlFile = "SomeUnexistingFile.html";
    pageDoc.save(outHtmlFile, htmlSaveOps);

    // Write the captured HTML to <page number>.html and close the file afterwards.
    try (FileOutputStream out = new FileOutputStream("custom/output/pdf/" + fileName + "/" + p + ".html")) {
        IOUtils.write(stream.toByteArray(), out);
    }
}

For instance, in the result for page 2, some parts of the text appear to have disappeared, but they have actually been shifted far to the right.
Please check the PDF file and the result in the attachment, and analyze this issue.
The same problem most likely occurs on other pages of the result as well.
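
To help locate other affected pages, here is a rough sketch of a check that counts suspiciously large horizontal offsets in a generated HTML string. It is only an illustration: the class ShiftedTextFinder and its method are hypothetical names, and it assumes that the generated HTML positions text with inline CSS of the form "left:<number><unit>", which I have not verified for every output mode. The unit and threshold have to be chosen to match the actual output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.PDF.
public class ShiftedTextFinder {

    /**
     * Counts CSS "left" offsets, expressed in the given unit (for example "px" or "em"),
     * whose numeric value is greater than maxLeft.
     * Assumption: elements are positioned with declarations such as "left:123.4px".
     */
    public static int countFarRightOffsets(String html, String unit, double maxLeft) {
        // The lookbehind keeps properties such as "margin-left" from matching.
        Pattern leftOffset = Pattern.compile(
                "(?<![-\\w])left\\s*:\\s*(-?\\d+(?:\\.\\d+)?)" + Pattern.quote(unit));
        Matcher m = leftOffset.matcher(html);
        int count = 0;
        while (m.find()) {
            if (Double.parseDouble(m.group(1)) > maxLeft) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "<div style=\"left:40px\">ok</div>"
                + "<div style=\"left:5230.5px\">shifted</div>";
        // Reports how many offsets in the sample exceed 2000px.
        System.out.println(countFarRightOffsets(sample, "px", 2000.0));
    }
}
```

Running this over each generated page file (for example on the bytes written in the loop above) with a threshold somewhat larger than the page width should flag the pages worth inspecting by hand.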

Craig

Hello Craig,


Thanks for contacting support.

I have tested the scenario in our environment with the latest release, Aspose.Pdf for Java 17.4, and was able to reproduce the issue you mentioned. I have therefore logged it as PDFJAVA-36804 in our internal issue tracking system. We will look into the details and keep you posted on the status of its resolution. Please be patient and allow us some time.

We are sorry for the inconvenience.


Best Regards,

The issues you have found earlier (filed as PDFJAVA-36804) have been fixed in Aspose.PDF for Java 21.3.