Missing images in conversion of Word to HTML

Hi


I am busy evaluating Aspose. When I attempt to convert a DOC file to HTML using…

Document doc = new Document(is);
doc.save(file.getPath(),saveFormat);

the images are showing up as broken hyperlinks. How do I get the images to be encoded inline?

Much appreciated

Jamie

Hi Jamie,


Thanks for your inquiry. Could you please attach your input Word document and output HTML file here for testing? I will investigate the issue on my side and provide you with more information.

Secondly, you can use the HtmlSaveOptions.ExportImagesAsBase64 property to specify whether images are saved in Base64 format in the HTML output. When this property is set to true, image data is written directly into the img elements and no separate image files are created.
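To illustrate what "exported directly on the img elements" means, here is a minimal sketch that uses only the Java standard library. The class name InlineImageSketch and the helper toInlineImg are hypothetical names made up for this example, and the three-byte "image" is placeholder data. The commented lines show how the option would typically be switched on in your own code, assuming the Java API exposes the property as setExportImagesAsBase64 on HtmlSaveOptions; please check this against the API reference for your version.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class InlineImageSketch {

    // Hypothetical helper: builds an <img> element whose src is a Base64 data URI,
    // so the image travels inside the HTML instead of in a separate file.
    static String toInlineImg(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "<img src=\"data:" + mimeType + ";base64," + encoded + "\" />";
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for real image data.
        byte[] placeholder = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toInlineImg(placeholder, "image/png"));

        // With Aspose.Words for Java, the equivalent switch would look roughly like
        // this (assumed setter name, not executed here, requires the Aspose library):
        //   HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML);
        //   options.setExportImagesAsBase64(true);
        //   doc.save(file.getPath(), options);
    }
}
```

Because the image bytes are embedded in the markup, the resulting HTML file is self-contained, at the cost of a larger file size (Base64 adds roughly a third to the image data).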

Please let us know if this solves your problem.

Best regards,

Thanks. The HtmlSaveOptions.ExportImagesAsBase64 property was exactly what I needed. I have Doc->HTML, Excel->HTML working perfectly. I only need to find a way to get Aspose to convert PDF->HTML in Java. The docs do not correlate with the API.

Hi Jamie,

Thanks for your inquiry. As this problem is related to Aspose.Pdf, I am moving your thread to the Aspose.Pdf forum. My colleagues from the Aspose.Pdf team will answer you shortly.

Best regards,

Hi Jamie,


Thanks for contacting support.

Aspose.Pdf for Java does not support converting PDF files to HTML format. As a workaround, you may consider extracting the text from the PDF file and saving the output in HTML format. Please try the following code snippet:


[Java]

try
{
    // Bind the extractor to the source PDF
    PdfExtractor extractor = new PdfExtractor();
    extractor.bindPdf("c:/pdftest/Aspose_Test.pdf");

    // Restrict extraction to pages 1 through 2
    extractor.setStartPage(1);
    extractor.setEndPage(2);

    // Extract the text, then write it out as HTML
    extractor.extractText();
    extractor.extractTextAsHTML("c:/pdftest/Aspose_Test.html");

    extractor.close();
}
catch (Exception ex)
{
    System.out.println(ex);
}