I am currently evaluating Aspose.Words while looking for a way to convert HTML to Word and PDF that lets me use a single Word template file for both to provide headers, footers, etc., with Word serving as the intermediate representation when converting to PDF. The conversion from HTML to Word is working reasonably well (only some minor style issues), and the images are embedded in the resulting DOCX file. However, they do not show up in the PDF file (there is a red “X” placeholder instead of the image). The following code outlines the process I am using to generate Word and PDF output and to load images. Suggestions?
public static byte[] toWord(byte[] input, String baseURL, String format) throws Exception {
    Document doc = loadReport(input, baseURL, format);
    // Save to DOCX output stream
    ByteArrayOutputStream outputStr = new ByteArrayOutputStream();
    SaveOptions saveOptions = SaveOptions.createSaveOptions(SaveFormat.DOCX);
    saveOptions.setPrettyFormat(true); // makes output human readable
    doc.save(outputStr, saveOptions);
    return outputStr.toByteArray();
}
public static byte[] toPDF(byte[] input, String baseURL, String format) throws Exception {
    Document doc = loadReport(input, baseURL, format);
    // Save to PDF output stream
    ByteArrayOutputStream outputStr = new ByteArrayOutputStream();
    SaveOptions options = SaveOptions.createSaveOptions(SaveFormat.PDF);
    doc.save(outputStr, options);
    return outputStr.toByteArray();
}
public static Document loadReport(byte[] input, String baseURL, String format) throws Exception {
    LoadOptions loadOptions = new LoadOptions(LoadFormat.DOCX, "", baseURL);
    InputStream templateStr = loadOptions.getClass().getResourceAsStream("/resources/asposeReportTemplate.docx");
    Document template = new Document(templateStr, loadOptions);
    Document report = loadHTML(input, baseURL);
    // Append report to template
    template.appendDocument(report, ImportFormatMode.KEEP_SOURCE_FORMATTING);
    template.updateFields();
    return template;
}
// Load HTML to Document
public static Document loadHTML(byte[] input, final String baseURL) throws Exception {
    ByteArrayInputStream inputStr = new ByteArrayInputStream(input);
    LoadOptions options = new LoadOptions(LoadFormat.HTML, "", baseURL);
    // Image loader
    options.setResourceLoadingCallback(new IResourceLoadingCallback() {
        public int resourceLoading(ResourceLoadingArgs args) {
            if (args.getResourceType() == ResourceType.IMAGE) {
                String url = baseURL + args.getOriginalUri();
                byte[] imageData = readFromURL(url);
                if (imageData != null) {
                    args.setData(imageData);
                    return ResourceLoadingAction.USER_PROVIDED;
                } else {
                    return ResourceLoadingAction.SKIP;
                }
            } else {
                return ResourceLoadingAction.DEFAULT;
            }
        }
    });
    return new Document(inputStr, options);
}
public static byte[] readFromURL(String url) {
    try {
        URL u = new URL(url);
        HttpURLConnection conn = (HttpURLConnection) u.openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(30000);
        conn.setDoInput(true);
        byte[] data = new byte[4096];
        InputStream input = conn.getInputStream();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        // Copy the response body into memory in 4 KB chunks
        int count = input.read(data);
        while (count > -1) {
            bos.write(data, 0, count);
            count = input.read(data);
        }
        input.close();
        bos.close();
        return bos.toByteArray();
    } catch (Exception e) {
        // Any failure is silently ignored here; the caller only sees null
    }
    return null;
}
Hi Jeff,
Hi Tahir,
I attached the sample HTML (sample.zip) to the original post. Obviously, you will need to change the image URLs to something that works in your environment. The image files are in the DOCX.
Jeff
Hi Jeff,
How could the base URL be an issue? The images are being loaded via a callback. Please look at the code again. I load the document the same way whether creating a DOCX or a PDF. When saving as DOCX, the images are there, and they are embedded in the DOCX. When saving as PDF, the images are not there. The images loaded via the callback should also be embedded in the PDF. Is this an evaluation version issue? Also note that my evaluation of this product includes evaluation of how good the support is, and so far I am not impressed.
Addendum: Problem solved. My image loading callback was not succeeding, yet the images were appearing in the DOCX document anyway. Weird. Even though the images were appearing in the DOCX, they were not exported to the PDF. Also weird. With the image loading callback fixed, images continue to appear in the DOCX, and now also appear in the PDF.
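A failing callback like this is easy to miss because readFromURL in the original post catches every exception and returns null without reporting anything. The following is only a minimal sketch, using plain JDK classes, of a variant that still returns null on failure but prints the reason. The class name ImageFetch and the log messages are hypothetical and not part of Aspose.Words; it also accepts non-HTTP URLs, which the original does not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class ImageFetch {
    // Hypothetical helper: returns the bytes behind a URL, or null after logging why it failed.
    public static byte[] readFromURL(String url) {
        try {
            URLConnection conn = new URL(url).openConnection();
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(30000);
            // For HTTP(S), treat anything other than 200 as a failure and say so
            if (conn instanceof HttpURLConnection) {
                int status = ((HttpURLConnection) conn).getResponseCode();
                if (status != HttpURLConnection.HTTP_OK) {
                    System.err.println("Image fetch failed: HTTP " + status + " for " + url);
                    return null;
                }
            }
            // try-with-resources closes both streams even if the read fails
            try (InputStream in = conn.getInputStream();
                 ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    bos.write(buf, 0, n);
                }
                return bos.toByteArray();
            }
        } catch (IOException e) {
            // Covers malformed URLs, timeouts, and missing resources
            System.err.println("Image fetch failed for " + url + ": " + e);
            return null;
        }
    }
}
```

With something like this in place, a wrong image URL shows up on stderr the first time the callback runs, rather than only as a red “X” in the PDF.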
Hi Jeff,
<img height="72" src="/images/aspose-logo.gif" alt="" width="72"></img><o:p></o:p>
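The src in this tag is root-relative. The callback in the original post builds the image URL as baseURL + args.getOriginalUri(), and plain concatenation only gives a valid address for some combinations of base URL and src. The thread does not say what the actual fix was, so this is only one possible cause. A minimal sketch of resolving the reference with java.net.URI instead, where the class name ImageUrlResolver and the example.com base URL are assumptions made for illustration:

```java
import java.net.URI;

public class ImageUrlResolver {
    // Resolves an image reference (absolute, root-relative, or relative)
    // against the base URL using standard URI resolution rules.
    public static String resolve(String baseURL, String originalUri) {
        return URI.create(baseURL).resolve(originalUri).toString();
    }
}
```

For a base of http://example.com/reports/, the reference /images/aspose-logo.gif resolves to http://example.com/images/aspose-logo.gif, whereas concatenation would yield http://example.com/reports//images/aspose-logo.gif. An already absolute src is returned unchanged.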