Thanks for that. My JUnit test converts the email to MHTML and then passes the stream to a new Aspose.Words Document. Before saving to PDF I iterate through the shapes (as per your suggestion) and scale any that are too big for the page. The logging looks correct, but the resulting PDF is unchanged: it is the same as the output I got without any of the shrink-to-fit code. Perhaps there is something else I have to do, or I am missing something.
convert() is called by my JUnit test with the mhtml InputStream and an outputType of “pdf”, which is converted into SaveFormat.PDF by translateToSaveFormat().
//--------------------------------------------------------------------------
/**
 * Converts an office document to another type of document.
 * @param is The office document.
 * @param outputType The type to convert to: doc, docx, htm, html, odt, pdf, tif, tiff, txt.
 * @return The converted document.
 * @throws Exception
 */
public static byte[] convert(InputStream is, String outputType) throws Exception {
    HtmlLoadOptions htmlLoadOpts = new HtmlLoadOptions();
    // try to prevent very long wait times when it can't load embedded image links.
    htmlLoadOpts.setWebRequestTimeout(5000); // milliseconds, default is 100 seconds
    log.info("output type is {}", outputType);
    Document doc = new Document(is, htmlLoadOpts);
    int pageCount = doc.getPageCount();
    log.info("Page count is {}", pageCount);
    shrinkImageToFit(doc);
    int saveFormat = translateToSaveFormat(outputType);
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    doc.save(baos, saveFormat);
    byte[] bytes = baos.toByteArray();
    log.info("Byte count for {} document is {}", outputType, bytes.length);
    return bytes;
}
//--------------------------------------------------------------------------
private static void shrinkImageToFit(Document doc) {
    // Usable area of the page, taken from the first section only.
    PageSetup ps = doc.getFirstSection().getPageSetup();
    double contentWidth = ps.getPageWidth() - (ps.getLeftMargin() + ps.getRightMargin());
    double contentHeight = ps.getPageHeight() - (ps.getTopMargin() + ps.getBottomMargin());
    log.info("Page height is {}, width is {}, content height is {}, width is {}",
            ps.getPageHeight(), ps.getPageWidth(), contentHeight, contentWidth);
    NodeCollection<Shape> shapes = doc.getChildNodes(NodeType.SHAPE, true);
    for (Shape shape : (Iterable<Shape>) shapes) {
        double imageHeight = shape.getHeight();
        double imageWidth = shape.getWidth();
        log.info("Image found, height is {}, width is {}, rotation is {}",
                imageHeight, imageWidth, shape.getRotation());
        double vScale = 1;
        if (imageHeight > contentHeight) {
            vScale = contentHeight / imageHeight;
        }
        double hScale = 1;
        if (imageWidth > contentWidth) {
            hScale = contentWidth / imageWidth;
        }
        // Use the smaller factor so the aspect ratio is preserved.
        double scale = Math.min(hScale, vScale);
        double imageHeight2 = imageHeight * scale;
        double imageWidth2 = imageWidth * scale;
        log.info("hScale is {}, vScale is {}, scale is {}, new height is {}, width is {}",
                hScale, vScale, scale, imageHeight2, imageWidth2);
        try {
            shape.setHeight(imageHeight2);
            shape.setWidth(imageWidth2);
            //shape.setHeight(50);
            //shape.setWidth(50);
            log.info("New height {} and width {} set", imageHeight2, imageWidth2);
        }
        catch (Exception e) {
            log.warn("Unable to scale image to scale {}", scale);
            log.warn("Unable to scale image to scale", e);
        }
    }
}
The logging output is:
Nov 11, 2020 2:35:33 PM com.mycomp.asposeutils.WordsConversion shrinkImageToFit
INFO: Page height is 792.0, width is 612.0, content height is 648.0, width is 468.0
Nov 11, 2020 2:35:33 PM com.mycomp.asposeutils.WordsConversion shrinkImageToFit
INFO: Image found, height is 1188.0000000000002, width is 1584.0, rotation is 0.0
Nov 11, 2020 2:35:33 PM com.mycomp.asposeutils.WordsConversion shrinkImageToFit
INFO: hScale is 0.29545454545454547, vScale is 0.5454545454545453, scale is 0.29545454545454547, new height is 351.00000000000006, width is 468.0
Nov 11, 2020 2:35:33 PM com.mycomp.asposeutils.WordsConversion shrinkImageToFit
INFO: New height 351.00000000000006 and width 468.0 set
Nov 11, 2020 2:35:33 PM com.mycomp.asposeutils.WordsConversion convert
INFO: Byte count for pdf document is 2058676
I also tried the (commented-out) hard-coded value of 50 for both height and width, but this made no difference either. I am using Aspose.Words 20.10.
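To rule out the arithmetic itself, here is a minimal standalone sketch of the same scale-to-fit calculation with no Aspose.Words classes involved. The class name FitScale and the methods scaleToFit and scaledSize are hypothetical names made up for this sketch; only the formula is taken from shrinkImageToFit() above.

```java
// Standalone check of the scale-to-fit arithmetic; no Aspose.Words classes involved.
// FitScale, scaleToFit and scaledSize are hypothetical names used only for this sketch.
final class FitScale {

    private FitScale() {
    }

    /** Returns the uniform scale (never above 1) that fits the image inside the content area. */
    static double scaleToFit(double imageWidth, double imageHeight,
                             double contentWidth, double contentHeight) {
        double hScale = imageWidth > contentWidth ? contentWidth / imageWidth : 1.0;
        double vScale = imageHeight > contentHeight ? contentHeight / imageHeight : 1.0;
        // The smaller factor satisfies both constraints and keeps the aspect ratio.
        return Math.min(hScale, vScale);
    }

    /** Returns {width, height} after applying the uniform scale. */
    static double[] scaledSize(double imageWidth, double imageHeight,
                               double contentWidth, double contentHeight) {
        double scale = scaleToFit(imageWidth, imageHeight, contentWidth, contentHeight);
        return new double[] { imageWidth * scale, imageHeight * scale };
    }
}
```

Calling FitScale.scaledSize(1584.0, 1188.0, 468.0, 648.0) with the image and content dimensions from the log gives a width of 468 and a height of about 351, which matches the logged values. So the calculation appears to be correct, and the problem seems to lie in how the new size is applied or rendered.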