How to render Page to BufferedImage


#1

I’m trying out Aspose.PDF for Java as an alternative to PDFBox, which could not handle some of our documents. I need to render single pages as images, specifically as BufferedImage instances, because another API processes the images I generate from PDF files. Basically, I need to input a PDF file and get a list of scaled buffered images for further processing.
I can do something like this:

var out = new ByteArrayOutputStream(buff);
final PngDevice device = new PngDevice(new Resolution(dpi));
device.process(page, out);

Then I can use ImageIO.read to get a BufferedImage.
But can’t I just render a page directly as a BufferedImage or to a Graphics2D from an image I create?
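
For reference, the ImageIO step of that workaround can be sketched with JDK classes only. The class name PngRoundTrip is hypothetical, and the byte array is assumed to hold whatever the PngDevice wrote into the ByteArrayOutputStream.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Aspose.PDF or PDFBox.
final class PngRoundTrip {
    // Decodes encoded image bytes (for example the PNG produced by the
    // device) into a BufferedImage.
    static BufferedImage decode(byte[] encoded) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(encoded));
            if (image == null) {
                // ImageIO.read returns null when no registered reader accepts the data.
                throw new IllegalArgumentException("No ImageIO reader for the given bytes");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the snippet above this would be called as PngRoundTrip.decode(out.toByteArray()). It works, but it costs a full PNG encode and decode per page, which is why a direct render would be preferable.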

PDFBox can do it like this:

PDFRenderer renderer = new PDFRenderer(doc);
renderer.renderPageToGraphics(index, graphics2d);

This also makes scaling easy as I can use the scale method of Graphics2D.
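
Until there is a direct way, the same Graphics2D scaling can be applied to an already decoded image. This is a minimal sketch using only JDK classes; ImageScaler and its scale method are hypothetical names, and bilinear interpolation is just one reasonable choice.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of Aspose.PDF or PDFBox.
final class ImageScaler {
    // Returns a copy of the source image scaled by the given factor.
    static BufferedImage scale(BufferedImage source, double factor) {
        int w = Math.max(1, (int) Math.round(source.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(source.getHeight() * factor));
        // The target has no alpha channel, so it can later be written as JPEG.
        BufferedImage target = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            // Same idea as scaling the Graphics2D before PDFBox renders into it.
            g.scale(factor, factor);
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return target;
    }
}
```

Scaling a raster after the fact loses quality compared to rendering at the target resolution, so for downscaling it is usually better to pick the dpi first and only fine-tune here.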

Am I missing something? Maybe there’s a “Device” for that, but I only see those for PNG, JPG etc.

Another problem with using PngDevice is that when I then load the result as a com.aspose.pdf.Image, I can’t get the dimensions: getBufferedImage returns null. Why is that? If I could get the dimensions, I could use them for the opposite direction, since we sometimes need to create a PDF file from images.

Claude


#2

@vegan

Thank you for contacting support.

We are afraid such an overload of the process method may not be available. Therefore, a ticket with ID PDFJAVA-38865 has been logged in our issue management system for further investigation. Moreover, please share a narrowed-down code snippet that loads the image and gets null returned, so that we can look into it.


#3

We mostly convert PDF to images. But we also archive documents for a system that only accepts PDF. When we split a PDF into separate pages, we sometimes need to rescale pages that are way too large. Some people take a picture with their SLR camera, and we can’t process a page that is 30MB. So we want to render that page, scale and compress the image, put it onto a new page (A4) and save it.
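
The scale-and-compress part can be sketched with JDK classes only. A4Fit and its methods are hypothetical names; the A4 size of roughly 595 x 842 points and the idea of never upscaling are assumptions for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

// Hypothetical helper, not part of Aspose.PDF or PDFBox.
final class A4Fit {
    // A4 in PDF points (1/72 inch), rounded.
    static final double A4_WIDTH_PT = 595.0;
    static final double A4_HEIGHT_PT = 842.0;

    // Largest factor (at most 1.0) that makes width x height fit into the box
    // while keeping the aspect ratio.
    static double fitScale(double width, double height, double maxWidth, double maxHeight) {
        return Math.min(1.0, Math.min(maxWidth / width, maxHeight / height));
    }

    // Encodes an RGB image as JPEG with an explicit quality between 0.0 and 1.0.
    static byte[] toJpeg(BufferedImage image, float quality) {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }
}
```

For a page of 1190 x 1684 points, fitScale(1190, 1684, A4_WIDTH_PT, A4_HEIGHT_PT) gives 0.5. The quality value trades file size against artifacts and would need tuning against the size limit of the archive system.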

So we need to convert images (i.e. JPG / rendered pages) to PDF documents with a single page, which contains nothing but that image. We need to scale and compress the image because only A4 pages are accepted and the file size can’t be too large.
Here’s the code I tried that gives me a null reference for getBufferedImage:

import com.aspose.pdf.*;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
// ...
public byte[] JPG2PDF(final byte[] data) {
    final Image jpg = new com.aspose.pdf.Image();
    ByteArrayInputStream inputStream = new ByteArrayInputStream(data);
    jpg.setImageStream(inputStream);
    final BufferedImage bufferedImage = jpg.getBufferedImage();
    final int width = bufferedImage.getWidth(); // NPE!
    // ...

Image is probably just a proxy to whatever is actually put into the document. Maybe getBufferedImage would return something once it’s actually used. I haven’t put much effort into researching this. There’s probably a better way to convert a JPG image file to a PDF document.
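
As a workaround for the dimensions, they can be read from the raw bytes with ImageIO, independent of com.aspose.pdf.Image. This is a minimal sketch using only JDK classes; ImageSize is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

// Hypothetical helper, not part of Aspose.PDF or PDFBox.
final class ImageSize {
    // Returns {width, height} in pixels using the reader that matches the data,
    // without decoding the pixel data into a BufferedImage.
    static int[] of(byte[] data) {
        try (ImageInputStream in = ImageIO.createImageInputStream(new ByteArrayInputStream(data))) {
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IllegalArgumentException("Unsupported image format");
            }
            ImageReader reader = readers.next();
            try {
                // seekForwardOnly = true, ignoreMetadata = true: only the size is needed.
                reader.setInput(in, true, true);
                return new int[] { reader.getWidth(0), reader.getHeight(0) };
            } finally {
                reader.dispose();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In JPG2PDF this would replace the getBufferedImage call, for example int[] size = ImageSize.of(data); the same byte array can still be passed to setImageStream afterwards through a fresh ByteArrayInputStream.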

Another difference from PDFBox is that DocumentInfo doesn’t have setProducer and setCreator methods, which would be useful to set the metadata of generated documents. Maybe setCreator is filled automatically, which would be fine. And we could just use setAuthor instead of setProducer.


#4

@vegan

Would you please try the code snippet below and then share your feedback with us?

java.awt.image.BufferedImage readImage = null;
try {
    readImage = ImageIO.read(new File(dataDir + "Fig_1.jpeg"));
    int h = readImage.getHeight();
    int w = readImage.getWidth();

    Document doc = new Document();
    Page page = doc.getPages().add();
    com.aspose.pdf.Image image = new com.aspose.pdf.Image();
    image.setFile(dataDir + "Fig_1.jpeg");
    //page.getPageInfo().setHeight(h);
    //page.getPageInfo().setWidth(w);
    page.getPageInfo().setHeight(com.aspose.pdf.PageSize.getA4().getHeight());
    page.getPageInfo().setWidth(com.aspose.pdf.PageSize.getA4().getWidth());
    page.getPageInfo().getMargin().setBottom(0);
    page.getPageInfo().getMargin().setTop(0);
    page.getPageInfo().getMargin().setRight(0);
    page.getPageInfo().getMargin().setLeft(0);
    page.getParagraphs().add(image);
    doc.save(dataDir + "ImagetoPDF.pdf");
} catch (Exception e) {
    readImage = null;
}

Moreover, we are afraid you may not be able to set Creator and Producer fields. For other metadata, please visit PDF File Metadata for your kind reference.


#5

@vegan

We would also suggest the approach below. Please feel free to contact us if you need any further assistance.

...
final PngDevice device = new PngDevice(new Resolution(dpi));
BufferedImage bfImage = device.processToBufferedImage(page);
...