How to get the size of image files embedded in a PDF file

How can we (programmatically, in Java) get the size of each image file embedded in a PDF file without actually extracting the whole image to memory/disk? We want to check in case it's too large to handle.


Many thanks,

Joe

Hi Joe,


Thanks for contacting support.

To accomplish your requirement, please try the following code snippet.

[Java]

// Load the source PDF document
com.aspose.pdf.Document doc = new com.aspose.pdf.Document("c:\\input.pdf");
com.aspose.pdf.ImagePlacementAbsorber abs = new com.aspose.pdf.ImagePlacementAbsorber();

// Collect the image placements from the first page
doc.getPages().get_Item(1).accept(abs);

for (int counter = 1; counter <= abs.getImagePlacements().size(); counter++)
{
    com.aspose.pdf.ImagePlacement imagePlacement = abs.getImagePlacements().get_Item(counter);
    System.out.println("Image # = " + counter);

    // Get image properties
    System.out.println("image width:" + imagePlacement.getRectangle().getWidth());
    System.out.println("image height:" + imagePlacement.getRectangle().getHeight());
    System.out.println("image LLX:" + imagePlacement.getRectangle().getLLX());
    System.out.println("image LLY:" + imagePlacement.getRectangle().getLLY());
    System.out.println("image horizontal resolution:" + imagePlacement.getResolution().getX());
    System.out.println("image vertical resolution:" + imagePlacement.getResolution().getY());
    System.out.println("==================================================");
}

Thanks for the help! However, I'm not sure how to use these numbers to calculate the image size.

I called PdfExtractor.getNextImage to get the sizes of the first three images (using the default format, .jpg), as below:

6,541 bytes

31,239 bytes

49,386 bytes

And I used your code to get the corresponding image metadata:

analysing page1 image1

image width:612.0

image height:131.0399932861328

image LLX:0.0

image LLY:660.9599609375

image horizontal resolution:150

image vertical resolution:150

analysing page1 image2

image width:612.0

image height:131.0399932861328

image LLX:0.0

image LLY:529.9199829101562

image horizontal resolution:150

image vertical resolution:150

analysing page1 image3

image width:612.0

image height:131.0399932861328

image LLX:0.0

image LLY:398.8800048828125

image horizontal resolution:150

image vertical resolution:150

The numbers don't appear to add up. How could I use these numbers to calculate the image size?


Thanks,

Joe

Hi Joe,


Thanks for contacting support.

By image size, do you mean the image dimensions or the image file size? Please note that the code snippet shared earlier returns the image dimensions inside the PDF file. The height and width properties use points as the basic unit, where 1 inch = 72 points and 1 cm = 1/2.54 inch = 0.3937 inch = 28.3 points. Furthermore, the conversion from points to pixels depends on the image's DPI (dots per inch) property. For example, if an image's DPI is 96 (96 pixels per inch) and it is 100 points high, its height in pixels is (100 / 72) * 96 = 133.3. The general formula is: pixels = (points / 72) * DPI.
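As a rough illustration of the formula above, here is a minimal sketch in plain Java. It makes no Aspose calls; the class name PointsToPixels and the method toPixels are hypothetical names made up for this example, and the sample inputs are the width, height, and resolution values reported earlier in this thread.

```java
// Minimal sketch of: pixels = (points / 72) * DPI.
// PointsToPixels and toPixels are hypothetical names, not part of any library.
class PointsToPixels {

    // PDF dimensions are expressed in points, with 72 points per inch.
    static final double POINTS_PER_INCH = 72.0;

    // Convert a length in points to pixels at the given resolution (DPI).
    static double toPixels(double points, double dpi) {
        return points / POINTS_PER_INCH * dpi;
    }

    public static void main(String[] args) {
        // Example from the reply above: 100 points at 96 DPI.
        System.out.println("100 pt @ 96 DPI  = " + toPixels(100.0, 96.0) + " px");

        // Values reported for the images on page 1 (150 DPI).
        double widthPx = toPixels(612.0, 150.0);
        double heightPx = toPixels(131.0399932861328, 150.0);
        System.out.println("width  = " + widthPx + " px");
        System.out.println("height = " + Math.round(heightPx) + " px (rounded)");
    }
}
```

With the values from this thread, 612 points at 150 DPI comes to 1275 pixels, and 131.04 points comes to about 273 pixels. Note that this only gives pixel dimensions; it says nothing about the compressed file size of the embedded image.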

In case you still face any issue, please share the resource file, so that we can test the scenario in our environment.

Hi Nayyer,


Thanks for the reply! Yet, I meant the image file size…

Regards,
Joe

Hi Joe,


Thanks for sharing the details.

I am afraid Aspose.Pdf for Java does not currently support getting the size of images inside a PDF file. However, we have logged this requirement as PDFNEWJAVA-35301 in our New Features list. We will look further into the details and keep you updated on its status. Please be patient and allow us a little time. We are sorry for the inconvenience.

Thanks very much for the confirmation! Do you know roughly when this feature would be available? Or is there a workaround for now to work out the image file size based upon the dimensions?


Regards,
Joe

Hi Joe,


Thanks for your inquiry. I am afraid your issue is not yet resolved. It was logged only recently and is still pending investigation in the queue behind issues reported earlier. We cannot share an ETA until the investigation is complete.

As for a workaround, once the initial investigation is complete we will let you know if we can suggest one. We will notify you as soon as we make significant progress towards resolving the issue.

We are sorry for the inconvenience.

Best Regards,