Resolution of images extracted from PDF files

Shiv · February 27, 2006, 6:44am

Is there a way to extract images from PDF files such that each image keeps its original resolution?

Currently all extracted images are 96 dpi.

Shiv · February 27, 2006, 2:33pm

Is there someone that can answer my questions please?

GeorgieYuan · February 27, 2006, 5:21pm

Dear Shiv,

We can only provide this mode for now.

Thank you.

forever · February 27, 2006, 6:54pm

I am not sure if I am correct, but in a PDF document the resolution of an image is relative to the size it is displayed at. For example, if I scale a 96 dpi image to double its original size, it becomes 48 dpi in the PDF. How do you know the image resolution in the PDF document?
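To make the arithmetic in this post concrete: an image in a PDF carries pixel dimensions, and its resolution on the page follows from the size it is painted at, with default PDF user space being 72 units per inch. The following is a minimal sketch, assuming the placed size is already known in points. The class and method names are hypothetical and are not part of any Aspose or Adobe API.

```csharp
using System;

public static class EffectiveResolution
{
    // Default PDF user space is 72 units (points) per inch.
    private const double PointsPerInch = 72.0;

    // Effective dpi along one axis = pixel count / placed size in inches.
    public static double Dpi(int pixels, double placedPoints)
    {
        if (placedPoints <= 0)
            throw new ArgumentOutOfRangeException(nameof(placedPoints));
        return pixels / (placedPoints / PointsPerInch);
    }
}
```

With these definitions, `EffectiveResolution.Dpi(960, 720)` gives 96 (960 pixels across 10 inches), and doubling the placed width to 1440 points gives 48. The pixel data itself does not change, which is why an extractor can in principle return the image exactly as it was embedded.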

Shiv · February 27, 2006, 11:14pm

I'm not sure that's how it works :) Dpi is not a dimension. An image of a given height and width can have different dpi values. Further, if one tried to increase the dpi of an image it would look quite crappy.

This is what I am doing:

I create a PDF document using Acrobat Professional. This document is simply made up of one image, which is a 300 dpi image.

Using Acrobat Reader, I can then export all images. When I do this, I get back the original image, and it is a 300 dpi image.

When I attempt to do the same using Aspose PDF kit, I get a 96 dpi image. I've tested other software (PDF specific software) that does this as well.

I hope this explains the problem and requirement better?

Shiv.
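For readers wondering where a figure like "300 dpi" can live inside an embedded JPEG: in a JFIF file the density is a metadata field in the APP0 header, independent of the pixel dimensions. The sketch below reads that field. It assumes the JPEG starts with a JFIF APP0 segment directly after the start-of-image marker; files that record resolution only in Exif are not handled. The class and method names are hypothetical.

```csharp
using System;

public static class JfifDensity
{
    // Header layout assumed here:
    // FF D8 | FF E0 | length(2) | 'J' 'F' 'I' 'F' 00 | version(2)
    //       | units(1) | Xdensity(2) | Ydensity(2)
    // units: 0 = aspect ratio only, 1 = dots per inch, 2 = dots per cm.
    public static bool TryReadDpi(byte[] jpeg, out double xDpi, out double yDpi)
    {
        xDpi = 0;
        yDpi = 0;
        if (jpeg == null || jpeg.Length < 18) return false;

        // Start-of-image marker followed by an APP0 marker.
        if (jpeg[0] != 0xFF || jpeg[1] != 0xD8 || jpeg[2] != 0xFF || jpeg[3] != 0xE0)
            return false;

        // "JFIF" identifier, zero-terminated.
        if (jpeg[6] != 'J' || jpeg[7] != 'F' || jpeg[8] != 'I' || jpeg[9] != 'F' || jpeg[10] != 0)
            return false;

        int units = jpeg[13];
        int x = (jpeg[14] << 8) | jpeg[15]; // big-endian 16-bit value
        int y = (jpeg[16] << 8) | jpeg[17];

        switch (units)
        {
            case 1: // already dots per inch
                xDpi = x;
                yDpi = y;
                return true;
            case 2: // dots per centimetre, converted to dots per inch
                xDpi = x * 2.54;
                yDpi = y * 2.54;
                return true;
            default: // aspect ratio only: no absolute resolution recorded
                return false;
        }
    }
}
```

If an extractor decodes the image and saves it again without carrying this field over, the saved file can end up with a default such as 96 dpi even though the pixels are unchanged. Whether that is what happens in the product discussed here is not established in this thread.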

Shiv · February 27, 2006, 11:17pm

One more thing. When I export images using Acrobat Reader, it knows the image format of the images as well. So if a PDF document were comprised of 1 JPEG, 2 PNG and 1 TIFF image and I were to export all images, I'd get the images in their original image formats.

forever · February 28, 2006, 12:19am

I checked the display settings of Acrobat (at "Edit -> Preferences -> Page Display") and found that the system setting is 96 dpi. Maybe it has something to do with this?

forever · February 28, 2006, 12:22am

BTW, our developers are now working to improve the image extraction function. We will look further into the resolution problem too.

Shiv · February 28, 2006, 10:24pm

Tommy, that's good news. I'd love to be able to use your product. Unfortunately, unless we can extract images at the same resolution they had when originally embedded in the PDF, we can't use your product for what we need. We do like the other capabilities that PDF kit can give us, no doubt.

Would it be possible for you to let me know when this is done? Or if you need me to test it out for you I'd be glad to help.

Shiv.

forever · March 1, 2006, 12:22am

Dear Shiv,

Sorry, I can't tell you when this will be available, since we are still studying this issue and are not yet sure how to solve it.

Shiv · March 2, 2006, 11:44pm

Tommy,

Here is a little code snippet that I used to extract the image from the PDF file I sent you earlier. It works as expected. I get the original dpi as well as the ImageFormat.

using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

BinaryReader reader = new BinaryReader(File.Open(@"C:\Documents and Settings\Shiv Kumar\My Documents\ScannedPage.pdf", FileMode.Open, FileAccess.Read));

// Skip the 1375 bytes that precede the image data in this particular file.
int i = 0;
while (i != 1375)
{
    reader.ReadByte();
    i++;
}

// Read the 738805 bytes of embedded image data (also specific to this file).
byte[] bytes = reader.ReadBytes(738805);
reader.Close();

MemoryStream stream = new MemoryStream(bytes);
try
{
    // The stream must stay open for as long as the Image is in use.
    Image img = Image.FromStream(stream);
    try
    {
        // Pick a file extension that matches the detected format.
        ImageFormat format = img.RawFormat;
        string fileExt = "";
        if (format.Equals(ImageFormat.Bmp))
            fileExt = "bmp";
        else if (format.Equals(ImageFormat.Jpeg))
            fileExt = "jpg";
        else if (format.Equals(ImageFormat.Png))
            fileExt = "png";
        else if (format.Equals(ImageFormat.Gif))
            fileExt = "gif";

        img.Save(@"C:\test." + fileExt, format);
    }
    finally
    {
        img.Dispose();
    }
}
finally
{
    stream.Close();
}
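The offsets 1375 and 738805 in the snippet above apply only to that one file. As a rough illustration of how the same bytes could be located without hard-coding them, the sketch below scans a buffer for the JPEG start-of-image marker (FF D8 FF) and the next end-of-image marker (FF D9). This is a naive marker scan, not a PDF parser: it assumes the image is stored as an unmodified JPEG stream, and it can cut a JPEG short if the file embeds a thumbnail that has its own end marker. Reading the stream dictionary's /Length and /Filter entries would be the more reliable route. All names here are hypothetical.

```csharp
using System;

public static class JpegScan
{
    // Index of the first FF D8 FF sequence at or after 'from', or -1 if none.
    public static int IndexOfStart(byte[] data, int from = 0)
    {
        for (int i = Math.Max(from, 0); i + 2 < data.Length; i++)
        {
            if (data[i] == 0xFF && data[i + 1] == 0xD8 && data[i + 2] == 0xFF)
                return i;
        }
        return -1;
    }

    // Index one past the first FF D9 found after 'start', or -1 if none.
    public static int IndexAfterEnd(byte[] data, int start)
    {
        for (int i = start + 2; i + 1 < data.Length; i++)
        {
            if (data[i] == 0xFF && data[i + 1] == 0xD9)
                return i + 2;
        }
        return -1;
    }

    // Copies the first start-to-end span out of the buffer, or returns null.
    public static byte[] ExtractFirst(byte[] data)
    {
        int start = IndexOfStart(data);
        if (start < 0) return null;

        int end = IndexAfterEnd(data, start);
        if (end < 0) return null;

        byte[] result = new byte[end - start];
        Array.Copy(data, start, result, 0, result.Length);
        return result;
    }
}
```

The returned array could be wrapped in a MemoryStream and passed to Image.FromStream in place of the hard-coded read in the snippet above.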

forever · March 3, 2006, 2:26am

Thanks for your code. It may be a great help to our developers.