Issue extracting images from a PDF


We are trying to extract images from a PDF. Some images in the document (Input.pdf) are composed of multiple smaller images. In that case we expect the composite to be extracted as a single image, but we get the individual images instead.

How can we solve this problem?

Input.pdf (1.3 MB)
output.jpg (6.3 KB)


Please note that multiple images cannot be extracted as one unified image, because each image is stored separately in the resources collection of the PDF document. As a workaround, you can capture a particular region of the PDF page by specifying the rectangle of the largest (parent) image and saving that region as an image. Please check the following code snippet, where this workaround is applied to your particular document. An output image is also attached for your reference.

// Requires: using System.IO; using Aspose.Pdf; using Aspose.Pdf.Devices;
// Open document
Document document = new Document(dataDir + "InputMultipleImages.pdf");
// Collect the image placements found on the first page
ImagePlacementAbsorber imagePlacementAbsorber = new ImagePlacementAbsorber();
document.Pages[1].Accept(imagePlacementAbsorber);
// Rectangle of the largest/parent image (index 1 applies to this particular document)
Aspose.Pdf.Rectangle imageRectangle = imagePlacementAbsorber.ImagePlacements[1].Rectangle;
// Get rectangle of particular page region, with a 10-unit margin on the top and right
Aspose.Pdf.Rectangle pageRect = new Aspose.Pdf.Rectangle(imageRectangle.LLX, imageRectangle.LLY, imageRectangle.URX + 10, imageRectangle.URY + 10);
// Set CropBox value as per rectangle of desired page region
document.Pages[1].CropBox = pageRect;
// Save cropped document into stream
MemoryStream ms = new MemoryStream();
document.Save(ms);
// Open cropped PDF document and convert to image
document = new Document(ms);
// Create Resolution object (300 DPI)
Resolution resolution = new Resolution(300);
// Create PNG device with specified attributes
PngDevice pngDevice = new PngDevice(resolution);
string outputPath = dataDir + "ConvertPageRegionToDOM_out.png";
// Convert a particular page and save the image to a file
pngDevice.Process(document.Pages[1], outputPath);

ConvertPageRegionToDOM_out.jpg (101.2 KB)
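
The snippet above hard-codes the index of the parent image, which only suits this particular document. If no single image covers the whole composite, one possible alternative is to crop to the union of all image rectangles on the page. The sketch below shows only that rectangle arithmetic in plain C#. The `Box` type and the `BoundingBox.Union` helper are hypothetical names introduced for illustration and are not part of Aspose.PDF.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper type (not an Aspose.PDF class): an axis-aligned box
// given by its lower-left and upper-right corners.
public readonly struct Box
{
    public double Llx { get; }
    public double Lly { get; }
    public double Urx { get; }
    public double Ury { get; }

    public Box(double llx, double lly, double urx, double ury)
    {
        Llx = llx;
        Lly = lly;
        Urx = urx;
        Ury = ury;
    }
}

public static class BoundingBox
{
    // Returns the smallest box that contains every box in the sequence.
    public static Box Union(IEnumerable<Box> boxes)
    {
        if (boxes == null) throw new ArgumentNullException(nameof(boxes));

        bool any = false;
        double llx = double.PositiveInfinity, lly = double.PositiveInfinity;
        double urx = double.NegativeInfinity, ury = double.NegativeInfinity;

        foreach (Box b in boxes)
        {
            any = true;
            llx = Math.Min(llx, b.Llx);
            lly = Math.Min(lly, b.Lly);
            urx = Math.Max(urx, b.Urx);
            ury = Math.Max(ury, b.Ury);
        }

        if (!any) throw new ArgumentException("At least one box is required.", nameof(boxes));
        return new Box(llx, lly, urx, ury);
    }
}

public static class Demo
{
    public static void Main()
    {
        // Made-up coordinates standing in for three image rectangles on one page.
        var parts = new List<Box>
        {
            new Box(10, 20, 110, 120),
            new Box(100, 50, 300, 200),
            new Box(50, 5, 80, 40),
        };

        Box u = BoundingBox.Union(parts);
        Console.WriteLine($"{u.Llx} {u.Lly} {u.Urx} {u.Ury}"); // prints "10 5 300 200"
    }
}
```

To apply this to the workaround, each `ImagePlacement.Rectangle` collected by the absorber could be mapped to a `Box` through its `LLX`, `LLY`, `URX` and `URY` values, and the resulting union used to build the `Aspose.Pdf.Rectangle` assigned to `CropBox`. This assumes every image on the page belongs to the same composite. If the page holds unrelated images, the placements would need to be filtered first.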

If you need further assistance, please feel free to let us know.