I’ve extracted the image from a document (.docx). The image extracted from the PDF produced by Aspose had a very low resolution (around 200 dpi). When I manually converted the DOCX to PDF using Acrobat Pro, the image resolution was high (around 700 dpi). Please help me retain the resolution when converting with Aspose.
@Mahesh39 When saving to PDF, Aspose.Words downsamples images by default to reduce the output document size. You can disable downsampling using PdfSaveOptions.DownsampleOptions. See the following code example:
Document doc = new Document("C:\\Temp\\in.docx");
PdfSaveOptions options = new PdfSaveOptions();
options.getDownsampleOptions().setDownsampleImages(false);
doc.save("C:\\Temp\\out.pdf", options);
@Mahesh39 Do I understand correctly that you want to retain the resolution value stored in the image file when saving the document to PDF? Unfortunately, when PdfImageCompression.AUTO is used, most images are stored in the PDF file with the /Flate PDF compression algorithm. In this case only the raw image data is stored and compressed, and all additional information such as resolution is lost. This is a peculiarity of the PDF format. You could try PdfImageCompression.JPEG instead. In this case images are stored in the PDF document in JPEG format together with the additional information, so the resolution value should be retained.
P.S. In all input documents provided by you the TIFF images have resolution of 220dpi.
@Mahesh39 To retain the resolution using PdfImageCompression.JPEG you could use the following code:
Document doc = new Document("C:\\Temp\\in.docx");
PdfSaveOptions options = new PdfSaveOptions();
options.getDownsampleOptions().setDownsampleImages(false);
options.setImageCompression(PdfImageCompression.JPEG);
doc.save("C:\\Temp\\out.pdf", options);
To get the image format manually you could unzip the docx file and check the word\media folder.
To get the image format programmatically you could use the following code:
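As a minimal sketch that does not rely on the Aspose.Words API, the same word/media check can be done with the standard java.util.zip classes: read each entry under word/media/ and guess the format from its leading bytes. The class name DocxMediaFormats and the helpers sniffFormat and listMediaFormats are hypothetical names introduced for illustration, and the signature list covers only a few common formats. It assumes Java 11 or later for readNBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Aspose.Words.
public class DocxMediaFormats {

    /** Guesses an image format from the first bytes of a file (common signatures only). */
    static String sniffFormat(byte[] head) {
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) return "JPEG";
        if (startsWith(head, 0x89, 'P', 'N', 'G')) return "PNG";
        if (startsWith(head, 'I', 'I', 0x2A, 0x00)
                || startsWith(head, 'M', 'M', 0x00, 0x2A)) return "TIFF";
        if (startsWith(head, 'G', 'I', 'F', '8')) return "GIF";
        if (startsWith(head, 'B', 'M')) return "BMP";
        return "UNKNOWN";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** Maps every entry under word/media/ in a .docx stream to its guessed format. */
    static Map<String, String> listMediaFormats(InputStream docx) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().startsWith("word/media/")) {
                    continue;
                }
                // Only the first few bytes are needed to recognise the signature.
                byte[] head = zip.readNBytes(8);
                result.put(entry.getName(), sniffFormat(head));
            }
        }
        return result;
    }
}
```

For example, passing new FileInputStream("C:\\Temp\\in.docx") to listMediaFormats and printing the returned map shows which embedded images are TIFF and would therefore be re-encoded on export.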
@Mahesh39 If the image resolution is important in your case, it is better to perform the TIFF to JPEG conversion in the Aspose.Words DOM yourself to be sure. The JPEG image data will then be stored in the output PDF as is. You can get the image bytes with the Shape.getImageData().getImageBytes() method and set the converted image bytes with the Shape.getImageData().setImageBytes() method.
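The conversion step itself could look roughly like the sketch below, which uses only javax.imageio from the JDK. The class ImageToJpeg and the method toJpeg are hypothetical names, not Aspose.Words API. It assumes the JDK can decode the input (TIFF reading is built into ImageIO from Java 9 onwards), it uses the default JPEG quality, and it does not copy the DPI metadata from the source image, so that would need to be handled separately if the value matters.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Aspose.Words.
public class ImageToJpeg {

    /** Re-encodes image bytes (e.g. TIFF or PNG) as JPEG, keeping the pixel dimensions. */
    static byte[] toJpeg(byte[] imageBytes) throws IOException {
        BufferedImage source = ImageIO.read(new ByteArrayInputStream(imageBytes));
        if (source == null) {
            throw new IOException("Unsupported or corrupt image data");
        }

        // JPEG has no alpha channel, so draw onto an opaque RGB canvas first.
        BufferedImage rgb = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.drawImage(source, 0, 0, Color.WHITE, null);
        } finally {
            g.dispose();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(rgb, "jpg", out)) {
            throw new IOException("No JPEG writer available");
        }
        return out.toByteArray();
    }
}
```

With the methods mentioned above, each image shape would then be updated along the lines of shape.getImageData().setImageBytes(ImageToJpeg.toJpeg(shape.getImageData().getImageBytes())) before saving the document to PDF.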