Decreased image DPI when saving Word document to PDF file

Hi,

I noticed that during the “Word to PDF” conversion the DPI of the image is reduced by 1 point. This happens for PNG images when the compression is set to JPEG (PdfSaveOptions.setImageCompression(PdfImageCompression.JPEG)). I prepared a code snippet that creates a PDF file containing the image and then extracts the image into a separate file:

package com.example;

import com.aspose.pdf.ImageType;
import com.aspose.pdf.facades.PdfExtractor;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.PdfImageCompression;
import com.aspose.words.PdfSaveOptions;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfExample {

  public static void addImageToPdf(Path pdf, Path image, int width, int height) {
    try {
      DocumentBuilder builder = new DocumentBuilder();    
      builder.insertImage(loadImage(image), width, height);    
      builder.getDocument().save(pdf.toString(), pdfSaveOptions());
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  private static byte[] loadImage(Path image) {
    try {
      return Files.readAllBytes(image);
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  private static PdfSaveOptions pdfSaveOptions() {
    PdfSaveOptions options = new PdfSaveOptions();
    options.setImageCompression(PdfImageCompression.JPEG);
    options.setJpegQuality(100);
    options.getDownsampleOptions().setDownsampleImages(false);
    return options;
  }

  public static void extractImagesFromPdf(Path pdf, ImageType format) {
    try (PdfExtractor pdfExtractor = new PdfExtractor()) {
      pdfExtractor.bindPdf(pdf.toString());    
      pdfExtractor.extractImage();
      int imageIndex = 1;
      while (pdfExtractor.hasNextImage()) {
        pdfExtractor.getNextImage(resolveImageFilename(pdf.getParent(), format, imageIndex++));
      }
    }
  }

  private static String resolveImageFilename(Path folder, ImageType format, int imageIndex) {
    return folder.resolve("extracted-img-%s.%s".formatted(imageIndex, format.toString().toLowerCase())).toString();
  }
}

Input: the attached PNG image, example.png (600x360, 300 DPI):

Image file: example.png (13.9 KB)

Running the following test:

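class PdfExampleTest {

  // image(...) and pdfFile(...) are test helpers that are not shown in this post.
  public static final Path IMAGE = image("example.png");

  @Test
  void sameSize() {
    Path pdf = pdfFile("same-size.pdf");
    // Insert the image at 600x360, then extract it back out of the PDF as JPEG.
    PdfExample.addImageToPdf(pdf, IMAGE, 600, 360);
    PdfExample.extractImagesFromPdf(pdf, ImageType.getJpeg());
  }
}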

gives us a PDF file with an image of decreased DPI:

Extracted image details: dpi-decreased.jpg (52.2 KB)
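
One way to check the resolution stored in an extracted file independently of Aspose is to read it through the JDK's ImageIO metadata. The sketch below is not part of the original snippet: the class name DpiProbe and its methods are hypothetical, and it assumes the file carries a density field (JFIF density for JPEG, pHYs for PNG) that ImageIO exposes as HorizontalPixelSize in millimetres per pixel.

```java
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.metadata.IIOMetadata;
import javax.imageio.stream.ImageInputStream;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
final class DpiProbe {

  // Converts a pixel size in millimetres into dots per inch (1 inch = 25.4 mm).
  static double mmPerPixelToDpi(double mmPerPixel) {
    return 25.4 / mmPerPixel;
  }

  // Returns the horizontal DPI stored in the file, or NaN if no density is recorded.
  static double horizontalDpi(File file) throws IOException {
    try (ImageInputStream in = ImageIO.createImageInputStream(file)) {
      if (in == null) {
        throw new IOException("Cannot open " + file);
      }
      Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
      if (!readers.hasNext()) {
        throw new IOException("No ImageIO reader for " + file);
      }
      ImageReader reader = readers.next();
      try {
        reader.setInput(in);
        IIOMetadata metadata = reader.getImageMetadata(0);
        // The format-neutral metadata tree reports pixel size in millimetres.
        Element root = (Element) metadata.getAsTree("javax_imageio_1.0");
        NodeList nodes = root.getElementsByTagName("HorizontalPixelSize");
        if (nodes.getLength() == 0) {
          return Double.NaN;
        }
        String value = ((Element) nodes.item(0)).getAttribute("value");
        return mmPerPixelToDpi(Double.parseDouble(value));
      } finally {
        reader.dispose();
      }
    }
  }
}
```

Calling DpiProbe.horizontalDpi on both the source PNG and the extracted JPEG gives two numbers that can be compared directly, without relying on an image viewer's rounding.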

However, manual conversion from PNG to JPEG image seems to solve the issue:

package com.example;

import com.aspose.imaging.Image;
import com.aspose.imaging.imageoptions.JpegOptions;
import com.aspose.pdf.ImageType;
import com.aspose.pdf.facades.PdfExtractor;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.PdfImageCompression;
import com.aspose.words.PdfSaveOptions;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class PdfExample {

  public static void addImageToPdf(Path pdf, Path image, int width, int height) {
    try {
      DocumentBuilder builder = new DocumentBuilder();
      builder.insertImage(loadImage(image), width, height);
      builder.getDocument().save(pdf.toString(), pdfSaveOptions());
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

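  private static byte[] loadImage(Path image) {
    try {
      if (!isPng(image)) {
        return Files.readAllBytes(image);
      }
      // Close the stream once the PNG has been converted to JPEG bytes.
      try (InputStream png = Files.newInputStream(image)) {
        return toJpeg(png);
      }
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }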

  private static byte[] toJpeg(InputStream pngImage) {
    try (ByteArrayOutputStream output = new ByteArrayOutputStream()) {
      Image.load(pngImage)
          .save(output, jpgSaveOptions());
      return output.toByteArray();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  private static boolean isPng(Path image) {
    return image.toString().matches(".+\\.%s".formatted(ImageType.getPng().toString().toLowerCase()));
  }

  private static JpegOptions jpgSaveOptions() {
    JpegOptions options = new JpegOptions();
    options.setQuality(100);
    return options;
  }

  private static PdfSaveOptions pdfSaveOptions() {
    PdfSaveOptions options = new PdfSaveOptions();
    options.setImageCompression(PdfImageCompression.JPEG);
    options.setJpegQuality(100);
    options.getDownsampleOptions().setDownsampleImages(false);
    return options;
  }

  public static void extractImagesFromPdf(Path pdf, ImageType format) {
    try (PdfExtractor pdfExtractor = new PdfExtractor()) {
      pdfExtractor.bindPdf(pdf.toString());
      pdfExtractor.extractImage();
      int imageIndex = 1;
      while (pdfExtractor.hasNextImage()) {
        pdfExtractor.getNextImage(resolveImageFilename(pdf.getParent(), format, imageIndex++));    
      }
    }
  }

  private static String resolveImageFilename(Path folder, ImageType format, int imageIndex) {
    return folder.resolve("extracted-img-%s.%s".formatted(imageIndex, format.toString().toLowerCase())).toString();
  }
}

Extracted image details: dpi-same.png (62.4 KB)

Is there a way to achieve the same results without manual conversion?

@ANDREA.FARRIS

This issue has been logged as WORDSJAVA-2893. We will keep you posted and will let you know as soon as the issue is resolved. Please accept our apologies for the inconvenience.

@ANDREA.FARRIS,

We, along with the Aspose.Words for Java and Aspose.Words for .NET teams, have jointly reviewed this issue and have determined that the issue is not specific to the Aspose.Words products.

I’m moving your topic to the Aspose.PDF forum so that my colleagues from the Aspose.PDF team can help you.

Please accept our apologies for the inconvenience.

@ANDREA.FARRIS
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-43336

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@ANDREA.FARRIS
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-26258

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

The issues you found earlier (filed as WORDSNET-26258) have been fixed in the Aspose.Words for .NET 23.12 update, which is also available on NuGet.

Hello,

After testing Aspose.Words for Java 23.12, the issue is not fixed: images still have the same DPI problem.

@ANDREA.FARRIS

We will be checking it from Aspose.Words for Java perspective and sharing the feedback with you shortly.

@ANDREA.FARRIS The issue has been closed as Not a Bug in Aspose.Words. We thoroughly analyzed the problem and concluded that Aspose.Words rendering works as expected. We tracked the image from upload to conversion, and it does not change when it is written to the PDF stream.
If you use a third-party tool to view the raw bytes of the image inside the resulting PDF (for example, I used iText RUPS 7.2.5), you can see that the image resolution has not changed (i.e. it is still 300 DPI).

Everything points to the source of the problem being within PdfExtractor (Aspose.PDF) and not within Aspose.Words.
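
For context on how a difference of exactly one DPI can arise at all: PNG stores resolution in its pHYs chunk as an integer number of pixels per metre, while JPEG (JFIF) stores an integer density per inch or per centimetre. The sketch below is plain arithmetic showing that a 300 DPI value does not survive that round trip if the result is truncated instead of rounded. It is an assumption about a possible cause, not something confirmed in this thread, and DpiRoundTrip is a hypothetical name.

```java
// Hypothetical illustration of a PNG -> JPEG density conversion; not Aspose code.
final class DpiRoundTrip {

  static final double METERS_PER_INCH = 0.0254;

  // DPI to the integer pixels-per-metre value a PNG pHYs chunk can hold.
  static long toPixelsPerMeter(double dpi) {
    return Math.round(dpi / METERS_PER_INCH);
  }

  // Integer pixels-per-metre back to a fractional DPI value.
  static double toDpi(long pixelsPerMeter) {
    return pixelsPerMeter * METERS_PER_INCH;
  }

  public static void main(String[] args) {
    long ppm = toPixelsPerMeter(300);   // 11811 (the exact value would be 11811.02...)
    double dpi = toDpi(ppm);            // about 299.9994
    System.out.println("pixels per metre: " + ppm);
    System.out.println("truncated DPI:    " + (int) dpi);        // 299
    System.out.println("rounded DPI:      " + Math.round(dpi));  // 300
  }
}
```

If the extracted JPEG reports 299 DPI where the source PNG reports 300, a truncating conversion of this kind would be consistent with the symptom.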
