Different image quality when running from docker

I’m converting one-page PDF files, each containing a scanned document, to TIFF.
I’m running a test function from IntelliJ in a Windows 7 environment, and the output TIFF has decent quality: the hand-written words are readable.

For the same file and the same settings, when the output TIFF is produced from a Docker container deployed in a Linux environment, the quality is very low. The words are more “pixelated” and there are white pixels where there should be black.

I’m using FROM openjdk:11 in the Dockerfile.

Please, reply as soon as possible

@stsakas

Are you using the latest version of the API, i.e. 22.5? Can you please share the sample Dockerfile for our reference along with the sample input and output files? Also, please share the sample code snippet that you are using. We will test the scenario in our environment and address it accordingly.

Hi,
I’m using a Spring Boot/Maven project. This is part of my pom.xml file:

		<dependency>
			<groupId>com.aspose</groupId>
			<artifactId>aspose-total</artifactId>
			<version>21.10</version>
			<type>pom</type>
		</dependency>
		<dependency>
			<groupId>javax.media</groupId>
			<artifactId>jai-core</artifactId>
			<version>1.1.3</version>
		</dependency>
		<dependency>
			<groupId>javax.media</groupId>
			<artifactId>jai_imageio</artifactId>
			<version>1.1.1</version>
		</dependency>
		<dependency>
			<groupId>com.sun.media</groupId>
			<artifactId>jai-codec</artifactId>
			<version>1.1.3</version>
		</dependency>
	</dependencies>

	<repositories>
		<repository>
			<id>AsposeJavaAPI</id>
			<name>Aspose Java API</name>
			<url>https://repository.aspose.com/repo/</url>
		</repository>
		<repository>
			<id>Jboss</id>
			<name>Jboss repository</name>
			<url>https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/</url>
		</repository>
		<repository>
			<id>Geotoolkit</id>
			<name>Geotoolkit repository</name>
			<url>http://maven.geotoolkit.org/</url>
		</repository>
	</repositories>

As you can see, this is not version 22.5. I changed the version, and for some reason I could not download some Microsoft dependencies, so I changed the pom.xml file to this:

	<dependency>
		<groupId>com.aspose</groupId>
		<artifactId>aspose-total</artifactId>
		<version>22.5</version>
		<type>pom</type>
	</dependency>
	<dependency>
		<groupId>com.microsoft.onnxruntime</groupId>
		<artifactId>onnxruntime</artifactId>
		<version>1.11.0</version>
	</dependency>
	<dependency>
		<groupId>com.microsoft.onnxruntime</groupId>
		<artifactId>onnxruntime_gpu</artifactId>
		<version>1.7.0</version>
	</dependency>
	<dependency>
		<groupId>javax.media</groupId>
		<artifactId>jai-core</artifactId>
		<version>1.1.3</version>
	</dependency>
	<dependency>
		<groupId>javax.media</groupId>
		<artifactId>jai_imageio</artifactId>
		<version>1.1.1</version>
	</dependency>
	<dependency>
		<groupId>com.sun.media</groupId>
		<artifactId>jai-codec</artifactId>
		<version>1.1.3</version>
	</dependency>
</dependencies>

<repositories>
	<repository>
		<id>central</id>
		<name>Maven Repository Switchboard</name>
		<url>https://repo1.maven.org/maven2</url>
	</repository>
	<repository>
		<id>AsposeJavaAPI</id>
		<name>Aspose Java API</name>
		<url>https://repository.aspose.com/repo/</url>
	</repository>
	<repository>
		<id>Jboss</id>
		<name>Jboss repository</name>
		<url>https://repository.jboss.org/nexus/content/repositories/thirdparty-releases/</url>
	</repository>
	<repository>
		<id>Geotoolkit</id>
		<name>Geotoolkit repository</name>
		<url>http://maven.geotoolkit.org/</url>
	</repository>
</repositories>

Now the problem is even worse. The test that I run locally still produces the TIFF as I want it, but when I create the jar file (mvn package) and deploy it in our VM using Docker, the output TIFF file is a corrupted zero-byte file.
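To tell a zero-byte or truncated output apart from a merely low-quality one, a quick header check can help. The following is a minimal sketch using only the JDK; the class and method names (`TiffSanityCheck`, `looksLikeTiff`) are hypothetical and not part of any Aspose API. It relies on the fact that a classic TIFF file starts with either `II` followed by 42 (little-endian) or `MM` followed by 42 (big-endian).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class TiffSanityCheck {

    /**
     * Returns true if the file exists and starts with a classic TIFF header:
     * "II" + 42 (little-endian) or "MM" + 42 (big-endian).
     * BigTIFF (version 43) is deliberately not covered by this sketch.
     */
    static boolean looksLikeTiff(Path file) throws IOException {
        if (!Files.isRegularFile(file) || Files.size(file) < 4) {
            return false; // missing, zero-byte, or too short to hold a header
        }
        byte[] header = new byte[4];
        try (InputStream in = Files.newInputStream(file)) {
            if (in.readNBytes(header, 0, 4) != 4) {
                return false;
            }
        }
        boolean littleEndian = header[0] == 'I' && header[1] == 'I'
                && header[2] == 42 && header[3] == 0;
        boolean bigEndian = header[0] == 'M' && header[1] == 'M'
                && header[2] == 0 && header[3] == 42;
        return littleEndian || bigEndian;
    }
}
```

Calling `TiffSanityCheck.looksLikeTiff(Path.of(outputFilePath))` right after the conversion, in both the local test and the container, would at least show whether the container wrote anything usable before the quality comparison starts.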

This is the Dockerfile:

FROM openjdk:11

RUN apt-get upgrade -y --fix-missing && \
    apt-get update -y

EXPOSE 8080

ARG JAR_FILE=target/asposeconverter-0.0.1-SNAPSHOT.jar

ADD ${JAR_FILE} asposeconverter-0.0.1-SNAPSHOT.jar

ENTRYPOINT ["java", "-Xms6G", "-Xmx6G", "-jar", "/asposeconverter-0.0.1-SNAPSHOT.jar"]

Using Spring, Maven, and Docker as development tools is common practice nowadays. Why doesn’t Aspose provide a tested demo project that combines these three and works as expected?

Here is the test that I’m running locally:

    @Test
    void convertPdfToTiffCODITest() throws Exception {
        String inputFolderPath = Path.of(packagePath, "samplefiles", "pdf").toString();
        String outputFolderPath = Path.of(packagePath, "outputfiles", "pdf").toString();

        // create output folder if it doesn't exist
        Files.createDirectories(Paths.get(outputFolderPath));

        // get pdf name/absolute-paths
        Map<String, String> filenamePathMap = Stream.of(new File(inputFolderPath).listFiles())
                .filter(file -> !file.isDirectory() && file.getName().endsWith("codi.pdf"))
                .collect(Collectors.toMap(File::getName, File::getAbsolutePath));

        TiffSettings tiffSettings = new TiffSettings();
        tiffSettings.setCompression(CompressionType.LZW);
        tiffSettings.setDepth(ColorDepth.Format4bpp);
        Resolution resolution = new Resolution(200);
        TiffDevice tiffDevice = new TiffDevice(resolution, tiffSettings);

        for (String filename: filenamePathMap.keySet()) {
            log.info("test: Converting " + filename + " to tiff");
            long startTime = System.currentTimeMillis();

            com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(filenamePathMap.get(filename));

            String convertedFilename = filename.substring(0, filename.lastIndexOf('.')) + ".tiff";
            String outputFilePath = outputFolderPath + File.separator + convertedFilename;

            java.io.OutputStream imageStream = new java.io.FileOutputStream(outputFilePath);

            tiffDevice.process(pdfDocument, imageStream);

            imageStream.close();

            log.info("test: Conversion finished! Time: " + ((System.currentTimeMillis() - startTime)/1000) + " seconds");
            log.info("test: " + outputFilePath);
        }
    }

Here is the function that is running in the docker app:

    public void runConverter() throws Exception {

        String outputAbsolutePath = getOutputAbsolutePath();
        String inputAbsolutePath = getInputAbsolutePath();
        Integer[] pagesToKeep = getPagesToKeep();

        String filename = FilenameUtils.getName(inputAbsolutePath);

        // tiff settings
        TiffSettings tiffSettings = new TiffSettings();
        Resolution resolution;
        if (isHighQuality()) {
            tiffSettings.setCompression(CompressionType.LZW);
            resolution = new Resolution(200);
        } else {
            tiffSettings.setCompression(CompressionType.CCITT4);
            resolution = new Resolution(300);
        }
        tiffSettings.setDepth(ColorDepth.Format4bpp);
        TiffDevice tiffDevice = new TiffDevice(resolution, tiffSettings);

        log.info("Converting " + filename + " to tiff");
        long startTime = System.currentTimeMillis();

        Document pdfDocument = new Document(inputAbsolutePath);
        Document tempPdfDocument = null;

        // check if a page number exceeds available pages
        if (pagesToKeep != null && pagesToKeep.length > 0 && Arrays.stream(pagesToKeep).max(Integer::compare).get() > pdfDocument.getPages().size())
            throw new InvalidRequestException("Invalid page number was given");

        // Since a Java heap memory error is possible, we use a temporary filename and if the conversion
        // is finished without the error, we then rename the file to the original requested name
        String tempFilename = "corrupted-" + UUID.randomUUID() + ".tiff";
        Path tempOutputAbsolutePath = Path.of(FilenameUtils.getPath(outputAbsolutePath), tempFilename);

        try (OutputStream imageStream = new FileOutputStream(tempOutputAbsolutePath.toString())) {
            if (pagesToKeep == null || pagesToKeep.length == 0) {
                tiffDevice.process(pdfDocument, imageStream);
            } else {
                tempPdfDocument = new Document();

                for (int i: pagesToKeep)
                    tempPdfDocument.getPages().add(pdfDocument.getPages().get_Item(i));

                tiffDevice.process(tempPdfDocument, imageStream);
            }
        } finally {
            pdfDocument.close();
            if (tempPdfDocument != null)
                tempPdfDocument.close();
        }

        Files.move(tempOutputAbsolutePath, tempOutputAbsolutePath.resolveSibling(FilenameUtils.getName(outputAbsolutePath)), StandardCopyOption.REPLACE_EXISTING);

        log.info("Conversion of " + filename + " to tiff finished in " +
                ((System.currentTimeMillis() - startTime) / 1000) + " seconds");
    }
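
The temporary-file-then-rename idea used in the function above can be isolated and exercised without Aspose. This is a minimal sketch under stated assumptions: `SafeOutputWriter` and `StreamWriter` are hypothetical names, and the callback stands in for a call such as `tiffDevice.process(pdfDocument, imageStream)`. It additionally refuses to publish an empty result, which would surface the zero-byte symptom as an exception instead of a silently broken file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SafeOutputWriter {

    /** Hypothetical callback; stands in for the real conversion call. */
    interface StreamWriter {
        void writeTo(OutputStream out) throws Exception;
    }

    /**
     * Writes to a temp file next to the target and moves it into place
     * only if the writer completed and produced at least one byte.
     */
    static void writeThenPublish(Path target, StreamWriter writer) throws Exception {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        Path temp = Files.createTempFile(dir, "converting-", ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(temp)) {
                writer.writeTo(out);
            }
            if (Files.size(temp) == 0) {
                throw new IOException("Conversion produced an empty file: " + temp);
            }
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up after a failure.
            Files.deleteIfExists(temp);
        }
    }
}
```

One detail in the function above that may be worth double-checking: the Commons IO documentation describes `FilenameUtils.getPath` as excluding the prefix (for example the leading `/` on Linux), so the temporary path can end up relative to the working directory. `FilenameUtils.getFullPath` keeps the prefix, and the sketch sidesteps the question by deriving the directory from the target `Path` itself.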

Of course I expect the same result if isHighQuality() returns true.

@stsakas

As requested earlier, could you please also share the sample input/output files? We will log an investigation ticket and share the ID with you.

I made a demo project for you. Please check it in your environment and let me know as soon as possible whether you can reproduce the problem.

In the demo project there is a duplicate file: one inside the test-files folder, which will be used from the Docker container, and one inside the asposeconverter\src\test\java\eu\icap\asposeconverter\samplefiles\pdf folder, which will be used by the test method.

asposeconverter-wth-test.zip (6.1 MB)

  1. Open the project in IntelliJ and use Maven -> Reload Project to download the dependencies listed in the pom.xml.
  2. Run the convertPdfToTiffCODITest method. This will produce the output TIFF file inside the asposeconverter\src\test\java\eu\icap\asposeconverter\output\pdf folder.
  3. Run mvn clean package to create the jar and then docker compose up to execute the code in the application class. This will produce the output TIFF inside the test-files folder.
  4. Run docker compose down --rmi all to remove the container and the image.

Now, please compare these two TIFF files. They are different: the one generated inside Docker should be missing a lot of detail. Please also take a look at the code, in case this is not a bug and I’m doing something wrong.
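
To put a number on the difference instead of judging it by eye, the two files can be compared pixel by pixel. The sketch below assumes both TIFFs have the same dimensions and that the JDK's built-in ImageIO TIFF reader (available since Java 9) can decode them; `TiffDiff` and `differingPixelRatio` are hypothetical names, and only the first page of each file is read.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

final class TiffDiff {

    /** Fraction (0.0 to 1.0) of pixels whose RGB value differs between two same-sized images. */
    static double differingPixelRatio(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("Image dimensions differ: "
                    + a.getWidth() + "x" + a.getHeight() + " vs "
                    + b.getWidth() + "x" + b.getHeight());
        }
        long differing = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) {
                    differing++;
                }
            }
        }
        return (double) differing / ((long) a.getWidth() * a.getHeight());
    }

    public static void main(String[] args) throws IOException {
        // args[0]: TIFF from the local test, args[1]: TIFF from the container
        BufferedImage local = ImageIO.read(new File(args[0]));
        BufferedImage docker = ImageIO.read(new File(args[1]));
        if (local == null || docker == null) {
            throw new IOException("No ImageIO reader could decode one of the files");
        }
        System.out.printf("Differing pixels: %.2f%%%n",
                100.0 * differingPixelRatio(local, docker));
    }
}
```

If the two outputs turn out to have different dimensions, that alone is a useful finding, since it suggests the resolution setting was not applied the same way in both environments.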

EDIT: I couldn’t upload the zip with both PDF files inside (“file too big”). Please manually copy the PDF file from the test-files folder to the asposeconverter\src\test\java\eu\icap\asposeconverter\output\pdf folder.

@stsakas

Thanks for providing all these details. We have logged an investigation ticket as PDFJAVA-41706 in our issue management system to further analyze this case. We will look into its details and keep you posted with the status of ticket resolution. Please be patient and spare us some time.

We apologize for your inconvenience.

Hi,
could you please provide an estimate of how long it will take to investigate the issue?

@stsakas

The ticket has recently been logged in our issue management system, and we will investigate and resolve it on a first-come, first-served basis. However, we have recorded your concerns and will inform you as soon as we have some news about its resolution or ETA. Please spare us a little time.

We apologize for the inconvenience.

The issues you have found earlier (filed as PDFJAVA-41706) have been fixed in Aspose.PDF for Java 22.6.

Hi,
I updated the pom.xml to

	<dependency>
		<groupId>com.aspose</groupId>
		<artifactId>aspose-total</artifactId>
		<version>22.6</version>
		<type>pom</type>
	</dependency>

After reloading the project, I get the following error:

Unresolved dependency: 'com.aspose:aspose-diagram:jar:22.6'

@stsakas

Would you please make sure that the Aspose Repository is specified correctly in the pom.xml like below:

<repositories>
    <repository>
        <id>AsposeJavaAPI</id>
        <name>Aspose Java API</name>
        <url>https://repository.aspose.com/repo/</url>
    </repository>
</repositories>

You can also verify that the repository URL contains https:// instead of http://. In case you still face any issues, please feel free to let us know.

I already had the repository defined as you described. Still, I get the error I posted.
Please let me know about any updates on this issue.

@stsakas

Would you please create a simple console application, define the dependencies in it, and see if the issue still occurs? Please share that console application with us so that we may try to replicate the issue in our environment as well and address it accordingly. Also, please let us know which version last worked for you.

Hi again,

I believe the demo (asposeconverter-wth-test.zip) that I sent you in a reply above is simple enough, and it is what I am using.

As you can see the version in this demo is

	<dependency>
		<groupId>com.aspose</groupId>
		<artifactId>aspose-total</artifactId>
		<version>21.10</version>
		<type>pom</type>
	</dependency>

and it works fine on my machine.

EDIT: Version 22.5 has no issues either.

@stsakas

We are checking it and will get back to you shortly.

Hi,
do you have any updates on the issue?

@aspose.notifier

We have investigated the issue and found that it is caused by the line “<classifier>jdk16</classifier>” in the Aspose.Diagram dependency inside the aspose-total pom.xml. When this line is deleted, the APIs are downloaded without any issue. We are updating the aspose-total pom.xml file and will let you know once the change is done. We apologize for the inconvenience.

Hi again,
is it possible to estimate when the bug will be fixed?

@stsakas

We would like to share with you that the aspose-total pom.xml has been fixed now.