Bug in Aspose.Slides for Java - Parallel Image Export Runs into Errors

Dear Aspose Support Team,

We have encountered a critical bug in the Aspose.Slides for Java library, specifically related to exporting slide images in parallel. Below are the details of the issue, along with the necessary information for your review:

Reproduction Steps:

  1. Example presentation with a couple of slides from an official PowerPoint template (original.pptx): Presentation1.zip (277.3 KB)

  2. Code triggering the bug:

@Test
void parallelImageGeneration() throws FileNotFoundException {

	Presentation presentation = new Presentation(
			new FileInputStream(
					new File(
							"/path/to/Presentation1.pptx")));

	var scales = new Integer[] { 1, 2, 4 };
	this.failcount = 0;

	for (var scale : scales) {
		IntStream.range(0, presentation.getSlides().size()).parallel().forEach(i -> {
			var slide = presentation.getSlides().get_Item(i);
			try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
				var img = slide.getImage(scale, scale);
				img.save(baos, ImageFormat.Jpeg);
				var bytes = baos.toByteArray();
				// do something with bytes....
				logger.info("successfully generated image for slide {} with scale {}", slide.getSlideNumber(), scale);
			} catch (Exception e) {
				failcount++;
				logger.info("failed generating image for slide {} with scale {}", slide.getSlideNumber(), scale);
				e.printStackTrace();
			}
		});
	}

	logger.info("fail count: {}", this.failcount);

}

Issue Description:

We need to generate images of each slide in a presentation in at least three different scales. This process is time-consuming, and generating the images in parallel would significantly enhance our efficiency and user experience.

However, when we use parallel streams, we encounter internal errors in Aspose.Slides (see the attached error log: error.log.zip (1.9 KB)).

Questions:

  1. Is there an alternative approach to speed up image generation?
  2. Is parallel image generation supported, or is it a feature that might be added in the future? Maybe expose a method in IPresentation to efficiently generate all images for a given scale and format.

Environment:

  • Aspose.Slides version: 24.6
  • Java version: 17
  • Operating Systems: macOS 14.5 & Windows 11 & Ubuntu 22.04

Your prompt attention to this matter is highly appreciated. Please let us know if any additional information is required to investigate and resolve this issue.

Thank you for your assistance.

Best regards,

Justin Voitel
Domino informatics GmbH

An approximate time frame when the error will be fixed would be helpful!

@justinvo,
Thank you for contacting support. We are sorry that you have encountered this problem.

Unfortunately, although working with presentations in parallel is possible (apart from parsing, loading, and cloning) and works most of the time, there is a small chance of getting incorrect results when the library is used from multiple threads.

We strongly recommend that you do not use a single Presentation instance in a multi-threading environment because it might result in unpredictable errors or failures that are not easily detected.

Multithreading in Aspose.Slides | Aspose.Slides Documentation

I think you can try cloning the presentation slides into separate Presentation objects and rendering images from those objects independently, but I have not evaluated this approach.

Thank you for your answer!

I tried this approach, but I am getting a different error when adding the clone to a fresh Presentation object:

@Test
void parallelImageGeneration() throws FileNotFoundException {
	Presentation presentation = new Presentation(
			new FileInputStream(
					new File(
							"/path/to/presentation.pptx")));

	var scales = new Integer[] { 1, 2, 4 };
	this.failcount = 0;
	var start = System.nanoTime();

	for (var scale : scales) {
		IntStream.range(0, presentation.getSlides().size()).parallel().forEach(i -> {
			var srcSlide = presentation.getSlides().get_Item(i);

			var tempPres = new Presentation();
			var slide =  tempPres.getSlides().addClone(srcSlide);

			try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
				var img = slide.getImage(scale, scale);
				img.save(baos, ImageFormat.Jpeg);
				var bytes = baos.toByteArray();
				// do something with bytes....
				logger.info("successfully generated image for slide {} with scale {}", srcSlide.getSlideNumber(),
						scale);
			} catch (Exception e) {
				failcount++;
				logger.info("failed generating image for slide {} with scale {}", srcSlide.getSlideNumber(), scale);
				e.printStackTrace();
			} finally {
				tempPres.dispose();
			}
		});
	}
	logger.info("Duration: {}s", TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - start));

	logger.info("fail count: {}", this.failcount);
}

The error I am getting:

java.lang.NullPointerException
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:564)
at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:591)
at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:689)
at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfInt.evaluateParallel(ForEachOps.java:188)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
at java.base/java.util.stream.IntPipeline.forEach(IntPipeline.java:463)
at java.base/java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:620)
at de.dominoinformatics.pptslides.AsposeTest.parallelImageGeneration(AsposeTest.java:1320)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
Caused by: java.lang.NullPointerException: Cannot invoke "com.aspose.slides.uf1.b0()" because "[local4]" is null
at com.aspose.slides.arz.b0(Unknown Source)
at com.aspose.slides.sa.b0(Unknown Source)
at com.aspose.slides.kl.b0(Unknown Source)
at com.aspose.slides.wxk.b0(Unknown Source)
at com.aspose.slides.lg8.b0(Unknown Source)
at com.aspose.slides.mlr.b0(Unknown Source)
at com.aspose.slides.mlr.b0(Unknown Source)
at com.aspose.slides.hxg.b0(Unknown Source)
at com.aspose.slides.MasterSlideCollection.b0(Unknown Source)
at com.aspose.slides.MasterSlideCollection.b0(Unknown Source)
at com.aspose.slides.SlideCollection.addClone(Unknown Source)
at de.dominoinformatics.pptslides.AsposeTest.lambda$2(AsposeTest.java:1324)
at java.base/java.util.stream.ForEachOps$ForEachOp$OfInt.accept(ForEachOps.java:204)
at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:104)
at java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:711)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)
at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:373)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1182)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1655)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1622)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:165)

Do you have any idea what I can do?

@justinvo,
I am working on the issue and will get back to you soon.

@justinvo,
Thank you for your patience. Please note that the Presentation object from the presentation variable is still used in different threads in your code. Please try using the following approach:

var scale = 2;

Presentation presentation = new Presentation("sample.pptx");
var slideCount = presentation.getSlides().size();
var slideSize = presentation.getSlideSize().getSize();
var slideWidth = (float)slideSize.getWidth();
var slideHeight = (float)slideSize.getHeight();

var slidePresentations = new ArrayList<Presentation>(slideCount);

// Split presentation slides into separate presentations
for (var slide : presentation.getSlides()) {
    var slidePresentation = new Presentation();
    slidePresentation.getSlideSize().setSize(slideWidth, slideHeight, SlideSizeScaleType.DoNotScale);
    slidePresentation.getSlides().removeAt(0);
    slidePresentation.getSlides().addClone(slide);
    slidePresentations.add(slidePresentation);
}

IntStream.range(0, slideCount).parallel().forEach(i -> {
    var slide = slidePresentations.get(i).getSlides().get_Item(0);

    try (var baos = new ByteArrayOutputStream()) {
        var image = slide.getImage(scale, scale);
        image.save(baos, ImageFormat.Jpeg);
        var bytes = baos.toByteArray();
        // ...
        System.out.println("Successfully generated image.");
    } catch (Exception e) {
        System.out.println("Failed generating image.");
    } finally {
        // ...
    }
});

for (var slidePresentation : slidePresentations) {
    slidePresentation.dispose();
}
presentation.dispose();

The code example above uses a single scale value, but you can extend it to handle several image scales.
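One possible way to handle several scales is sketched below, assuming the same split-then-render structure as the example above: the stream is parallel over slides only, and the scales are looped inside each task, so every single-slide copy is touched by exactly one thread. To keep the sketch runnable without the library, `SlideCopy`, `Result`, and `renderAll` are hypothetical stand-ins and not part of the Aspose.Slides API; in real code, `render` would be replaced by the `slide.getImage(scale, scale)` and `image.save(baos, ImageFormat.Jpeg)` calls shown above. The failure counter is an `AtomicInteger`, because a plain `failcount++` is not atomic when it is incremented from a parallel stream.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class ParallelScaleSketch {

    // Hypothetical stand-in for one single-slide Presentation copy (not an Aspose type).
    static final class SlideCopy {
        final int slideNumber;

        SlideCopy(int slideNumber) {
            this.slideNumber = slideNumber;
        }

        // Stand-in for slide.getImage(scale, scale) followed by image.save(...).
        byte[] render(int scale) {
            return ("slide-" + slideNumber + "@" + scale + "x").getBytes(StandardCharsets.UTF_8);
        }
    }

    // Hypothetical result holder: rendered bytes keyed by "slideNumber@scale", plus the failure count.
    static final class Result {
        final Map<String, byte[]> images;
        final int failures;

        Result(Map<String, byte[]> images, int failures) {
            this.images = images;
            this.failures = failures;
        }
    }

    static Result renderAll(int slideCount, int[] scales) {
        // Sequential step: one independent copy per slide (mirrors the split loop above).
        List<SlideCopy> copies = new ArrayList<>(slideCount);
        for (int i = 0; i < slideCount; i++) {
            copies.add(new SlideCopy(i + 1));
        }

        Map<String, byte[]> images = new ConcurrentHashMap<>(); // safe for concurrent puts
        AtomicInteger failures = new AtomicInteger();           // safe for concurrent increments

        // Parallel over slides only; each copy is used by a single thread for all scales.
        IntStream.range(0, slideCount).parallel().forEach(i -> {
            SlideCopy copy = copies.get(i);
            for (int scale : scales) {
                try {
                    images.put(copy.slideNumber + "@" + scale, copy.render(scale));
                } catch (Exception e) {
                    failures.incrementAndGet();
                }
            }
        });

        return new Result(images, failures.get());
    }

    public static void main(String[] args) {
        Result r = renderAll(5, new int[] { 1, 2, 4 });
        System.out.println("images: " + r.images.size() + ", failures: " + r.failures);
    }
}
```

Looping the scales inside each task renders a slide's scales one after another, which trades some parallelism for the guarantee that no copy is shared between threads. If that becomes the bottleneck, creating one copy per slide and scale would be an alternative, at the cost of more cloning and memory.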

@andrey.potapov
Thank you for your help!

I tried it out and it is much faster without any errors.
Maybe this topic would be interesting for the documentation :)

@justinvo,
Thank you for your feedback. I think this is a good idea.