Resources needed by AsposeSlideJava

Hi,

I’m running a “split” operation through RJB (the Ruby-Java bridge) with the split_and_merge script you guys sent me. The goal is to split a PowerPoint file into individual files.

It works fine with most small files (under 50 MB). I’ve just tried with a 140 MB file of 127 slides: the script processes about 12 slides and then freezes. You can see the screenshot of the container running it here, at 97% CPU and 1.5 GB of memory (before starting the script it was at 0% CPU and 0.1 GB of memory).

Screenshot 2020-09-28 at 15.56.44.png (20.8 KB)

I’m in the very last steps of implementing Aspose, but I need to be able to process bigger files than this (up to 500 MB would be nice).
So my question is: do you have any idea why splitting a 120 MB PowerPoint file would require 1.5+ GB of RAM and push the CPU to 100% after 12 slides?

Thanks

EDIT: I let the freeze go on for a few more minutes and ended up with a heap OOM exception, meaning that Aspose required more than 2 GB of RAM to split a 130 MB file… is this normal?!
Screenshot 2020-09-28 at 16.05.50.png (361.5 KB)

@albandum

You are performing an operation that is very resource-intensive in terms of both memory and processing, because you are splitting and merging large presentation decks. Please increase the Java heap size to 4 GB or more to get better performance and avoid this kind of issue.
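With RJB, the heap limit is normally passed as a JVM argument at the moment the JVM is loaded. Below is a minimal sketch of how that could look; `jvm_heap_args` is a hypothetical helper and the `jars` directory is an assumption, neither comes from the script discussed in this thread.

```ruby
# Build JVM arguments for a given maximum heap size in gigabytes.
# jvm_heap_args is a hypothetical helper, not part of RJB or Aspose.
def jvm_heap_args(max_heap_gb, initial_heap_mb = 256)
  raise ArgumentError, 'max_heap_gb must be positive' unless max_heap_gb.positive?

  ["-Xms#{initial_heap_mb}m", "-Xmx#{max_heap_gb}g"]
end

# Usage (requires the rjb gem and a JDK; the jar location is an assumption):
#   require 'rjb'
#   classpath = Dir.glob(File.join('jars', '**', '*.jar')).join(File::PATH_SEPARATOR)
#   Rjb::load(classpath, jvm_heap_args(4))
```

The JVM arguments only take effect on the call that actually starts the JVM, so the heap size has to be decided before any Java class is imported.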

Do you have a trick to bring it down? Should I run the script for a few slides at a time and keep track of the slides that have already been processed?

What’s expensive in terms of resources when splitting slides?

@albandum

Well, there is no trick for this, as the entire presentation gets loaded into memory. Both the source and the target presentation objects are loaded as a DOM (Document Object Model). It is a resource-intensive process, and the cost varies from case to case depending on the content of the slides, the number of slides, the size of the presentation, etc.

Understood. I’ve upgraded to 25 GB of heap.

My main concern is that memory usage doesn’t go down after the job is finished.

pres = Rjb::import('com.aspose.slides.Presentation').new(ppt_location)

Does this line not need a file.close() somewhere after it’s finished processing ?

I’ve tried that but without luck :

FileInputStream = Rjb::import('java.io.FileInputStream')
inputstream = FileInputStream.new_with_sig('Ljava.lang.String;', ppt_location)
pres = Rjb::import('com.aspose.slides.Presentation').new(inputstream)
inputstream.close()

I’m just trying to understand why memory pressure doesn’t go down after Aspose has finished its work.
I’ve tried adding

    pres.dispose()
    presSplit.dispose()

at the end of the script to free any resources held by the source and destination presentations, but this is not helping. It looks like there’s a memory leak somewhere.
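One way to make sure `dispose` runs even when the split raises partway through is to wrap the work in an `ensure` block. This is a minimal sketch, not the script from this thread; `with_disposed` is a hypothetical helper, and the only thing it assumes about its arguments is that they have a `dispose` method.

```ruby
# Yield the resources to the block, then call dispose on each of them,
# even if the block raises. with_disposed is a hypothetical helper.
def with_disposed(*resources)
  yield(*resources)
ensure
  resources.compact.each do |resource|
    begin
      resource.dispose
    rescue StandardError => e
      # A failing dispose should not hide the original error or skip the rest.
      warn "dispose failed: #{e.message}"
    end
  end
end

# Usage with the objects from the snippet above (assumes RJB is loaded):
#   with_disposed(pres, presSplit) do |source, target|
#     # ... split slides from source into target ...
#   end
```

Note that disposing releases what the presentations hold, but the JVM may still keep heap it has already reserved, so the memory reported for the container will not necessarily drop.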

Thanks

@albandum

Please share a snapshot showing the memory consumption that is not being freed, a sample project, the source presentation, and your Java and operating system details. I doubt there is any memory leak, but we need the requested information in order to investigate further.

I’m running the script in a Docker container (this image: Docker) with java-11-openjdk-amd64 installed. I had to use Docker to make sure Aspose would work on all systems, as I had issues between my Mac development setup and the live Linux servers.

Here is a link with the script that has extensive logging lines about memory pressure, as well as a video that shows that running the same script several times just adds used memory to the container without ever coming back down.
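For readers who want to reproduce this kind of measurement, here is a minimal sketch of logging the resident memory of the Ruby process on Linux. It is not the script from the shared folder; both helper names are hypothetical, and it assumes a `/proc` filesystem as found in a Linux container.

```ruby
# Extract the resident set size (in MB) from the text of /proc/<pid>/status.
# Returns nil when no VmRSS line is present.
def rss_mb_from_status(status_text)
  kb = status_text[/^VmRSS:\s+(\d+)\s+kB/, 1]
  kb && (kb.to_i / 1024.0).round(1)
end

# Resident memory of the current process (Linux only).
def current_rss_mb
  rss_mb_from_status(File.read('/proc/self/status'))
end

# Usage:
#   puts "RSS before split: #{current_rss_mb} MB"
#   # ... run the split ...
#   puts "RSS after split:  #{current_rss_mb} MB"
```

This measures the whole process, including the embedded JVM, so it shows what the container sees rather than what the Java heap currently uses.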

https://drive.google.com/drive/folders/1_Ftc8zXhmeOsn3vD9eR6IGGYb0usw_wP?usp=sharing

I’m wondering if this might be coming from a double RJB load. In the example you gave me here: I want to split a PPT and merge back in one file · Issue #59 · aspose-free-consulting/projects · GitHub

two lines load jars with RJB and then run initialize_aspose_slides, which contains load_aspose_jars, which looks like this:

  def load_aspose_jars(aspose_jars_dir, jvm_args)
    if aspose_jars_dir && File.exist?(aspose_jars_dir)
      jardir = File.join(aspose_jars_dir, '**', '*.jar')
    else
      jardir = File.join(File.dirname(File.dirname(__FILE__)), 'jars', '**', '*.jar')
    end

I’ve moved things around to avoid loading the jars twice but I’m still getting the memory leak.
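A simple guard is one way to guarantee the jars are only loaded once per process. The sketch below is an illustration only; `AsposeLoader` and `load_once` are hypothetical names, and the actual loading call is left to the block.

```ruby
# Run the loading block at most once per process.
# AsposeLoader is a hypothetical module, not part of RJB or Aspose.
module AsposeLoader
  @loaded = false

  def self.loaded?
    @loaded
  end

  # Returns true when the block was run, false when loading was skipped.
  def self.load_once
    return false if @loaded

    yield
    @loaded = true
  end
end

# Usage (assumes the rjb gem and a classpath built elsewhere):
#   AsposeLoader.load_once { Rjb::load(classpath, jvm_args) }
```

If the block raises, the flag stays unset, so a later call can retry the load.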

@albandum

I have created an issue with ID SLIDESJAVA-38319 in our issue tracking system based on the information you shared. This thread has been linked to the issue so that you will be notified once it is fixed.

Thanks a lot, let me know if you need any extra information. I’ll keep digging on my side, as this is a complete blocker for us.

@albandum

Can you please also provide the source presentation that was used, so that we can investigate the issue further? Please also provide your Java and operating system details, along with details about your environment.

Hi,

This happens with any presentation, not just one. All tested presentations keep adding memory pressure.

Using OpenJDK 11 in a Docker container, installed this way:

FROM ruby:2.7.1

ENV HOME /home
ENV BUNDLER_VERSION=2.0.2
ENV RAILS_ENV development
ENV JAVA_HOME /usr/lib/jvm/java-11-openjdk-amd64

# INSTALL UTILITIES
RUN apt-get update && apt-get install -y curl build-essential libssl-dev libreadline-dev zlib1g-dev libpq-dev software-properties-common openjdk-11-jdk-headless 

Thanks !

@albandum

We will keep you posted on updates and will ask for anything else if required.

@albandum

Can you please provide one presentation for testing? The problem may be related to the content of the presentation(s). Our team needs a presentation with which you have been able to reproduce the issue.

Hi, you can test with this one, for example: https://k4f4w9c2.stackpathcdn.com/wp-content/uploads/01_big_files_kim7/2020_best_ppt/Abstract%20Ink%20Drop%20PowerPoint%20Templates.pptx

On my side I’ve moved the app to Python and there’s no memory leak there. I’m pretty sure it comes from the RJB bridge.
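A possible workaround for staying on Ruby, which nobody in this thread confirmed, is to run each job in a short-lived child process so the operating system reclaims all memory when the child exits. This is a minimal sketch under two assumptions: the platform supports `fork` (Linux, as in the container above), and the JVM is loaded inside the child rather than before forking. `run_in_child` is a hypothetical helper.

```ruby
# Run the block in a forked child process and return its result.
# The result must be serializable with Marshal.
def run_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    result = yield
    writer.write(Marshal.dump(result))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  payload = reader.read # read before waiting so a large result cannot block the child
  reader.close
  _, status = Process.wait2(pid)
  raise "child #{pid} failed: #{status.inspect}" unless status.success?

  Marshal.load(payload)
end

# Usage (load RJB and Aspose inside the block, not in the parent):
#   slide_files = run_in_child do
#     # require 'rjb'; load jars; split ppt_location; return the output paths
#   end
```

The trade-off is that the JVM starts up again for every job, which costs time but keeps the long-running parent process at a stable memory level.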

@albandum

I have added the information you shared to our issue tracking system. Please be patient; we will share feedback with you as soon as the issue is fixed.

The issues you found earlier (filed as SLIDESJAVA-38319) have been fixed in this update.