Reduce memory consumption when document contains many images


#1

Hi,

We are currently trying to export a large number of images into Word documents using Aspose.Words.
The Java API that DocumentBuilder provides for inserting images (insertImage) has many overloaded variants. Which of them is the right choice for keeping memory consumption low?
Note that our application caches images on disk and then passes them to insertImage() as an InputStream. Is this a sound approach?

I also see that there is an option called memoryOptimization (saveOptions.setMemoryOptimization(true)), which apparently reduces memory usage. Does this option help to reduce heap memory consumption?

Please let us know if there are any other ways to optimize memory usage in Aspose.Words-based applications.

Thanks


#2

@behrouz,

If the images are stored on disk, we suggest using the following overload of the insertImage method:
public Shape insertImage(java.lang.String fileName)

Secondly, yes: passing true to SaveOptions.setMemoryOptimization(boolean) can significantly decrease memory consumption while saving large documents, at the cost of a slower save. Please let us know if we can be of any further assistance.


#4

@behrouz,

Please ZIP and attach your input Word document, the output document generated by Aspose.Words and the relevant piece of source code here for testing. We will then investigate the issue on our end and provide you with more information.


#5

@awais.hafeez

Unfortunately, extracting the code from our software is somewhat difficult, but I can give you some information about how we construct a Document with Aspose. Below are the steps in which we create a document from template files, insert images into the document and then save the final document.

// construction section
Document document = new Document(templateWords.getContentAsStream());
DocumentBuilder writer = new DocumentBuilder(document);
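// NOTE: the condition of this ternary was missing from the original post; "macroEnabled" is a placeholder for it
SaveOptions saveOptions = SaveOptions.createSaveOptions(macroEnabled ? SaveFormat.DOCM : SaveFormat.DOCX);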
saveOptions.setMemoryOptimization(true);

// insert image section
Shape shape = writer.insertImage(imagName);
shape.setAllowOverlap(false);
shape.setWrapType(WrapType.INLINE);
boolean wasAspectRatioLocked = shape.getAspectRatioLocked();
shape.setAspectRatioLocked(false);
… set shape size …
shape.setAspectRatioLocked(wasAspectRatioLocked);

// save section
writer.getDocument().save(outputStream, saveOptions);

I still get an OutOfMemoryError after optimizing the document export as you recommended.
I now pass the image file name instead of an InputStream, and I enabled the memory optimization flag at save time.
Here is the stack trace of the error.

scroll-office-export:thread-1
at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
at com.aspose.words.internal.zz76.zzUJ(I)Z (:1202)
at com.aspose.words.internal.zz76.write([BII)V (:371)
at com.aspose.words.internal.zz2F.write([BII)V (:47)
at java.util.zip.DeflaterOutputStream.deflate()V (DeflaterOutputStream.java:253)
at java.util.zip.DeflaterOutputStream.write([BII)V (DeflaterOutputStream.java:211)
at java.util.zip.ZipOutputStream.write([BII)V (ZipOutputStream.java:331)
at com.aspose.words.internal.zzW.zzZ(Lcom/aspose/words/internal/zz74;Ljava/io/OutputStream;)V (:17122)
at com.aspose.words.internal.zz8L.zzY(Ljava/lang/String;Lcom/aspose/words/internal/zz74;)V (:56)
at com.aspose.words.internal.zzJ4.zzP(Lcom/aspose/words/internal/zz74;)V (:78)
at com.aspose.words.zz70.zzZ(Lcom/aspose/words/zzZ0B;)Lcom/aspose/words/SaveOutputParameters; (:84)
at com.aspose.words.zzZH5.zzZ(Lcom/aspose/words/zzZ0B;)Lcom/aspose/words/SaveOutputParameters; (:26)
at com.aspose.words.Document.zzZ(Lcom/aspose/words/zzZ0B;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters; (:1792)
at com.aspose.words.Document.zzZ(Lcom/aspose/words/internal/zz74;Ljava/lang/String;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters; (:965)
at com.aspose.words.Document.zzZ(Lcom/aspose/words/internal/zz74;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters; (:1036)
at com.aspose.words.Document.save(Ljava/io/OutputStream;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters; (:1026)
at com.k15t.scroll.words.exporter.WordsExporter.createOutputArtifact(Ljava/io/OutputStream;)V (WordsExporter.java:567)
at com.k15t.scroll.words.exporter.WordsExporter.save(Ljava/io/OutputStream;)V (WordsExporter.java:504)
at com.k15t.scroll.exporter.pipeline.output.ToWordFileWriter.apply(Lcom/k15t/scroll/exporter/pipeline/context/main/WordConvertExportContext;)Ljava/lang/Object; (ToWordFileWriter.java:31)
at com.k15t.scroll.exporter.pipeline.output.ToWordFileWriter.apply(Ljava/lang/Object;)Ljava/lang/Object; (ToWordFileWriter.java:18)
at java.util.Optional.map(Ljava/util/function/Function;)Ljava/util/Optional; (Optional.java:215)
at com.k15t.scroll.exporter.pipeline.WordPipeline.export(Lcom/k15t/scroll/exporter/pipeline/context/main/MainPipelineInput;Lcom/k15t/scroll/exporter/job/lifecycle/LifecycleService;)V (WordPipeline.java:134)
at com.k15t.scroll.exporter.job.WordPipelineDispatcher.dispatch(Lcom/k15t/scroll/exporter/exchange/AddOn;Lcom/k15t/scroll/exporter/pipeline/context/main/MainPipelineInput;Lcom/k15t/scroll/exporter/job/lifecycle/LifecycleService;)V (WordPipelineDispatcher.java:21)
at com.k15t.scroll.exporter.job.ExportJobRunnable.run()V (ExportJobRunnable.java:92)
at com.k15t.scroll.exporter.job.context.ServerContextService$ServerContextAwareRunnable.run()V (ServerContextService.java:151)
at java.util.concurrent.Executors$RunnableAdapter.call()Ljava/lang/Object; (Executors.java:511)
at java.util.concurrent.FutureTask.run()V (FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V (ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run()V (ThreadPoolExecutor.java:624)
at java.lang.Thread.run()V (Thread.java:748)

I have attached screenshots of the memory used by Aspose before and after enabling the optimization flag for this scenario:
without enabled flag.png (542.9 KB)
with enabled flag.png (590.7 KB)

As you can see from the attachments, memory consumption has not improved considerably.


#6

@behrouz,

Have you tried the latest version of Aspose.Words for Java, i.e. 19.1? If the problem persists, please ZIP and attach your input Word document, the output document generated by Aspose.Words (if any), the image file and the relevant piece of source code here for testing. We will then be able to investigate the issue on our end and provide you with more information. Thanks for your cooperation.


#7

@awais.hafeez

I hope you are well,

I created a simple application to demonstrate our issue concerning memory consumption in Aspose.
Please create at least 1500 copies of the image in the resource directory of the project and set the JVM options to -Xms256m -Xmx2048m. A valid license needs to be added to the code for it to work correctly.

aspose-memory-app.zip (2.2 MB)

Regards,
Behrouz


#8

@behrouz,

We are checking this scenario and will get back to you soon.


#9

@behrouz,

We tested the scenario and managed to reproduce the same problem on our end. We have logged this problem in our issue tracking system so that it can be fixed. The ID of this issue is WORDSJAVA-2067. We will look further into the details of this problem and keep you updated on the status of the fix. We apologize for the inconvenience.


#10

@behrouz,

Please check below the analysis details of WORDSJAVA-2067:

Using com.aspose.words.DocumentBuilder#insertImage(java.io.InputStream) is a fairly good solution. However, com.aspose.words.DocumentBuilder#insertImage(java.lang.String) is slightly faster, and com.aspose.words.DocumentBuilder#insertImage(byte[]) performs almost the same as the latter.

But avoid using com.aspose.words.DocumentBuilder#insertImage(java.awt.image.BufferedImage). This method allocates additional memory for intermediate buffers (Java's BufferedImage doesn't keep any information about the original image file format, so we try to keep image quality at its best at the price of memory consumption).
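
To illustrate the general point about BufferedImage, here is a minimal sketch that uses only the JDK. It does not reproduce anything Aspose.Words does internally; it only shows that a BufferedImage holds decoded pixels (and no trace of the original file format), so it is usually much larger than the encoded file. The class name DecodedVsEncoded and the blank 1000x1000 test image are assumptions made for this example.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of Aspose.Words.
final class DecodedVsEncoded {

    /** Bytes held by the pixel buffer of a decoded image. */
    public static long rasterBytes(BufferedImage image) {
        DataBuffer buffer = image.getRaster().getDataBuffer();
        long bitsPerElement = DataBuffer.getDataTypeSize(buffer.getDataType());
        return (long) buffer.getSize() * buffer.getNumBanks() * bitsPerElement / 8;
    }

    /** Size of the same image once it is encoded as PNG. */
    public static long pngBytes(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(image, "png", out)) {
                throw new IllegalStateException("No PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        // A blank image stands in for a real picture here.
        BufferedImage image = new BufferedImage(1000, 1000, BufferedImage.TYPE_INT_ARGB);
        System.out.println("decoded raster bytes: " + rasterBytes(image));
        System.out.println("PNG-encoded bytes:    " + pngBytes(image));
    }
}
```

For a 1000x1000 ARGB image the decoded raster alone takes 4 bytes per pixel, whatever the size of the file it came from. Passing a file name, a stream or a byte[] avoids holding this decoded copy in your own code.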

As for the MemoryOptimization option (saveOptions.setMemoryOptimization(true)): it saves approximately 20 MB of heap memory, but it doesn't affect image insertion.

Returning to your example: we had 1500 files, each 1 156 447 bytes in size.
Together, the 1500 files consumed at least 1 734 670 500 bytes of heap. Even taking into account the overhead of Java (and Aspose) classes, this should fit into a 2 GB heap (e.g. -Xms256m -Xmx2048m). And it actually did, but we had to adjust the Garbage Collector settings to achieve it.
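
The arithmetic above can be checked with a few lines of Java. This is only a sketch of the estimate; it assumes that each image is kept in the heap at roughly its file size, and the class and method names are made up for this example.

```java
// Hypothetical helper for the back-of-the-envelope estimate above.
final class HeapEstimate {

    /** Total bytes needed to hold count images of bytesPerImage each. */
    public static long totalImageBytes(int count, long bytesPerImage) {
        return Math.multiplyExact((long) count, bytesPerImage);
    }

    /** Converts a heap size given in MiB (as in -Xmx2048m) to bytes. */
    public static long mebibytesToBytes(long mebibytes) {
        return Math.multiplyExact(mebibytes, 1024L * 1024L);
    }

    public static void main(String[] args) {
        long images = totalImageBytes(1500, 1_156_447L);
        long maxHeap = mebibytesToBytes(2048);
        System.out.println("image data: " + images + " bytes");             // 1734670500
        System.out.println("max heap:   " + maxHeap + " bytes");            // 2147483648
        System.out.println("headroom:   " + (maxHeap - images) + " bytes"); // 412813148
    }
}
```

So the raw image data leaves about 394 MiB of the 2048 MiB heap for everything else, which is why the way the collector divides the heap matters in this scenario.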

Thus, the following settings allowed all 1500 images to be inserted with each of the insertImage overloads (except com.aspose.words.DocumentBuilder#insertImage(java.awt.image.BufferedImage); see the caution above):

-XX:NewRatio=3 - for Parallel Garbage Collector
-XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:G1HeapRegionSize=4m - for G1 Garbage Collector
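
To confirm that such flags were actually picked up by the JVM that runs the export, something like the following sketch can be called at startup. It uses only standard java.lang.management APIs; the class name JvmSettingsReport is an assumption for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical diagnostic helper, not part of Aspose.Words.
final class JvmSettingsReport {

    /** Maximum heap the JVM will attempt to use (reflects -Xmx). */
    public static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Names of the garbage collectors active in this JVM. */
    public static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    /** Arguments passed to the JVM itself, such as -Xmx or -XX flags. */
    public static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (bytes): " + maxHeapBytes());
        System.out.println("collectors:       " + collectorNames());
        System.out.println("JVM arguments:    " + jvmArguments());
    }
}
```

If the -XX options do not show up in the JVM arguments, they were probably passed to the wrong process (for example, to a launcher script instead of the application server's JVM).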


In general, high memory consumption is caused (among other things) by the nature of Word documents. Such a document cannot be processed in batch mode, i.e. the whole document, with all of its images, has to be built in memory before it is saved to disk.