Converting from Doc to HTML throws an out-of-memory error


#1

Dear Support,

Converting a Word document to HTML restarts the app server. The log trace is included below for reference.
The issue reproduces on a CentOS machine.
The document is attached: test.docx.zip (40.3 KB)

The code I am using is below:

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
        Document doc = new Document(input);
        for (String keys : searchKeyWords) {
            doc.getRange().replace(Pattern.compile(keys, Pattern.CASE_INSENSITIVE), new MaskResumeDocUtil().new ReplaceEvaluatorFindAndHighlight(), false);
        }
        com.aspose.words.HtmlSaveOptions saveOptions = new com.aspose.words.HtmlSaveOptions(SaveFormat.HTML);
        saveOptions.setEncoding(java.nio.charset.Charset.forName("UTF-8")); // NO I18N
        saveOptions.setExportImagesAsBase64(true);
        doc.save(out, saveOptions);
    } catch (Exception e) {
        out = null;
        LOGGER.log(Level.WARNING, "In MaskResumeDocUtil.getFileStream(). Exception occurred " + e);
    }

The log trace follows.

Stack: [0x00007f6a57dfe000,0x00007f6a57eff000],  sp=0x00007f6a57ef7400,  free space=997k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xa3e6ce]  VMError::report_and_die()+0x14e
V  [libjvm.so+0x4de448]  report_java_out_of_memory(char const*)+0x128
V  [libjvm.so+0xa062b6]  TypeArrayKlass::allocate_common(int, bool, Thread*)+0x376
V  [libjvm.so+0x651b0a]  InterpreterRuntime::newarray(JavaThread*, BasicType, int)+0x2a
j  java.awt.image.DataBufferInt.<init>(I)V+11
j  java.awt.image.Raster.createPackedRaster(III[ILjava/awt/Point;)Ljava/awt/image/WritableRaster;+69
j  java.awt.image.DirectColorModel.createCompatibleWritableRaster(II)Ljava/awt/image/WritableRaster;+109
j  java.awt.image.BufferedImage.<init>(III)V+127
j  com.aspose.words.internal.zzYH.<init>(IIFFIB)V+39
j  com.aspose.words.internal.zzYH.<init>(IIFFI)V+12
j  com.aspose.words.zzZ0A.zzZl0()Lcom/aspose/words/internal/zzYH;+210
j  com.aspose.words.zzZ0A.zzZ(Lcom/aspose/words/internal/zz1;JLcom/aspose/words/internal/zz74;Lcom/aspose/words/ImageSaveOptions;Lcom/aspose/words/IWarningCallback;Lcom/aspose/words/internal/zzOT;)V+325
j  com.aspose.words.NodeRendererBase.zzZ(Lcom/aspose/words/internal/zz74;Lcom/aspose/words/ImageSaveOptions;)V+124
j  com.aspose.words.zzZNU.zzZ(Lcom/aspose/words/ShapeBase;Lcom/aspose/words/ShapeRenderer;IZLcom/aspose/words/zzZ0C;Lcom/aspose/words/zzZNW;)Ljava/lang/String;+221
j  com.aspose.words.zzZNU.zzZ(Lcom/aspose/words/ShapeBase;Lcom/aspose/words/zzZ0C;)Lcom/aspose/words/zzZNV;+50
j  com.aspose.words.zzZJA.zzh(Lcom/aspose/words/ShapeBase;)V+331
j  com.aspose.words.zzZJ9.zzf(Lcom/aspose/words/ShapeBase;)I+369
j  com.aspose.words.zzZJE.visitGroupShapeStart(Lcom/aspose/words/GroupShape;)I+14
j  com.aspose.words.GroupShape.zzZ(Lcom/aspose/words/DocumentVisitor;)I+2
j  com.aspose.words.CompositeNode.acceptCore(Lcom/aspose/words/DocumentVisitor;)Z+2
j  com.aspose.words.GroupShape.accept(Lcom/aspose/words/DocumentVisitor;)Z+2
J 3101 C2 com.aspose.words.CompositeNode.acceptChildren(Lcom/aspose/words/DocumentVisitor;)Z (31 bytes) @ 0x00007f6d0b211cfc [0x00007f6d0b211ca0+0x5c]
j  com.aspose.words.CompositeNode.acceptCore(Lcom/aspose/words/DocumentVisitor;)Z+51
j  com.aspose.words.Paragraph.accept(Lcom/aspose/words/DocumentVisitor;)Z+2
J 3101 C2 com.aspose.words.CompositeNode.acceptChildren(Lcom/aspose/words/DocumentVisitor;)Z (31 bytes) @ 0x00007f6d0b211cfc [0x00007f6d0b211ca0+0x5c]
j  com.aspose.words.CompositeNode.acceptCore(Lcom/aspose/words/DocumentVisitor;)Z+51
j  com.aspose.words.Body.accept(Lcom/aspose/words/DocumentVisitor;)Z+2
j  com.aspose.words.zzZWV.zzZm(Lcom/aspose/words/Node;)V+5
j  com.aspose.words.zzZWV.zzZ(Lcom/aspose/words/Body;)V+6
j  com.aspose.words.zzZWV.zzZR7()V+530
j  com.aspose.words.zzZWV.zzZRi()V+10
j  com.aspose.words.zzZS8.zzZ(Lcom/aspose/words/zzZ0B;)Lcom/aspose/words/SaveOutputParameters;+35
j  com.aspose.words.zzZH5.zzZ(Lcom/aspose/words/zzZ0B;)Lcom/aspose/words/SaveOutputParameters;+31
j  com.aspose.words.Document.zzZ(Lcom/aspose/words/zzZ0B;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters;+726
j  com.aspose.words.Document.zzZ(Lcom/aspose/words/internal/zz74;Ljava/lang/String;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters;+193
j  com.aspose.words.Document.zzZ(Lcom/aspose/words/internal/zz74;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters;+25
j  com.aspose.words.Document.save(Ljava/io/OutputStream;Lcom/aspose/words/SaveOptions;)Lcom/aspose/words/SaveOutputParameters;+11

#2

@ganesh.sv

Could you please try the latest version of Aspose.Words for Java 19.3?

If the problem persists, please share source code that compiles without errors so that we can test it. Thanks for your cooperation.


#3

Dear @tahir.manzoor,

Sorry, I had copied and pasted only part of the code, which is why there was a compilation error.

The issue is still reproducible.

I tested with the following setup:
1. Java 1.8.0_77
2. Tomcat 8.5
3. aspose-words-19.3-jdk17.jar

    public static ByteArrayOutputStream getFileStream(InputStream input, ArrayList<String> searchKeyWords) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            Document doc = new Document(input);
            com.aspose.words.HtmlSaveOptions saveOptions = new com.aspose.words.HtmlSaveOptions(SaveFormat.HTML);
            saveOptions.setEncoding(java.nio.charset.Charset.forName("UTF-8"));
            saveOptions.setExportImagesAsBase64(true);
            doc.save(out, saveOptions);
        } catch (Exception e) {
            out = null;
        }
        return out;
    }

I tested the code from a main method and the conversion succeeds. Running the same code inside the Tomcat container reproduces the issue.
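
Since the standalone run and the Tomcat run behave differently, it may help to compare the heap limits of the two JVMs. The following is a minimal sketch using only `java.lang.Runtime`; the class `HeapReport` and its method names are hypothetical and not part of Aspose.Words or Tomcat. The idea would be to log `summary()` just before the save call in both environments.

```java
// Minimal sketch: report JVM heap figures so two environments can be compared.
// HeapReport is a hypothetical helper, not part of Aspose.Words or Tomcat.
class HeapReport {

    /** Upper bound the heap may grow to (roughly the -Xmx setting), in bytes. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Bytes currently occupied: committed heap minus its free portion. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Approximate room left before the heap limit is reached, in bytes. */
    static long headroomBytes() {
        return maxBytes() - usedBytes();
    }

    /** One-line summary suitable for a log statement. */
    static String summary() {
        long mb = 1024L * 1024L;
        return String.format("heap max=%d MB, used=%d MB, headroom=%d MB",
                maxBytes() / mb, usedBytes() / mb, headroomBytes() / mb);
    }

    public static void main(String[] args) {
        // In the real application this line would sit right before the save call.
        System.out.println(summary());
    }
}
```

If the headroom reported inside Tomcat is much smaller than in the standalone run, other work in the container is probably occupying the heap at the time of the conversion.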

Exception trace:

Caused by: java.lang.OutOfMemoryError: Java heap space
	at java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
	at java.awt.image.Raster.createPackedRaster(Raster.java:467)
	at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
	at java.awt.image.BufferedImage.<init>(BufferedImage.java:333)
	at com.aspose.words.internal.zzYH.<init>(Unknown Source)
	at com.aspose.words.internal.zzYH.<init>(Unknown Source)
	at com.aspose.words.zzZ0A.zzZl0(Unknown Source)
	at com.aspose.words.zzZ0A.zzZ(Unknown Source)
	at com.aspose.words.NodeRendererBase.zzZ(Unknown Source)
	at com.aspose.words.zzZNU.zzZ(Unknown Source)
	at com.aspose.words.zzZNU.zzZ(Unknown Source)
	at com.aspose.words.zzZJA.zzh(Unknown Source)
	at com.aspose.words.zzZJ9.zzf(Unknown Source)
	at com.aspose.words.zzZJE.visitGroupShapeStart(Unknown Source)
	at com.aspose.words.GroupShape.zzZ(Unknown Source)
	at com.aspose.words.CompositeNode.acceptCore(Unknown Source)
	at com.aspose.words.GroupShape.accept(Unknown Source)
	at com.aspose.words.CompositeNode.acceptChildren(Unknown Source)
	at com.aspose.words.CompositeNode.acceptCore(Unknown Source)
	at com.aspose.words.Paragraph.accept(Unknown Source)
	at com.aspose.words.CompositeNode.acceptChildren(Unknown Source)
	at com.aspose.words.CompositeNode.acceptCore(Unknown Source)
	at com.aspose.words.Body.accept(Unknown Source)
	at com.aspose.words.zzZWV.zzZm(Unknown Source)
	at com.aspose.words.zzZWV.zzZ(Unknown Source)
	at com.aspose.words.zzZWV.zzZR7(Unknown Source)
	at com.aspose.words.zzZWV.zzZRi(Unknown Source)
	at com.aspose.words.zzZS8.zzZ(Unknown Source)
	at com.aspose.words.zzZH5.zzZ(Unknown Source)
	at com.aspose.words.Document.zzZ(Unknown Source)
	at com.aspose.words.Document.zzZ(Unknown Source)
	at com.aspose.words.Document.zzZ(Unknown Source)
/-- Encapsulated exception ------------\
java.lang.OutOfMemoryError: Java heap space
	at java.awt.image.DataBufferInt.<init>(DataBufferInt.java:75)
	at java.awt.image.Raster.createPackedRaster(Raster.java:467)
	at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:1032)
	at java.awt.image.BufferedImage.<init>(BufferedImage.java:333)
	at com.aspose.words.internal.zzYH.<init>(Unknown Source)
	at com.aspose.words.internal.zzYH.<init>(Unknown Source)
	at com.aspose.words.zzZ0A.zzZl0(Unknown Source)
	at com.aspose.words.zzZ0A.zzZ(Unknown Source)
	at com.aspose.words.NodeRendererBase.zzZ(Unknown Source)
	at com.aspose.words.zzZNU.zzZ(Unknown Source)
	at com.aspose.words.zzZNU.zzZ(Unknown Source)
	at com.aspose.words.zzZJA.zzh(Unknown Source)
	at com.aspose.words.zzZJ9.zzf(Unknown Source)
	at com.aspose.words.zzZJE.visitGroupShapeStart(Unknown Source)
	at com.aspose.words.GroupShape.zzZ(Unknown Source)
	at com.aspose.words.CompositeNode.acceptCore(Unknown Source)
	at com.aspose.words.GroupShape.accept(Unknown Source)
	at com.aspose.words.CompositeNode.acceptChildren(Unknown Source)
	at com.aspose.words.CompositeNode.acceptCore(Unknown Source)
	at com.aspose.words.Paragraph.accept(Unknown Source)
	at com.aspose.words.CompositeNode.acceptChildren(Unknown Source)
	at com.aspose.words.CompositeNode.acceptCore(Unknown Source)
	at com.aspose.words.Body.accept(Unknown Source)
	at com.aspose.words.zzZWV.zzZm(Unknown Source)
	at com.aspose.words.zzZWV.zzZ(Unknown Source)
	at com.aspose.words.zzZWV.zzZR7(Unknown Source)
	at com.aspose.words.zzZWV.zzZRi(Unknown Source)
	at com.aspose.words.zzZS8.zzZ(Unknown Source)
	at com.aspose.words.zzZH5.zzZ(Unknown Source)
	at com.aspose.words.Document.zzZ(Unknown Source)
	at com.aspose.words.Document.zzZ(Unknown Source)
	at com.aspose.words.Document.zzZ(Unknown Source)
\--------------------------------------/
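
The trace above reports `java.lang.OutOfMemoryError`, which extends `Error` rather than `Exception`, so a `catch (Exception e)` block like the one in `getFileStream` does not intercept it. A minimal sketch of the difference follows. `CatchScopeDemo` and its methods are hypothetical names, and the error is thrown artificially rather than by a real conversion.

```java
// Minimal sketch: catch (Exception) does not intercept OutOfMemoryError.
// All names here are hypothetical; the error is simulated, not a real heap exhaustion.
class CatchScopeDemo {

    /** Mirrors the pattern in the thread: only Exception is caught. */
    static String runCatchingException(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (Exception e) {
            return "caught Exception";
        }
    }

    /** Variant that handles OutOfMemoryError explicitly, before Exception. */
    static String runCatchingOom(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (OutOfMemoryError e) {
            return "caught OutOfMemoryError";
        } catch (Exception e) {
            return "caught Exception";
        }
    }

    /** Stand-in for a conversion that exhausts the heap. */
    static void simulatedOom() {
        throw new OutOfMemoryError("simulated");
    }

    public static void main(String[] args) {
        System.out.println(runCatchingOom(CatchScopeDemo::simulatedOom));
        try {
            runCatchingException(CatchScopeDemo::simulatedOom);
        } catch (OutOfMemoryError e) {
            // The Exception-only handler let the error propagate to the caller.
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```

Whether it is appropriate to recover from an `OutOfMemoryError` at all is a separate design decision; the sketch only shows why the existing handler never sets `out = null` or logs anything for this failure.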

#4

@ganesh.sv

Please note that performance and memory usage depend on the complexity and size of the documents you are processing.

In terms of memory, Aspose.Words does not have any limitations. If you load huge Word documents into the Aspose.Words DOM, more memory is required, because the whole document must be held in memory during processing. Usually, Aspose.Words needs about 10 times the original document size in memory to build the DOM.

We suggest using the SaveOptions.MemoryOptimization property to reduce memory usage. Setting this option to true can significantly decrease memory consumption while saving large documents, at the cost of a slower save. Hope this helps.

com.aspose.words.HtmlSaveOptions saveOptions = new com.aspose.words.HtmlSaveOptions(SaveFormat.HTML);
saveOptions.setExportImagesAsBase64(true);
saveOptions.setMemoryOptimization(true);

Also, please use the latest version, Aspose.Words for Java 19.4.


#5

@tahir.manzoor

The document is only 40 KB, so memory should not be an issue. Below are the native memory stats from when the out-of-memory error occurred. They show that more than 1 GB of memory was still available.
Also, when I removed the image from the document and converted it to HTML, there was no out-of-memory error and the conversion went smoothly.

  •     Native Memory Tracking:
        Total: reserved=3538993KB, committed=2286357KB
    
  •             Java Heap (reserved=1843200KB, committed=1843200KB)
                          (mmap: reserved=1843200KB, committed=1843200KB) 
    
  •                 Class (reserved=1157452KB, committed=118860KB)
                          (classes #15524)
                          (malloc=2380KB #22884) 
                          (mmap: reserved=1155072KB, committed=116480KB) 
    
  •                Thread (reserved=148241KB, committed=148241KB)
                          (thread #138)
                          (stack: reserved=147456KB, committed=147456KB)
                          (malloc=432KB #701) 
                          (arena=353KB #275)
    
  •                  Code (reserved=257671KB, committed=43627KB)
                          (malloc=8071KB #9554) 
                          (mmap: reserved=249600KB, committed=35556KB) 
    
  •                    GC (reserved=31372KB, committed=31372KB)
                          (malloc=24764KB #373) 
                          (mmap: reserved=6608KB, committed=6608KB) 
    
  •              Compiler (reserved=342KB, committed=342KB)
                          (malloc=211KB #434) 
                          (arena=131KB #3)
    
  •              Internal (reserved=7671KB, committed=7671KB)
                          (malloc=7639KB #53531) 
                          (mmap: reserved=32KB, committed=32KB) 
    
  •                Symbol (reserved=26929KB, committed=26929KB)
                          (malloc=22894KB #276880) 
                          (arena=4036KB #1)
    
  • Native Memory Tracking (reserved=5727KB, committed=5727KB)
                          (malloc=22KB #258) 
                          (tracking overhead=5705KB)
    
  •           Arena Chunk (reserved=3239KB, committed=3239KB)
                          (malloc=3239KB) 
    
  •               Unknown (reserved=57148KB, committed=57148KB)
                          (mmap: reserved=57148KB, committed=57148KB) 
    

The suggested code does not resolve the issue either.
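
One thing the traces suggest: the failure is in `BufferedImage.<init>` and `DataBufferInt.<init>`, reached while a group shape is being rendered (`visitGroupShapeStart`, `ShapeRenderer`). `DataBufferInt` allocates one 4-byte `int` per pixel on the Java heap, so the allocation scales with the pixel dimensions of the rendered image rather than with the 40 KB file size, and free native memory outside the Java heap does not count toward it. A rough sketch of that arithmetic, where `RasterEstimate` is a hypothetical helper and the 20000 x 20000 size is an invented example, not a measurement from test.docx:

```java
// Minimal sketch: estimate the heap needed for an int-packed raster.
// RasterEstimate is hypothetical; the dimensions in main are illustrative only.
class RasterEstimate {

    /** Bytes for the int[] backing an int-packed image: one 4-byte int per pixel. */
    static long packedIntRasterBytes(int width, int height) {
        return (long) width * (long) height * Integer.BYTES;
    }

    /** True if the raster would fit into the heap room currently left in this JVM. */
    static boolean fitsInHeadroom(int width, int height) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        long headroom = rt.maxMemory() - used;
        return packedIntRasterBytes(width, height) <= headroom;
    }

    public static void main(String[] args) {
        int width = 20000;   // assumed pixel width, not taken from the document
        int height = 20000;  // assumed pixel height, not taken from the document
        System.out.println("bytes needed: " + packedIntRasterBytes(width, height));
        System.out.println("fits in current headroom: " + fitsInHeadroom(width, height));
    }
}
```

With these assumed dimensions the raster alone needs 1,600,000,000 bytes (about 1.5 GiB), which is close to the whole 1,843,200 KB heap shown in the stats above. This would also be consistent with the conversion succeeding once the image is removed.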


#6

@ganesh.sv

We tested the same scenario again with the document shared in this thread and your code on Tomcat 8.5 and Java 8, using Aspose.Words for Java 19.4. We did not encounter any exception or issue.

Please make sure that you are using the same code and document at your end. We suggest restarting the system and the Tomcat server and testing the scenario again. Hope this helps.

If the problem persists, please share the following details for further testing.

  1. Using the latest version of Aspose.Words for Java 19.4, create a sample web application and share the WAR file.
  2. Share the details of the operating system on which you are facing this issue.