Speed-Up suggestions

Hello,

While merging and creating a lot of Document objects, we observed that the default constructor takes a significant amount of time to complete. After some debugging, the issue seems to be related to jar signing in conjunction with access to “resources/Blank.doc”. After caching its content in memory, construction became roughly twice as fast (20000 iterations dropped from 15500 to 6800 milliseconds). Please see the following code.

long t = System.currentTimeMillis();
for (int i = 0; i < 20000; i++)
{
    Document doc = new Document();
}
System.out.println(System.currentTimeMillis() - t);

InputStream resourceAsStream = getClass().getResourceAsStream("/resources/Blank.doc");
ByteArrayOutputStream bout = new ByteArrayOutputStream();
IOUtils.copy(resourceAsStream, bout);
resourceAsStream.close();
bout.close();
byte[] byteArray = bout.toByteArray();

t = System.currentTimeMillis();
for (int i = 0; i < 20000; i++)
{
    Document doc = new Document(new ByteArrayInputStream(byteArray));
}
System.out.println(System.currentTimeMillis() - t);

Maybe this code is useful and could make it into a future release.

Best regards
Klemens Schrage
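
The caching step above depends on IOUtils from Apache Commons IO. A minimal sketch of the same read-once step using only the JDK might look like the following. It assumes Java 7 or later, and the names ResourceBytes and readFully are hypothetical, not part of Aspose.Words or Commons IO. Unlike a silent fallback, it fails loudly when the resource is missing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ResourceBytes {
    /** Reads the stream to the end and closes it; rejects a missing (null) resource. */
    static byte[] readFully(InputStream in) throws IOException {
        if (in == null) {
            // getResourceAsStream returns null when the resource does not exist
            throw new IOException("resource not found");
        }
        try (InputStream source = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in input; in the thread's scenario the stream would come from
        // getResourceAsStream("/resources/Blank.doc").
        byte[] cached = readFully(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        System.out.println(cached.length); // prints 3
    }
}
```

The returned array can then be wrapped in a new ByteArrayInputStream for every Document, as in the second loop above.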

Hi Klemens,

Thanks for your suggestion. Please note that Aspose.Words for Java takes extra time when loading for the very first time. Please execute the following code snippet at your end and see the difference. Please let us know if you have any more queries.

for (int i = 0; i < 100; i++)
{
    long t = System.currentTimeMillis();
    Document doc = new Document();
    System.out.print(System.currentTimeMillis() - t);
    System.out.print(",");
}
System.out.println("==========================");
InputStream resourceAsStream = getClass().getResourceAsStream("/resources/Blank.doc");
ByteArrayOutputStream bout = new ByteArrayOutputStream();
IOUtils.copy(resourceAsStream, bout);
resourceAsStream.close();
bout.close();
byte[] byteArray = bout.toByteArray();
for (int i = 0; i < 100; i++)
{
    long t = System.currentTimeMillis();
    Document doc = new Document(new ByteArrayInputStream(byteArray));
    System.out.print(System.currentTimeMillis() - t);
    System.out.print(",");
}

Hi,

You are right about this. On my machine the first “new Document()” took 312 ms. Nevertheless, after pulling the first call out of the first loop, the second loop is still about twice as fast as the first. I think jar verification and related checks are a possible bottleneck in this situation.

Best regards
Klemens Schrage
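
The warm-up effect discussed above (a slow first call, then steady-state timings) can be kept out of the measurement with a small timing helper. This is only a sketch, assuming Java 8 or later for the lambda. WarmedUpTimer and timeMillis are hypothetical names, and the StringBuilder workload is a stand-in for the document constructor.

```java
import java.util.concurrent.Callable;

final class WarmedUpTimer {
    /** Runs the task `warmup` times untimed, then returns elapsed ms for `iterations` timed runs. */
    static long timeMillis(Callable<?> task, int warmup, int iterations) throws Exception {
        for (int i = 0; i < warmup; i++) {
            task.call(); // untimed: absorbs class loading and other one-off costs
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.call();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; with Aspose.Words it would be something like () -> new Document().
        Callable<Object> work = () -> new StringBuilder("blank").reverse().toString();
        System.out.println(timeMillis(work, 1, 20000) + " ms");
    }
}
```

Running both variants (default constructor and cached bytes) through the same helper, several times each, gives numbers that are easier to compare than a single pass.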

Hi Klemens,

Please execute the following code snippet multiple times at your end and compare the results. The time to load the document is approximately the same in both loops. Please also decrease the loop count and check again; the loading times should still be approximately the same.

InputStream resourceAsStream = getClass().getResourceAsStream("/resources/Blank.doc");
ByteArrayOutputStream bout = new ByteArrayOutputStream();
IOUtils.copy(resourceAsStream, bout);
resourceAsStream.close();
bout.close();
byte[] byteArray = bout.toByteArray();
Document doc1 = new Document(new ByteArrayInputStream(byteArray));
long t3 = System.currentTimeMillis();
for (int i = 0; i < 1000; i++)
{
    Document doc11 = new Document(new ByteArrayInputStream(byteArray));
}
System.out.println(System.currentTimeMillis() - t3);
Document doc2 = new Document();
long t = System.currentTimeMillis();
for (int i = 0; i < 1000; i++)
{
    Document doc22 = new Document();
}
System.out.println(System.currentTimeMillis() - t);

Hi Klemens,

Thanks for your inquiry.

Just to let you know: if you are working with the same document many times, it is faster to take the original document and use Clone (deepClone in Java) to create another copy than to reload it through the constructor.

Thanks,

Hi Adam,

Thanks for your suggestion. Indeed, that’s the fastest solution by far. I can integrate that, but wouldn’t it be possible to embed this functionality within the Document class on your side? I think it can be assumed that resources/Blank.doc will never change during runtime.

Best regards
Klemens Schrage

Hi Klemens,

Thanks for your inquiry. The deepClone method serves as a copy constructor for nodes. The cloned node has no parent, but belongs to the same document as the original node. This method always performs a deep copy of the node. In your case, Adam’s suggestion fulfills your requirements. Please use the Node.deepClone method to get a new empty document, as shown in the following code snippet. I hope this answers your query. Please let us know if you have any more queries.

InputStream resourceAsStream = getClass().getResourceAsStream("/resources/Blank.doc");
ByteArrayOutputStream bout = new ByteArrayOutputStream();
IOUtils.copy(resourceAsStream, bout);
resourceAsStream.close();
bout.close();
byte[] byteArray = bout.toByteArray();
Document doc1 = new Document(new ByteArrayInputStream(byteArray));
long t3 = System.currentTimeMillis();
for (int i = 0; i < 20000; i++)
{
    Document doc1Clone = (Document) doc1.deepClone(true);
}
System.out.println(System.currentTimeMillis() - t3);
Document doc2 = new Document();
long t = System.currentTimeMillis();
for (int i = 0; i < 20000; i++)
{
    Document doc2Clone = (Document) doc2.deepClone(true);
}
System.out.println(System.currentTimeMillis() - t);
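
The load-once, clone-many idea behind this snippet can be wrapped in a small factory so that callers never touch the constructor directly. The following is a rough sketch under stated assumptions, not something Aspose.Words provides: PrototypeFactory is a hypothetical class, it assumes Java 8 or later, and StringBuilder stands in for the blank document so the example runs without the library.

```java
import java.util.concurrent.Callable;
import java.util.function.UnaryOperator;

final class PrototypeFactory<T> {
    private final Callable<T> loader;      // expensive: creates the original instance
    private final UnaryOperator<T> cloner; // cheap: copies the original instance
    private T prototype;                   // loaded on first use, then only cloned

    PrototypeFactory(Callable<T> loader, UnaryOperator<T> cloner) {
        this.loader = loader;
        this.cloner = cloner;
    }

    /** Returns a fresh copy; the loader runs only on the first call. */
    synchronized T create() throws Exception {
        if (prototype == null) {
            prototype = loader.call();
        }
        return cloner.apply(prototype);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in type: StringBuilder plays the role of the blank document.
        PrototypeFactory<StringBuilder> factory = new PrototypeFactory<StringBuilder>(
                () -> new StringBuilder("blank"),
                original -> new StringBuilder(original));
        StringBuilder first = factory.create();
        StringBuilder second = factory.create();
        first.append("-edited");
        System.out.println(second); // prints blank
    }
}
```

With Aspose.Words, the loader could presumably be () -> new Document() and the cloner d -> (Document) d.deepClone(true), matching the calls used in this thread; that wiring is untested here.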

For your information, a variant of this problem occurs when Aspose is used in conjunction with JRebel:

for (int idx = 0; idx < 200; idx++) new DocumentBuilder();

This normally takes around 100 ms. With JRebel, it takes 10 s!

The previous workaround seems to work:

private static byte[] getBlankDoc()
{
    // Read the blank template bundled with Aspose.Words once; both streams are closed automatically.
    try (InputStream inputStream = Document.class.getResourceAsStream("/resources/Blank.doc");
         ByteArrayOutputStream bout = new ByteArrayOutputStream())
    {
        IOUtils.copy(inputStream, bout);
        return bout.toByteArray();
    }
    catch (Exception e)
    {
        // Falls back to an empty array if the resource is missing or unreadable.
        return new byte[0];
    }
}

static byte[] byteArrayBlankDoc = getBlankDoc();

Document docBlank = new Document(new ByteArrayInputStream(byteArrayBlankDoc));
DocumentBuilder unAmendementBuilder = new DocumentBuilder(docBlank);

We are not satisfied with this workaround and prefer to do without JRebel (for the moment).

Hi Gilles,

Thanks for your inquiry. The deepClone method serves as a copy constructor for nodes. This is not a workaround. Please check my reply to Mr. Klemens at this link.

Could you please share some more detail about your query related to JRebel, along with code? I will investigate the issue on my side and provide you more information. Are you using the same code shared at the following link:
https://forum.aspose.com/t/57896

The issue with JRebel also requires the use of Tomcat.

With Tomcat 7.0.27 or earlier, creating 200 Documents (new Document();) takes 100 ms without JRebel, and 10 s with JRebel.

JRebel has just found the cause of the bug and it is now fixed:

"I made a fix in the nightly build so that JRebel would rely more on the server's own caches for static jar resources. Could you try the nightly build:
http://zeroturnaround.com/software/jrebel/early-access/
The performance penalty should be much smaller for 7.0.27 and older."

With Tomcat 7.0.28 or later, the bug is present even without JRebel: from Tomcat 7.0.28 onward, “new Document();” takes about 100 times longer (50 ms vs. 0.5 ms).

Currently we are using Tomcat 7.0.27 and do not intend to upgrade in the near future, so this is not urgent for us. However, could you fix this issue in a future release?

Hi Gilles,

Thanks for your inquiry. I will investigate the issue on my side and provide you more information.

Hi Gilles,

Thanks for your patience. I have tested the scenario with the deepClone method in the following environment and have not found any issue.

  • Windows 7 64 bit
  • JDK 1.7
  • apache-tomcat version: 7.0.41

In your case, I suggest you use the deepClone method as shared in my previous post. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.