I am evaluating Aspose.Cells for Java, and the program I have developed is working fine. There is one issue: I am creating a ~26000-row, 22-column spreadsheet with various formatting (font color), and many of the cells have comments written to them.
The writing of the comments seems to be very slow. The program runs in 1 hour, 8 minutes. If I simply comment out the line that writes the comment to the cell (the comment.setNote call), the program runs in 12 seconds.
Are there plans to make this more efficient? Or is there something that can be done to speed up comment writes? I am writing to the spreadsheet, as suggested in the documentation, row-by-row for performance optimization.
When adding
a Comment to a cell, it is needed to check whether there has been an
existing Comment for this cell. When there are a large amount of
Comments, this check will cost some time. Please try the attached version. Now we have added
an overload method for user to avoid such check:
Comment
com.aspose.cells.Shapes.addComment(int row, int column, boolean checkExisted,
byte placementType)
Parameters:
row
row index
column
column index
checkExisted whether to check existing Comments to check whether there has
been an existing Comment object with same row and column. If there are large
amount of Comments and user can be sure that there is no existing Comment at
given row and column, disable this flag can improve performance to certain extent.
placementType the placement type of Comment. Setting it to PlacementType.MOVE will get best performance.
Returns:
Comment Object.
For
your situation, I think you can invoke this method like following to improve
the performance:
I installed the modified version and changed the addComment call as you suggessted.
It did run faster, technically. The run time went from 1 hour, 8 minutes down to 1 hour, 2 minutes. Commenting out the code that writes the comments to the sheet, the code runs in 12 - 13 seconds.
The part of my code that writes the comments is given here:
if (CommentData[row][column] != null) { Comment comment = worksheet.getShapes().addComment(row,column,false,PlacementType.MOVE); Font font = comment.getFont(); font.setSize(10); comment.setNote(CommentData[row][column]); comment.setWidth(CommentWidth[row][column] * 7); comment.setHeight(CommentHeight[row][column] * 17); comment.setFont(font); }
where CommentData[row][column] is processed and assigned elsewhere in the code. This whole block is essentially in a nested for…next loop such that it is processed row-by-row and not column-by-column.
I would normally figure I that wrote inefficient code, but, as I stated, if I comment out the section of code listed above (comments get processed and assigned but not written to the worksheet), the code runs in 12 - 13 seconds as opposed to over an hour otherwise. It appears as though writing the comments is the bottleneck.
Any other ideas? Is there a more efficient way to code the comment writes than that which I have listed above?
One thing I forgot to mention. With the updated version, I can open the resulting .xls file in Excel just fine, but Excel crashes when I exit (this did not happen before), and I have to go into the Task Manager and kill the Excel.exe process.
We have improved the performance of searching existing comments, so, we have
removed the "checkExisted" parameter in v2.1.1.22 and the method now has been
changed to:
Comment
com.aspose.cells.Shapes.addComment(int row, int column, byte placementType)
Please try the new fix (attached). For the generated file, if it still causes Excel crashed, I am afraid, we
need your sample project to trace the issue because by our test the
generated file can be opened and closed by Excel fine.
Here is my test code and the resultant file is also attached.
Once again, thank you very much for your quick reply.
I loaded in the v.2.1.1.23 version, re-compiled, and re-ran my program. It still runs in 1 hour, 2 minutes (no change from v2.1.1.22), but the Excel crashing problem has disappeared.
It will be much better for us to trace the issue if you can provide us your template file and code. Before that, would you please check whether you have given enough memory to JVM to execute your program? When program runs to the memory limit, most of time will be used for GC and application's progress will become much slower. By our test, to create 20000x22 Comments, it needs at least 950M memory. When give less memory to JVM, the progress will be very slow and finally maybe cause OutOfMemory error.
Hi Sean,<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />
Thank you for your test code and template files.
After check your code, we found the performance issue has nothing to do with creating Comments but is caused due to Worksheet.autoFitColumn() operation. When the cells set is large, auto-fit operation needs much time. For your situation, because there are few custom styles and data types, we think you can use an alternative and simpler procedure to process column width if auto-fit operation is really needed. Please see the attached code that has been modified based on your original code. Please look into the codes with heading comment like "/*******************************ASPOSE: "
In the code we get the maximum text of cell value (simply by the text length) in every column and take it's width as the auto-fit width of this column. The column width got by this way may be not so accurate because text width is not determined by its length only, but it does save much more time when removing many other complex operations such as Alignment, rotation, row height for every cell/value in the large cells set.
I’ll look into it, but what you are saying doesn’t make sense. As I said above, if I comment out the section that writes the comments to the cells, leaving in the column auto-fits, the 2-hour runtime goes down to 12 - 13 seconds. That indicates that the bottleneck is with the comments, not the auto-fits.
Please try that with the code I gave you – comment out the if block where the comments are written and re-compile/re-run the program, and you will see the same results.
I will try as you suggested, but our eval license ran out last Friday. Can it be extended?
Wow! You’re right! I commented out the for loop that auto-fit the colums and left in the comment writes, and it ran in 15 seconds! Amazing!
This must mean that auto-fitting a column must go through and process comments since, if I leave out the comment writes, but there is just as much cell data, the program runs in 12 - 13 seconds.
Is there a way the auto-fit can bypass checking comments in the columns?
Wow! You’re right! I commented out the for loop that auto-fit the
colums and left in the comment writes, and it ran in 15 seconds!
Amazing!
This
must mean that auto-fitting a column must go through and process
comments since, if I leave out the comment writes, but there is just as
much cell data, the program runs in 12 - 13 seconds.
Is there a way the auto-fit can bypass checking comments in the columns?
Thank you very much! This is great!
One last thing. Excel is once again crashing upon closing of the
generated .xls file with the latest version of Aspose.Cells you sent
(2.1.1.26).