Read/Write huge excel files(1M rows)

hi,

thanks for the prompt reply
I explored the sample code for aspose.cells memory preference, there it was written that "This setting cannot take effect for the existing worksheets that are created before using the below line of code"

Hence it seems that we cant use this approach for existing spreadsheets but in our usecase we have existing spreadsheets which are read and write.

Please suggest if there is any work around to it.


Hi,


Well, you did not check the Note: (comments) below this in the code segment pasted in the document. That means by default, memory preference option won’t not set for existing sheet even you use the line of code:
i.e.,
workbook.getSettings().setMemorySetting(MemorySetting.MEMORY_PREFERENCE);

So, you would need to specify the MemoryPreference option for the invidual cells of the sheet(s), see the note (I copy/paste comments and some lines of code from the original example code which sets MemoryPreference option fine for the first worksheet cells):
//To change the memory setting of existing sheets, please change memory setting for them manually:
Cells cells = wb.getWorksheets().get(0).getCells();
cells.setMemorySetting(MemorySetting.MEMORY_PREFERENCE);

Let us know if you still have any issue or confusion.

Thank you.

Hi,


Adding more to my previous reply. Please note the following for your requirements:


1) Because the memory issue may be caused when loading the template file into memory completely, so user might have no chance to use Cells.setMemorySettings() for the large workbook. Instead, kindly use LoadOptions.setMemorySettings() to load/build the workbook in MEMORY_PREFERENCE mode. When the workbook was loaded in MEMORY_PREFERENCE mode, then all its worksheets will use this mode as the default to build cells model. You should try this way firstly to test whether it can solve your memory issue.


2) If MEMORY_PREFERENCE mode cannot solve the issue, you may consider the usage of LightCells. However, I think it is very complicated, and even impossible for users to implement all the logic by themselves if there are other things which need to be changed when inserting rows, such as formulas, named ranges, …etc. To achieve the goal of reducing memory cost for large workbook in template file, you need to combine the usage of LightCellsDataHandler and LightCellsDataProvider. That is, you should load the template file in light cells mode too. your implementation of LightCellsDataProvider will use cells data provided by LightCellsDataHandler to fill the workbook accordingly. It needs two threads to work together, like pipe stream, the loading process provides data and the saving process consumes data. And because you need to insert rows, then he should handle the row index modification too in LightCellsDataProvider, such as, if you insert one row before the third row (A3, B3, the row index is 2), for all row/cells got from LightCellsDataHandler, when their row index is greater than 2, you should fill them into the next row in LightCellsDataProvider(row index 2 becomes 3, 3 becomes 4, …).


Hope, this helps as well.


Thank you.

hi,

Thanx for your prompt replies.
We are now implementing aspose at our end and will be purchasing the library soon.
There is just one issue

While loading workbook object we need to give absolute path of file in string. Is there a way to provide a stream of file path while loading workbook object?

Hi,


Good to know that it’s working on your end now.

Regarding your new issue, well, Aspose.Cells also supports to load/read Excel file from streams, you may pick relevant overload for your requirements, see the document for your reference:
http://www.aspose.com/docs/display/cellsnet/Opening+Files

Thank you.

hi,

Thank you for reply. Streaming is working fine now.
There is just one error :
When I load a workbook object an error appears on our server by aspose api.
This doesnot affect the functioning of the API but still it needs to be removed to make it bug free.

I am mentioning the error below:

ERROR [Aj] com/aspose/cells/a/f/zb
INFO | jvm 1 | 2015/12/11 12:17:04 | java.lang.StringIndexOutOfBoundsException: String index out of range: -1
INFO | jvm 1 | 2015/12/11 12:17:04 | at java.lang.String.substring(String.java:1955) ~[na:1.8.0_45]
INFO | jvm 1 | 2015/12/11 12:17:04 | at org.aspectj.weaver.bcel.BcelObjectType.getOuterClass(BcelObjectType.java:904) ~[aspectjweaver-1.8.6.jar:1.8.6]

Hi,


Good to know that it is working better now.
Regarding your recent error you mentioned, could you try our latest version/fix: Aspose.Cells for Java v8.6.2.4 if it makes any difference.

If you still find the issue, kindly do provide sample JAVA console program (runnable) to reproduce the issue on our end, we will check it soon.

Thank you.

hi,

I am not able to deploy your latest jar. Also its only 6 mb as compared to 20 mb for 8.6.0 version.

Also I am using java 1.8 version 45. So can the error be related to that?

I have now checked for the latest jar you mentioned but I am still getting an error.

can this be a problem of java version??

Hi,


Well, this does not seem to be a problem with JDK version (e.g v1.8), it should work fine. As requested earlier, kindly do provide a sample JAVA console program (runnable) to reproduce the issue on our end, we will check it soon.

Thank you.

hi,

Thank you for the reply
We are trying to reproduce it in a separate sample project but meanwhile could tell is a way to suppress the logs of aspose jar??
Do you provide any properties file for the same?

Hi,


I am afraid, as we have no clue about your issue so far, so we cannot provide any means to suppress the logs you are facing. In some cases, we do provide debug versions for the user to trace the issue but your scenario seems to me a different one. As requested earlier, kindly do provide sample JAVA console program (runnable) to reproduce the issue on our end, we will check it soon.

Thank you.

hi,

This error is occurring when we load workbook object within our product flow exception and is not reproducible in a sample standalone. We have ensured that we are passing the right parameters (path etc.), however it is giving following error in logs.

Although the functionlaity is working fine. There are two options you can help us with here:

  1. In class “com/aspose/cells/a/f/zb” during initialization/loading of workbook object, what could be the reason behind “StringIndexOutOfBoundsException”. Is there any missing/wrong parameter?

  2. Suppress the logs error in following class com/aspose/cells/a/f/zb using log4j.properties file( or relevant log file)

jvm 1 | ERROR [Aj] com/aspose/cells/a/f/zb

jvm 1 | java.lang.StringIndexOutOfBoundsException: String index out of range: -1

jvm 1 | at java.lang.String.substring(String.java:1955) ~[na:1.8.0_45]

jvm 1 | at org.aspectj.weaver.bcel.BcelObjectType.getOuterClass(BcelObjectType.java:904) ~[a

spectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.AbstractReferenceTypeDelegate.getFormalTypeParametersFromOuter

Class(AbstractReferenceTypeDelegate.java:108) ~[aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.bcel.BcelObjectType.ensureGenericSignatureUnpacked(BcelObjectT

ype.java:772) ~[aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.bcel.BcelObjectType.getDeclaredInterfaces(BcelObjectType.java:

  1. ~[aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.ReferenceType.getDeclaredInterfaces(ReferenceType.java:707) ~[

aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.ResolvedType.getDirectSupertypes(ResolvedType.java:82) ~[aspec

tjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.patterns.TypePattern.matchesSubtypes(TypePattern.java:178) ~[a

spectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.patterns.TypePattern.matchesStatically(TypePattern.java:130) ~

[aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.loadtime.ClassLoaderWeavingAdaptor.accept(ClassLoaderWeavingAd

aptor.java:861) ~[aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.tools.WeavingAdaptor.weaveClass(WeavingAdaptor.java:337) ~[asp

ectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at org.aspectj.weaver.loadtime.Aj.preProcess(Aj.java:121) ~[aspectjweaver-1.8.6.jar:

1.8.6]

jvm 1 | at org.aspectj.weaver.loadtime.ClassPreProcessorAgentAdapter.transform(ClassPreProce

ssorAgentAdapter.java:54) [aspectjweaver-1.8.6.jar:1.8.6]

jvm 1 | at sun.instrument.TransformerManager.transform(TransformerManager.java:188) [na:1.8.

0_45]

jvm 1 | at sun.instrument.InstrumentationImpl.transform(InstrumentationImpl.java:428) [na:1.

8.0_45]

jvm 1 | at java.lang.ClassLoader.defineClass1(Native Method) [na:1.8.0_45]

jvm 1 | at java.lang.ClassLoader.defineClass(ClassLoader.java:760) [na:1.8.0_45]

jvm 1 | at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) [na:1.8.0

_45]

jvm 1 | at java.net.URLClassLoader.defineClass(URLClassLoader.java:467) [na:1.8.0_45]

jvm 1 | at java.net.URLClassLoader.access$100(URLClassLoader.java:73) [na:1.8.0_45]

jvm 1 | at java.net.URLClassLoader$1.run(URLClassLoader.java:368) [na:1.8.0_45]

jvm 1 | at java.net.URLClassLoader$1.run(URLClassLoader.java:362) [na:1.8.0_45]

jvm 1 | at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_45]

jvm 1 | at java.net.URLClassLoader.findClass(URLClassLoader.java:361) [na:1.8.0_45]

jvm 1 | at com.pb.spectrum.platform.server.common.deploy.car.CarClassLoader.loadClass(CarCla

ssLoader.java:84) [spectrum-server-common-deploy-car-0-SNAPSHOT.jar:0-SNAPSHOT]

jvm 1 | at java.lang.ClassLoader.loadClass(ClassLoader.java:357) [na:1.8.0_45]

jvm 1 | at com.aspose.cells.a.f.za.b(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.aspose.cells.a.f.zi.iterator(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.aspose.cells.zajx.e(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.aspose.cells.zajx.a(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.aspose.cells.zajw.a(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.aspose.cells.Workbook.a(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.aspose.cells.Workbook.(Unknown Source) [aspose-cells-8.6.0.jar:8.6.0.0]

jvm 1 | at com.pb.spectrum.edi.spreadsheet.impl.SpreadsheetWorkbookAsposeFactory.createWorkb

ook(SpreadsheetWorkbookAsposeFactory.java:42) [edi-0-SNAPSHOT.jar:0-SNAPSHOT]

jvm 1 | at com.pb.spectrum.edi.spreadsheet.impl.SpreadsheetDataSourceRuntimeManagerAsposeImp

l.getWorkbookInfo(SpreadsheetDataSourceRuntimeManagerAsposeImpl.java:158) [edi-0-SNAPSHOT.jar:0-SNAP

SHOT]

We have a critical delivery next week and looking for some quick solution. If required, let us discuss this on call.

Hi,


Thanks for providing us further details.

Well, from the stack trace of the exception you pasted, it seems the issue occurred when reading a zip file (e.g XLSX etc.). So, could you give us your template file so we can test and investigate whether we can find the issue on our end or not.

Thank you.


hi,

Thanks for the lead in this issue.
I have attached the file which we are using.

Hi,


I have tested your issue with your template file using following sample code with our latest version i.e. Aspose.Cells for Java v8.6.3 (you may get it from Downloads module or maven rep.), it works fine and I do not get any exception or stack/trace (logs).
e.g
Sample code:

String filePath = “test33.xlsx”;

Workbook wb = new Workbook(filePath);
wb.save(filePath + “.out1.xlsx”);


Since you are using older version v8.6.0, so we highly recommend you to kindly try our latest (official) Aspose.Cells for Java v8.6.3 if it makes any difference.

Thank you.

hi,

We have tried using the latest version but the issue still persists. We are using stream instead of path while loading the workbook object.

Also we have purchased the library with 1 year support.

So could you please suppress the logs in aspose jar in that particular class and provide us with that jar.

Hi,


Well, I am afraid, we are not able to reproduce the issue (using your provided file) on our end, so we cannot provide much help to sort your issue out. Neither we provide any means (via API attributes) to suppress the logs you are facing as we do not face such errors.

We appreciate if you could provide us a separate executable console application (JAVA program) to reproduce the issue on our end, so we could log it into our database and figure it out as soon as possible.

Thank you.

hi,

Thanx for reply.
We are looking into our issue meanwhile could you guide us in following issue

We are using inputstream to load workbook oject but if inputstream is empty then aspose freezes for 100 sec and our system hangs.

Could you provide a way so that it throws an exception is inputstream is empty.

Hi,


Well, if inputstream is null, Aspose.Cells would throw NullPointerException. Also, I have tested your scenario using v8.6.3.x with an empty stream, it works efficiently and fine. I think you may try to use FileInputStream.available() method to check if the stream is empty or not, so you should not load empty stream into Workbook object. See the sample code below for your reference:
e.g
Sample code:

FileInputStream fis = new FileInputStream(new File(“abc.txt”));
int empty =fis.available();
if(empty==0)
{
System.out.println("!!File is empty!!");
}

Workbook workbook = new Workbook(fis); //It also works instantly.

System.out.println(“Done!”);

Thank you.