We're sorry Aspose doesn't work properply without JavaScript enabled.

Free Support Forum - aspose.com

Read only worksheet names from a sample on network

Hi,


I am trying to read the sheet names from an InputStream to an xlsx file on Dropbox using the following code

<span style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre; background-color: rgb(255, 255, 255);”>LoadOptions opts = <span class=“kwrd” style=“color: rgb(0, 0, 255); font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre;”>new<span style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre; background-color: rgb(255, 255, 255);”> LoadOptions();<br style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre;”><span style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre; background-color: rgb(255, 255, 255);”>opts.setLoadDataFilterOptions(LoadDataFilterOptions.NONE);<br style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre;”><span style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre; background-color: rgb(255, 255, 255);”>Workbook book = <span class=“kwrd” style=“color: rgb(0, 0, 255); font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre;”>new<span style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre; background-color: rgb(255, 255, 255);”> Workbook(inputStream, opts);
<span style=“font-family: “Courier New”, Consolas, Courier, monospace; font-size: small; white-space: pre; background-color: rgb(255, 255, 255);”>
This takes around 7 seconds for a 300 kb file but more than 2 minutes for a 20 MB file. It looks like it reads the entire file data before loading the sheet names.

Is there a way to do this for large files over the network so that the time to read the sheet names does not increase proportionally with file size?

Thanks,
Piyush

Hi Piyush,


Thank you for contacting Aspose support.

I believe, the mentioned processing time (7 seconds for 300 KB sample & 2 minutes for 20 MB sample) is for the complete process, that is; download the sample from network location, read the sample in an instance of InputStream, and then finally load the InputStream to an instance of Workbook. Please note, only the last portion of the process is related to Aspose.Cells APIs whereas the first 2 parts comprise your own custom logic. Moreover, the time to complete the 1st part of the process mainly depends on the file size and the strength of your internet connection. I am afraid, Aspose.Cells API cannot speed up the downloading process, however, if you are facing the performance lag while loading the InputStream to an instance of Workbook then please share the sample along with the elapsed time to load the sample in Aspose.Cells object model.

Thanks Babar for the quick response. Here is the code that takes an input stream returned by an httpClient and reads the sheet names.


AuthorizationCodeOAuth2Authenticator auth = AuthorizationCodeOAuth2Authenticator.getInstance();
DropboxFileTransport transport = new DropboxFileTransport(auth);

Map<String,Object> headers = new HashMap<>();
headers.put(“Dropbox-API-Arg”, “{“path”: “” + “/20MB.xlsx” + “”}”);
long start = System.currentTimeMillis();
LOGGER.debug(“Loading metadata for xls”);
InputStream is = transport.contentClient.download(“files/download”, null, headers, null);
long end = System.currentTimeMillis();
System.out.println("Download api call - " + (end-start)/1000 + “seconds”);

LoadOptions loadNoData = new LoadOptions();
loadNoData.setLoadFilter(new LoadFilter(LoadDataFilterOptions.NONE));
Workbook workbook = new Workbook(is, loadNoData);
end = System.currentTimeMillis();
System.out.println(“Workbook initialized for xls in -” + (end-start)/1000 + “seconds”);

WorksheetCollection worksheets = workbook.getWorksheets();
@SuppressWarnings(“unchecked”)
Iterator wsIterator = worksheets.iterator();
if (worksheets.getCount() == 1 && wsIterator.hasNext()) {
Worksheet ws = wsIterator.next();
String sheetName = ws.getName();
} else {
while (wsIterator.hasNext()) {
Worksheet ws = wsIterator.next();
String sheetName = ws.getName();
}
}
end = System.currentTimeMillis();
System.out.println(“Loaded metadata for xls in -” + (end-start)/1000 + “seconds”);

Here DropboxFileTransport makes an http request to download a file and returns an InputStream. This takes constant time for both files.
Then when the workbook is initialized by passing the input stream, it takes longer for larger files.

Here are the responses:
For 300 kb file
Download api call - 4seconds
Workbook initialized for xls in - 6seconds
Loaded metadata for xls in - 6seconds

For 20 MB file
Download api call - 4seconds
Workbook initialized for xls in - 61seconds
Loaded metadata for xls in - 61seconds

Hi again,


Thank you for the sample code.

First of all, please note that you are using an older revision of the API because you are using a few APIs that have been obsoleted in recent revisions. The latest code segment is available at the bottom of this post that you should try on your side against Aspose.Cells for Java 17.02.0. Secondly, the lager spreadsheets will definitely take more time to load, however, the difference should not be much (6 seconds compared to 61 seconds). I want to give the scenario a try on my end as well but I am missing the implementations for several classes/methods from your code segment. I humbly request you to please provide an executable sample application along with all dependencies (Jars) and template files (input spreadsheets) so I may test the case, and raise it with the product team, if required. It would be appropriate that you should share the sample spreadsheet separately as well as via URLs, which you are accessing via your code.

Java

LoadOptions options = new LoadOptions();
LoadFilter filter = new LoadFilter();
filter.setLoadDataFilterOptions(LoadDataFilterOptions.NONE);
options.setLoadFilter(filter);

Workbook book = new Workbook(stream, options);
WorksheetCollection worksheets = book.getWorksheets();
for(int i = 0; i < worksheets.getCount(); i++)
{
if(worksheets.get(i).getCells().getMaxDataRow() == -1)
{
System.out.println(worksheets.get(i).getName());
}
}

Hi Piyush,


This is to update you that I have performed a few tests by first preparing samples having same number of worksheets but different amount of data. Attached you will find a snapshot showing the size of 2 XLS files where sample-less-data.xls is of 300 KB and sample-more-data.xls is of 20 MB. The following code revealed that both samples took almost similar time to load in an instance of Workbook against the latest revision of Aspose.Cells for Java 17.02.0.

Java

File initialFile = new File(dir + “sample-more-data.xls”);
InputStream stream = new FileInputStream(initialFile);

long start = System.currentTimeMillis();

LoadOptions options = new LoadOptions();
LoadFilter filter = new LoadFilter();
filter.setLoadDataFilterOptions(LoadDataFilterOptions.NONE);
options.setLoadFilter(filter);
Workbook book = new Workbook(stream, options);

long end = System.currentTimeMillis();
System.out.println(“Loading sample-more-data.xls - " + (end-start) + " milliseconds”);

WorksheetCollection worksheets = book.getWorksheets();
for(int i = 0; i < worksheets.getCount(); i++)
{
if(worksheets.get(i).getCells().getMaxDataRow() == -1)
{
System.out.println(worksheets.get(i).getName());
}
}

Output

JDK: 1.8.0_111 Aspose.Cells: 17.02.0 Loading sample-more-data.xls - 856 milliseconds Sheet1 Sheet2 Sheet3

JDK: 1.8.0_111
Aspose.Cells: 17.02.0
Loading sample-less-data.xls - 659 milliseconds
Sheet1
Sheet2
Sheet3

Hi Babar,


Thanks for the detailed response. This code is reading a file from the local directory. I am trying to read it from a stream from Dropbox cloud. Please find attached the eclipse project that I created to test this. Let me know if you require more information.

Here is the output on executing the project for different files-
File - "/Northwind.xlsx"
API response time - 4 seconds
Response Status - 200
Time to initialize Workbook - 2 seconds
Sheets :
Categories
Customers
Employees
Orders
Products
Shippers
Suppliers
Order Details

File - "/200MB.xlsx"
API response time - 3 seconds
Response Status - 200
Time to initialize Workbook - 31 seconds
Sheets :
Sheet1
Sheet2

Hi again,

I have received the project however, it is still missing the dependencies as shown in attached snapshot. You haven’t shared the sample spreadsheets as well. Please note, downloading the sample from any remote location is out of the scope of Aspose.Cells APIs therefore we will not be able to evaluate that part of the application nor can suggest improvements. However, we need to make sure that each sample is loaded in Aspose.Cells object model (via stream or file path) with more or less similar time cost, and we suggest you to evaluate the scenario on similar lines. By the way, have you given the latest version (17.02.0) a try on your end? What results have you got?