Is there a way to reduce the processing time when converting files such as .xlsx, .ppt, and .docx to HTML?

Hi team,

Is there a way to reduce the processing time when converting large files, such as a 40 MB Excel file with 400,000 rows and possibly multiple sheets?

I would appreciate your thoughts on this. Please let me know whether the library can process a single file in parallel during conversion, or whether a similar approach is available.

I am using your Java library to process the files, with the following dependency.

<dependency>
        <groupId>com.aspose</groupId>
        <artifactId>aspose-cells</artifactId>
        <version>21.9</version>
</dependency>

@hasithwijerathna,

This would be a huge file containing billions of cells, so the conversion will certainly take some time and consume CPU. Even MS Excel itself takes time to convert such a file to HTML (“Web page”) or PDF.
Could you please share some sample files and your code? Please zip the files before uploading them to a file sharing service (Dropbox, Google Drive, etc.). We will check it soon.

Aspose.Cells is written in pure Java, so concurrency and multithreading should not be a problem. As long as you have no shared data source and a new Workbook (Excel file) is created each time a user accesses the application, there will be no problem at all. However, if you do have shared data or resources, you will have to handle synchronization yourself. In that case, we recommend creating and filling different workbooks in different threads. You should not use the same workbook/file in multiple threads at the same time, otherwise you may get unstable data because of the restrictions and complexity of the MS Excel file formats themselves.
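
To illustrate the "one workbook per thread" advice, here is a minimal sketch. It assumes several independent files are being converted; it does not parallelize the conversion of a single workbook. The class name `ParallelConversionSketch`, the method `convertAll`, and the `converter` parameter are hypothetical names used only for illustration. In a real application the converter would presumably build a new Workbook from each byte array and save it as HTML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper (not part of Aspose.Cells): converts independent files in parallel.
class ParallelConversionSketch {

    // 'converter' is an assumption: it stands in for code that creates its own
    // workbook object from the input bytes and returns the converted output.
    static List<byte[]> convertAll(List<byte[]> files,
                                   Function<byte[], byte[]> converter,
                                   int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (byte[] file : files) {
                // Each task gets its own input; no workbook is shared between threads.
                Callable<byte[]> task = () -> converter.apply(file);
                futures.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> future : futures) {
                // get() waits for the task and rethrows any failure;
                // results stay in the same order as the input list.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because every task works on its own data, no synchronization is needed here. Each large workbook still needs its own memory, so the thread count should be chosen with the available heap in mind.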

This is the code that we have used:

try (
    ByteArrayInputStream in = new ByteArrayInputStream(fileAsByteArray)) {

    Workbook book = new Workbook(in);

    HtmlSaveOptions saveOptions = new HtmlSaveOptions();
    saveOptions.setExpImageToTempDir(false);
    saveOptions.setExportImagesAsBase64(true);

    final Map<String, ByteArrayOutputStream> entries = new HashMap<>();

    saveOptions.setStreamProvider(new IStreamProvider() {
        public void initStream(StreamProviderOptions options) throws Exception {
            try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
                final String customPath = uuid + options.getDefaultPath();
                options.setCustomPath(customPath);
                entries.put(customPath, baos);
                options.setStream(baos);
            }
        }

        public void closeStream(StreamProviderOptions options) throws Exception {
            final String customPath = uuid + options.getDefaultPath();

            try(final ByteArrayOutputStream byteArrayOutputStream = entries.get(customPath)){
                s3Client.upload(byteArrayOutputStream.toByteArray(), customPath);
            }
        }
    });

    try(ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream()){
        book.save(byteArrayOutputStream, saveOptions);
        return byteArrayOutputStream.toByteArray();
    }

} catch (Exception e) {
    // exception thrown
}

Please find the sample file here.

@hasithwijerathna,

Thanks for the template file and sample code.

We have logged an investigation ticket with the ID “CELLSJAVA-44079” for your issue. We will look into whether the conversion time for rendering a large Excel file (400,000 or more rows) to HTML can be improved further.

Once we have an update on it, we will let you know.

PS. Could you also try our latest version/fix, Aspose.Cells for Java v21.11, and see whether it makes any difference?
