Excel to HTML conversion performance

Hello.

I use Aspose.Cells to convert Excel files to HTML.
Conversion of slow_conversion_performance.zip (1.3 MB)
takes about 120 seconds on my machine.
This file does not appear to be large (about 8000x18 cells).
I have another file with a lot more cells that converts faster.

Could you please explain if this is expected for this file?
How can I improve performance here?

@Andrei86,

I checked your template file and found that it contains over 290K blank (unnecessary) pages. Rendering each blank page takes time, which is why the conversion is slow. You may remove those blank pages by adding the following lines of code, which should improve performance.
Sample code:

Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook("e:\\test2\\slow_conversion_performance.xlsx");
// "Данные" ("Data") is the name of the sheet in the attached file
Worksheet worksheet = workbook.Worksheets["Данные"];
worksheet.Cells.DeleteBlankColumns();
worksheet.Cells.DeleteBlankRows();

/*
// To process all the sheets instead of a single one:
foreach (Worksheet sheet in workbook.Worksheets)
{
    sheet.Cells.DeleteBlankColumns();
    sheet.Cells.DeleteBlankRows();
}
*/
HtmlSaveOptions options = new HtmlSaveOptions();
// Skip hidden worksheets in the HTML output
options.ExportHiddenWorksheet = false;

workbook.Save("e:\\test2\\out1.html", options);

Hope this helps a bit.

I am using Aspose.Cells to convert many different documents.
I had already tried this code before reporting this case.
It causes problems with other files, for example this one:
example.zip (3.1 MB)
The image on the second tab overlaps the text.

@Andrei86,

Could you please try the following sample code and give us your feedback? Do you still see the performance or overlapping issues?
Sample code:

Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook("e:\\test2\\example.xlsx");

HtmlSaveOptions options = new HtmlSaveOptions();
options.ExportHiddenWorksheet = false;
options.ExportPrintAreaOnly = true;

workbook.Save("e:\\test2\\out1.html", options);

Also, please attach your output file.

I tried your suggestion. The performance is still the same. Conversion takes about 120s on my machine.
Output: aspose.zip (894.6 KB)

@Andrei86,
If you open the print preview of the sample Excel file from the first post in MS Excel, you will see that it has 346113 pages. Similarly, converting the same file to HTML in MS Excel also takes a long time. If the conversion takes 120 seconds with the above suggestions, that seems acceptable.

Regarding the second sample file, I tried it with the following sample code but could not reproduce the issue. Please give it a try and share your feedback.

Workbook workbook = new Workbook("example.xlsx");
Worksheet worksheet = workbook.Worksheets[1];
HtmlSaveOptions options = new HtmlSaveOptions();
options.ExportHiddenWorksheet = false;
options.ExportPrintAreaOnly = true;
workbook.Save("example.html", options);

I tried the ExportPrintAreaOnly option for the conversion. In my case, this setting has no effect on conversion performance for either file.

@Andrei86,
We have noted your feedback and logged it into our database for a detailed analysis. You will be notified here once any update is ready for sharing.

This issue is logged as:
CELLSNET-49453 - Improve performance while converting Excel to HTML

@Andrei86

We have improved the performance in release v21.10.

The source file “slow_conversion_performance.xlsx” can now be converted to HTML in about 80 seconds.

Let me explain a little. In row 4, every cell has a border all the way to the last column (XFD4, 16384 columns in total), which makes things worse. A possible solution is to set a proper print area for the sheet (e.g. A1:R8926) and then enable the ExportPrintAreaOnly option.
Note: the ExportPrintAreaOnly option takes effect only when the worksheet has a print area.
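
The print-area suggestion above could look roughly like the following. This is a minimal sketch, not tested against the attached file: the sheet name "Данные" is taken from the earlier sample in this thread, the range A1:R8926 is the example given above, and the file paths are placeholders.

```csharp
using Aspose.Cells;

class PrintAreaExport
{
    static void Main()
    {
        // Placeholder path (assumption); adjust to your environment.
        Workbook workbook = new Workbook("slow_conversion_performance.xlsx");

        // Sheet name taken from the earlier sample in this thread.
        Worksheet worksheet = workbook.Worksheets["Данные"];

        // Limit the sheet to the range that actually holds data, so the
        // bordered cells running out to XFD4 fall outside the print area.
        worksheet.PageSetup.PrintArea = "A1:R8926";

        HtmlSaveOptions options = new HtmlSaveOptions();
        options.ExportHiddenWorksheet = false;
        // Takes effect only on worksheets that define a print area.
        options.ExportPrintAreaOnly = true;

        // Placeholder output path (assumption).
        workbook.Save("out1.html", options);
    }
}
```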

Let us know your feedback.

Thanks.
I will try the new version and share my feedback.
About ExportPrintAreaOnly: I convert many different files and don’t analyze their content, so I can’t set the print area correctly for all of them.
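
When the content is not known in advance, one possible workaround is to derive a print area for each sheet from its data bounds. The sketch below is an untested assumption, not something recommended in this thread: the helper name SetPrintAreaFromData and the file paths are hypothetical, and it relies on Cells.MaxDataRow and Cells.MaxDataColumn, which report the last row and column that contain data rather than formatting alone. Images or shapes placed outside the data range would fall outside the derived area, so the output should be checked on representative files.

```csharp
using Aspose.Cells;

class AutoPrintArea
{
    // Hypothetical helper: sets a print area from A1 to the last cell that
    // contains data, unless the sheet already defines its own print area.
    static void SetPrintAreaFromData(Worksheet sheet)
    {
        if (!string.IsNullOrEmpty(sheet.PageSetup.PrintArea))
            return; // keep a print area defined by the file's author

        int maxRow = sheet.Cells.MaxDataRow;    // -1 when the sheet has no data
        int maxCol = sheet.Cells.MaxDataColumn; // -1 when the sheet has no data
        if (maxRow < 0 || maxCol < 0)
            return;

        // Convert zero-based indexes to an A1-style name, e.g. (8925, 17) -> "R8926".
        string lastCell = CellsHelper.CellIndexToName(maxRow, maxCol);
        sheet.PageSetup.PrintArea = "A1:" + lastCell;
    }

    static void Main()
    {
        // Placeholder input path (assumption).
        Workbook workbook = new Workbook("input.xlsx");

        foreach (Worksheet sheet in workbook.Worksheets)
        {
            SetPrintAreaFromData(sheet);
        }

        HtmlSaveOptions options = new HtmlSaveOptions();
        options.ExportHiddenWorksheet = false;
        options.ExportPrintAreaOnly = true;

        // Placeholder output path (assumption).
        workbook.Save("output.html", options);
    }
}
```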

@Andrei86,

Please take your time to evaluate the new version and let us know your feedback regarding performance.

The issues you have found earlier (filed as CELLSNET-49453) have been fixed in this update. This message was posted using Bugs notification tool by simon.zhao