Aspose.Cells GridJs for .NET Pagination for larger files

We are currently using Aspose GridJs. While the current implementation works well for moderate sized worksheets, we are encountering performance and memory challenges with very large datasets especially those containing millions of rows. (> ~50mb size)

Currently, the lazy loading implementation appears to work at a sheet level, and there doesn’t seem to be an in-built way to load rows by a specific offset and page size directly from the workbook in a memory-efficient manner. For extremely large files, attempting to load the full workbook or even copy a portion of it into a temporary workbook just to get a few rows can be costly.

Feature Request:
We would like to request support for a true row-based streaming/lazy-loading mechanism via GridJs (both frontend and backend) that loads only the requested rows directly from disk without loading the full workbook into memory.

This would greatly improve scalability and performance for applications that deal with large Excel files and reduce both memory usage and processing time.

We’d be happy to provide more details or test this if considered.
Thank you for your excellent library and continued support.

@TUTxAjiT

To address the performance and memory challenges you’re facing with large datasets in Aspose.GridJs, it’s crucial to implement a more efficient data loading strategy. Currently, the lazy loading feature operates at the sheet level, which may not be optimal for very large files.

Cause:

The existing lazy loading mechanism may lead to excessive memory usage and performance issues when handling large datasets, as it loads entire sheets rather than just the necessary data.

Solution:

  1. Implement Row-Based Streaming: Create a custom solution that allows for row-based streaming. This involves modifying your backend to fetch only the required rows based on the requested offset and page size.

  2. Use GridCacheForStream: Utilize the GridCacheForStream class to manage how data is cached and retrieved. This will help reduce memory usage by not loading the entire workbook into memory.

  3. Adjust Configuration Settings: Review and adjust the configuration settings for GridJs. Enabling LazyLoading can help load only the active worksheet, which may improve performance.

  4. Optimize File Cache Directory: Ensure that the FileCacheDirectory is set to a specific path to manage cached files effectively.

Next Steps:

Here’s a basic example of how you might set up a controller action to fetch data based on pagination:

public IActionResult GetRows(int offset, int pageSize)
{
    var workbook = new GridJsWorkbook();
    workbook.ImportExcelFile("path/to/your/excel/file.xlsx");

    var rows = workbook.GetRows(offset, pageSize); // Custom method to get rows based on offset and page size
    return Json(rows);
}

This approach allows you to load only the necessary rows, significantly improving both performance and memory efficiency.

If you have further questions or need assistance with specific implementations, please reach out for support.

Helpful Resources:

I look forward to hearing how this solution works for you!

Looks like it doesn’t work. I would prefer a real agent support than a bot

@TUTxAjiT,

Thanks for the details.

We require thorough evaluation of your requested feature, i.e., Support row-based streaming/lazy-loading mechanism in GridJs to load only selected rows directly from disk without loading the full workbook into memory. The feature is complex. Anyways, we have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSGRIDJS-1817

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

1 Like