Is there a recommended way for handling ListObject.ConvertToRange on gigantic tables?

Hi all,

Is there a recommended pattern for flattening extraordinarily large tables? We’re seeing an edge case where some of our users have Excel files containing tables that span billions of cells (mostly empty, accidentally created that large). Based on our tests, ListObject.ConvertToRange() appears to special-case maximum-size tables, but these tables fall just under that limit, so it tries to load every cell into memory and eventually runs out of memory in the attempt.

We have a couple of potential workarounds on our side: resize the table to contain only the cells that actually hold data and then flatten it, or skip flattening tables above a certain cell count (which has other potential implications).

What does ListObject.ConvertToRange do behind the scenes to handle this?

Thanks,
Brian

@bvk,

Thanks for your query.

Could you please share a code sample along with the template and output files that reproduce this edge case? You can shrink the table using the MaxDataRow and MaxDataColumn properties before performing any operation, to avoid extra memory consumption. As for what happens behind the scenes, the ConvertToRange method simply converts the table to a normal range. Please share the data so that we can help you further if the problem persists.
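For instance, here is a minimal sketch of that suggestion: shrink the table to its used area before flattening it. The exact ListObject.Resize overload used here is our assumption, so please check the API reference for your version:

```csharp
using System;
using Aspose.Cells;

var workbook = new Workbook("NearMaxTableSize.xlsx");
var sheet = workbook.Worksheets[0];
var table = sheet.ListObjects[0];

// Zero-based indices of the last row/column on the sheet that hold data.
int lastDataRow = sheet.Cells.MaxDataRow;
int lastDataColumn = sheet.Cells.MaxDataColumn;

// Hypothetical resize call: shrink the table to the used area (keeping the
// header row) so that ConvertToRange touches far fewer cells.
table.Resize(table.StartRow, table.StartColumn,
             Math.Max(lastDataRow, table.StartRow + 1),
             Math.Max(lastDataColumn, table.StartColumn),
             true); // table has a header row

table.ConvertToRange();
workbook.Save("Flattened.xlsx");
```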

@ahsaniqbalsidiqui,

Thanks for the quick response! We may try resizing the table using MaxDataRow and MaxDataColumn, plus some logic to check whether a column’s header is auto-generated (first by name, then by checking whether anything in that column has a value) before converting the table to a range of cells.

I have attached a zip of an .xlsx that mimics the behaviour we’re seeing. The table is 1.02 million rows tall (just short of Excel’s maximum of 1,048,576 rows) and spans the maximum width of 16,384 columns. Based on some very rough estimates, about 1 TB of memory would be needed to convert this table to a range.

I also noticed that in this case there’s an int overflow issue on table.DataRange.CellCount: the maximum possible number of cells is upwards of 17 billion, well past int.MaxValue. That property might be better suited as a long :slight_smile:
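For reference, the worst-case cell count overflows a 32-bit int; a quick self-contained check (independent of Aspose.Cells):

```csharp
using System;

// Excel's grid limits: 1,048,576 rows by 16,384 columns.
const long maxRows = 1_048_576;
const long maxColumns = 16_384;

long maxCells = maxRows * maxColumns;
Console.WriteLine(maxCells);                 // 17179869184
Console.WriteLine(int.MaxValue);             // 2147483647
Console.WriteLine(maxCells > int.MaxValue);  // True
```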

NearMaxTableSize.zip (543 KB)

Code to reproduce:

	using System.Linq;
	using Aspose.Cells;

	// Load everything so the repro matches our production settings.
	var asposeLoadOptions = new LoadOptions(LoadFormat.Xlsx)
	{
		LoadFilter = new LoadFilter(LoadDataFilterOptions.All)
	};

	var workbook = new Workbook("NearMaxTableSize.xlsx", asposeLoadOptions);

	foreach (var workbookWorksheet in workbook.Worksheets)
	{
		// ToList() snapshots the collection, since ConvertToRange removes the table.
		foreach (var listObject in workbookWorksheet.ListObjects.ToList())
		{
			listObject.ConvertToRange();
		}
	}

@bvk,

We are analysing this issue and will get back to you soon to provide our feedback.

@bvk,

We plan to provide an overloaded method, listObject.ConvertToRange(int maxRowIndex), which will let you control the last row of the range for such huge tables. We have logged this in our database for investigation and a fix. Once we have some news for you, we will update you in this topic.

This issue has been logged as

CELLSNET-46616 - Handling ListObject.ConvertToRange on gigantic tables

@bvk,

This is to inform you that we have now fixed your issue. We will soon provide you with the fixed version after performing QA and incorporating other enhancements and fixes.

@bvk,

Please try our latest version/fix: Aspose.Cells for .NET v19.2.4. We have added a ConvertToRange(TableToRangeOptions options) method that lets you set the last row index when converting a table to a range.

Aspose.Cells19.2.4 For .Net2_AuthenticodeSigned.Zip (4.8 MB)
Aspose.Cells19.2.4 For .Net4.0.Zip (4.8 MB)
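For example, a minimal usage sketch of the new overload (the LastRow property name here is assumed; please check the API reference for your version):

```csharp
using Aspose.Cells;
using Aspose.Cells.Tables;

var workbook = new Workbook("NearMaxTableSize.xlsx");
var sheet = workbook.Worksheets[0];
var table = sheet.ListObjects[0];

// Cap the converted range at the last row that actually holds data,
// instead of the table's declared near-maximum extent.
var options = new TableToRangeOptions
{
    LastRow = sheet.Cells.MaxDataRow // zero-based last data row on the sheet
};

table.ConvertToRange(options);
workbook.Save("Flattened.xlsx");
```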

Let us know your feedback.

@ahsaniqbalsidiqui,

Thanks so much, that fixes it for us! (even without using ConvertToRange(TableToRangeOptions options))

@bvk
Good to know that your issue is sorted out by the suggested hotfix. Feel free to contact us any time if you need further help or have other issues or queries; we will be happy to assist you.

The issues you have found earlier (filed as CELLSNET-46616) have been fixed in Aspose.Cells for .NET v19.3. This message was posted using BugNotificationTool from Downloads module by Amjad_Sahi