Memory Consumption goes weird when using Light Cells API

Hi,

We’re conducting some performance test in our application using Light Cells API. Based on our test results, the direction of the memory consumption when the rows increased seems to be weird.
We compare this also using Aspose Standard to check if there is something wrong in our testing.

Here’s the test results of Using Aspose Standard and Light Cells API

image.png (28.5 KB)

If you look at the results, the memory consumption in Aspose Standard test results is getting higher when the row to be saved is increased. Seems to be normal. While in Lightcells Test Results, there is no specific direction of progress on the memory consumption when the rows is increased and if you look on 100K to 500k rows, the memory consumption goes down.

Just wanna ask how the memory consumption works in Light Cells API. Is the memory consumption in Light Cells API is supposed to be like that? Or it should grow when the rows is increasing? Just wanna confirm this to check that if we’re getting wrong data in our testing.

By the way we use Windows Performance Monitor and check the Private Bytes of our application.

Here’s our sample code:

namespace PerfTest {
	class Program {
		static void Main(string[] args) {
			DateTime now = DateTime.Now;
			DataTable testDataTable = new DataTable();
			DataRow dataRow;
			testDataTable.Columns.Add("TEST_STRING_COLUMN1");
			testDataTable.Columns.Add("TEST_STRING_COLUMN2");
			testDataTable.Columns.Add("TEST_DATE_COLUMN", typeof(DateTime));

			// Change the value of this variable to test saving in different number of rows to save.
			int totalRow = 250000;

			for(int currentRowIndex = 0; currentRowIndex < totalRow; currentRowIndex++) {
				dataRow = testDataTable.NewRow();
				dataRow[0] = "Test String 1" + currentRowIndex;
				dataRow[1] = "Test String 2" + currentRowIndex;
				dataRow[2] = DateTime.Now;
				testDataTable.Rows.Add(dataRow);
			}

			Console.WriteLine("Data loaded. Press any key to save.");
			Console.ReadKey();
			Console.WriteLine("Saving...");

			// Save Data Table.
			SaveResultsUsingStandardAspose(testDataTable, @"C:\Test\Test.xlsx");

			// Uncomment this to test the memory consumption for Light Cells Approach
			//SaveResultsUsingLightCells(testDataTable, @"C:\Test\Test.xlsx");

			Console.WriteLine("Data saved. Press any key to exit.");
			Console.ReadKey();

		}

		private static void SaveResultsUsingStandardAspose(DataTable dataTable, string fileFullPath) {
			string fileExtension = Path.GetExtension(fileFullPath);
			int maxRow;
			Workbook workBook;
			object value;
			Style style;
			DataRow row;
			Cell cell;
			Cells cells;

			workBook = new Workbook();
			cells = workBook.Worksheets[0].Cells;

			if (dataTable.Rows.Count > 1048575)
				maxRow = 1048575;
			else
				maxRow = dataTable.Rows.Count;

			// Initialize column
			for (int columnIndex = 0; columnIndex < dataTable.Columns.Count; columnIndex++) {
				cells[0, columnIndex].PutValue(dataTable.Columns[columnIndex].ColumnName);
			}

			// Insert rows
			for (int rowIndex = 0; rowIndex < maxRow; rowIndex++) {
				row = dataTable.Rows[rowIndex];
				for (int columnIndex = 0; columnIndex < dataTable.Columns.Count; columnIndex++) {
					value = row[columnIndex];
					cell = cells[rowIndex + 1, columnIndex];
					if (value is DateTime) {
						cell.PutValue((DateTime)value);
						style = cell.GetStyle();
						style.Custom = "mm/dd/yyyy hh:mm:ss AM/PM";
						cell.SetStyle(style);
					}
					else {
						cell.PutValue(value);
					}
				}
			}

			// Save the workbook
			workBook.Save(fileFullPath);
		}

		private static void SaveResultsUsingLightCells(DataTable dataTable, string fileFullPath) {
			int maxRow;
			SaveOptions options;
			Workbook workBook;

			if (dataTable.Rows.Count > 1048575)
				maxRow = 1048575;
			else
				maxRow = dataTable.Rows.Count;

			options = new OoxmlSaveOptions(SaveFormat.Xlsx);
			((OoxmlSaveOptions)options).LightCellsDataProvider = new TestLightCellsImplementation(dataTable, maxRow);

			workBook = new Workbook();
			workBook.Save(fileFullPath, options);
		}
		class TestLightCellsImplementation : LightCellsDataProvider {

			// Fields
			#region Fields
			private int _currentColumnIndex = -1;
			private int _currentRowIndex = -1;
			private DataTable _dataTable;
			private List<Type> _dataTableColumnDataTypes;
			private int _maxColumns;
			private int _maxRows;
			#endregion

			public TestLightCellsImplementation(DataTable dataTable, int maxRow) {
				_maxRows = maxRow;
				_dataTable = dataTable;
				_maxColumns = dataTable.Columns.Count;
			}

			#region LightCellsDataProvider Members

			#region IsGatherString

			public bool IsGatherString() {
				return false;
			}
			#endregion
			#region NextCell

			public int NextCell() {
				_currentColumnIndex++;
				if (_currentColumnIndex < _maxColumns) {
					return _currentColumnIndex;
				}
				else {
					return -1;
				}
			}
			#endregion
			#region NextRow

			public int NextRow() {
				_currentRowIndex++;
				if (_currentRowIndex <= _maxRows) {
					_currentColumnIndex = -1;
					return _currentRowIndex;
				}
				else {
					return -1;
				}
			}
			#endregion
			#region StartCell

			public void StartCell(Cell cell) {
				Style cellStyle;
				object cellValue;
				Type columnDataType;

				if (_currentRowIndex == 0) {
					if (_dataTableColumnDataTypes == null) {
						_dataTableColumnDataTypes = new List<Type>();
					}

					cell.PutValue(_dataTable.Columns[_currentColumnIndex].ColumnName);
					_dataTableColumnDataTypes.Add(_dataTable.Columns[_currentColumnIndex].DataType);
				}
				else {
					columnDataType = _dataTableColumnDataTypes[_currentColumnIndex];
					cellValue = _dataTable.Rows[_currentRowIndex - 1][_currentColumnIndex];
					if (columnDataType == typeof(DateTime)) {

						if (cellValue == null || cellValue is DBNull)
							cell.PutValue(null);
						else
							cell.PutValue((DateTime)cellValue);

						// Format the current cell with specified date format.
						cellStyle = cell.GetStyle();
						cellStyle.Custom = "mm/dd/yyyy hh:mm:ss AM/PM";
						cell.SetStyle(cellStyle);
					}
					else {
						cell.PutValue(cellValue);
					}
				}
			}
			#endregion
			#region StartRow

			public void StartRow(Row row) {
			}
			#endregion
			#region StartSheet

			public bool StartSheet(int sheetIndex) {
				return true;
			}
			#endregion

			#endregion
		}


	}
}

Let me know if you have any clarifications.
Looking forward for your response.
Thanks!

@jsaez,

Thank you for providing the stats/results screenshot and details.

The memory consumption behavior you’re observing in Aspose.Cells’ LightCells API is indeed different from the standard mode, and it’s by design.

In standard mode, Aspose.Cells holds the entire spreadsheet in memory, which means that as you increase the number of rows, the memory consumption grows proportionally because all the data is loaded into memory at once.

However, the LightCells API works differently. It doesn’t load the entire spreadsheet into memory at once. Instead, it processes the spreadsheet in chunks or sections, which significantly reduces memory usage, especially for large spreadsheets.

The behavior you’re seeing, where memory consumption may decrease as the number of rows increases, is not uncommon with the LightCells API. This is because the LightCells API optimizes memory usage by processing data in smaller, more manageable portions. So, as the number of rows increases, the memory consumption may not increase linearly or may even decrease in some cases due to the efficiency of this processing method.

It’s essential to understand that this behavior is intentional and part of the design of the LightCells API. So, you don’t necessarily need to be concerned about it being due to the data you’re using.

However, if you have specific concerns about your data or performance, please feel free to write us back for assistance and clarification.