Aspose Cell DLL issue for XSLB file

Currently I am using Aspose.cell dll for my project. We are loading the Super Mega size (139MB) excel file. The file has following feature

  1. It is in XLSB format
  2. It has 5 sheets in it - 2 tab for Input, 2 Tab for calculating something with help of Formula(complex calculation) , one output tab.
  3. Through C# code, 2 input receive data
  4. Then “Calculate now” of the formula is triggered through code
    5 It generates data for output tab

Through Aspose.Cell component, we are loading it to Aspose Workbook. It is always throwing System out of memory exception.

Prior to this project, system was already processing similar kind of file whose size was around 50MB.

Also when I tried to open the excel file by double clicking the file, It was not opening for 32 bit office, but able to open 64 bit office. Also noticed that task manager showing it took 2.7 GB of space.

though I am able to load same file using MS office Interop Excel component, but that will bring huge change for me as the Aspose code is already in place. I also tried to add below code, but same out of memory error is throwing.

            LoadOptions loadOptions = new LoadOptions(LoadFormat.Xlsb);
            LoadDataOption dataOption = new LoadDataOption();
            dataOption.SheetIndexes = new int[] { 0 };
            dataOption.ImportFormula = true;
            loadOptions.LoadDataAndFormatting = true;
            loadOptions.LoadDataOptions = dataOption;
            loadOptions.MemorySetting = MemorySetting.MemoryPreference;
            Workbook workbook = new Workbook(CalcFileName, loadOptions);

The last line is throwing error. Can you please help me to figure out proper code to process this file.

@vicage,

Thanks for using Aspose APIs.

If I am not mistaken, Microsoft Excel could not load your Excel file. It means, your Excel file needs huge amount of memory and processing power.

If Microsoft Excel can open your Excel file, then check the last cell of your worksheets by pressing Ctrl + End. The last cell will give you an idea how many cells are present in your every worksheet.

If there are so many cells e.g. if last cell is like XEP1048564, then it means, huge amount of processing power and memory is needed to process this worksheet.

Lastly, please share your Excel file so that we could analyze it at our end and help you out. You can upload your Excel file in Dropbox or Google Drive etc. and then share the download link here.

No, MS excel is able to open the file with 64 bits setting.

On Ctrl+end, it is not giving last cell like XEP1048564. It is showing R10331 for one input tab.
and O20401 for other input tab. It will be not possible to share the excel file due to confidentiality. As I said earlier that Interop was working good with that, but avoiding to use it as I have to add a lot new code entirely. One point might help you to analyze that Behing tha scene, interop code is opening the EXCEL.exe (get to know through Task Manager) and expand up to the same size of excel (2.7 GB). But The aspose component is not showing any such grow, not sure what is going on here. if it is related to 32 bit and 64 bit.

@vicage

This issue might also be occurring because you are using some older version. Please download and try the most recent version and see if it makes any difference in resolution of this issue. Let us know your feedback.

Can you please provide the proper solution in form of code snippet. After installing the latest version from NuGet, I sill see out of memory exception. Also now load data and formatting is obsolete function.

LoadOptions loadOptions = new LoadOptions(LoadFormat.Xlsb);

            LoadDataOption dataOption = new LoadDataOption();
            dataOption.SheetIndexes = new int[] { 0 };
            dataOption.ImportFormula = true;
            LoadFilter lf = new LoadFilter();
            loadOptions.LoadDataAndFormatting = true;
            loadOptions.LoadDataOptions = dataOption;
            loadOptions.MemorySetting = MemorySetting.MemoryPreference;
            Workbook workbook = new Workbook(CalcFileName, loadOptions);

Can you check one more point. I noticed that there is no separate EXCEL.exe opened in task manager when working with ASPOSE. Why? where as if I use office.Interop, I see separate EXCEL.exe got open during loading and saving. once that is complete, it gets terminated. As per MSDN site, we should not load too much on webdev.webserver.exe. Please see below link

https://social.msdn.microsoft.com/Forums/vstudio/en-US/0f40f168-b71d-4c0e-896f-63b5e01e0221/webdevwebserverexe-outofmemory?forum=csharpgeneral

@vicage

Please share some sample console application project along with sample Excel file as it is difficult for us to suggest you any workaround or solution without trying it at our end.

Aspose.Cells is an independent library and it does not depend on Microsoft Excel. It means, it has nothing to do with Microsoft Excel. You can use Aspose.Cells on computers that do not have Microsoft Excel installed.

FYI:
You can also detect the file format of your Excel file on runtime. Sometimes, the file format does not match with its extension. For example, if you create XLSX file and rename it to ABC.TXT, then users will consider them as .txt file not as .xlsx file. But in reality it is not a txt file but it is an xlsx file.

Please use FileFormatUtil.DetectFileFormat() method to detect the format of file on runtime.

Article Link:

@vicage

For your code:

  1. We are afraid LoadDataOption and LoadFilter are not supported for XLSB file format currently.

  2. If your file contains only simple cell data (including formulas) without large dataset for other objects, such as different styles, shapes, …etc., we are afraid LoadDataOption and LoadFilter will not help much. Otherwise, if certain amount of memory is occupied by those objects in the template file, we may consider to support to filter them while loading the file.

  3. Because you need to change input data (cell values) and then calculate formulas, loading only one worksheet (dataOption.SheetIndexes = new int[] { 0 }) is not an available option either.

So, for you the code is just simple as:

loadOptions.MemorySetting = MemorySetting.MemoryPreference;

Workbook workbook = new Workbook(CalcFileName, loadOptions);

Because the template file is so large, it is sure that lots of memory will be required to load it. You may try the solution suggested in this post

RecyclableMemoryStreamManager

to see whether it can make difference for the memory issue.

I tried “Out of Memory exception solution” as mentioned above, but it use LightCells API due to which it is filtering out the formula. It should include the formula as is. Below are the code done
class MemoryOptimizer : LightCellsDataHandler
{
private int cellCount;
private int formulaCount;
private int stringCount;

    internal MemoryOptimizer()
    {
        cellCount = 0;
        formulaCount = 0;
        stringCount = 0;
    }

    public int CellCount
    {
        get { return cellCount; }
    }

    public int FormulaCount
    {
        get { return formulaCount; }
    }

    public int StringCount
    {
        get { return stringCount; }
    }

    public bool StartSheet(Worksheet sheet)
    {
        return true;
    }

    public bool StartRow(int rowIndex)
    {
        return true;
    }

    public bool ProcessRow(Row row)
    {
        return true;
    }

    public bool StartCell(int column)
    {
        return true;
    }

    public bool ProcessCell(Cell cell)
    {
        cellCount++;
        if (cell.IsFormula)
        {
            formulaCount++;
        }
        else if (cell.Type == CellValueType.IsString)
        {
            stringCount++;
        }
        return false;
    }
}
class MemMgtF : CustomImplementationFactory
{

    RecyclableMemoryStreamManager manager = new RecyclableMemoryStreamManager();

    public override MemoryStream CreateMemoryStream()
    {

        return manager.GetStream("MemMgtF");
    }

    public override MemoryStream CreateMemoryStream(int capacity)
    {
        return manager.GetStream("MemMgtF", capacity);
    }
}

public Workbook Workbook
{
get { return workbook ?? (workbook = new Workbook(this.ExcelFilePath, new LoadOptions(this.LoadFormat) { MemorySetting = MemorySetting.MemoryPreference, LightCellsDataHandler = new MemoryOptimizer() })); }
}

@vicage

Because you mentioned in your first post that you need to change input value and then calculate formulas, we don’t think LightCells API is the proper solution for you. We only recommend you to try the solution part “RecyclableMemoryStreamManager

Post Reference:

Out of Memory Exception - System.IO.MemoryStream - #15 by shakeel.faiz

Please send us your template file so we can make further investigation. And we need to know how much available memory can be used in your environment for your application to load the template file.

I did not find the relationship between CellsHelper.CustomImplementationFactory and workbook. Can you please explain how Class MM get triggered through instantiating workbook

CellsHelper.CustomImplementationFactory = new MM();

get { return workbook ?? (workbook = new Workbook(this.ExcelFilePath, new LoadOptions(this.LoadFormat) { MemorySetting = MemorySetting.MemoryPreference})); }

class MM : CustomImplementationFactory
{

RecyclableMemoryStreamManager manager = new RecyclableMemoryStreamManager();

public override MemoryStream CreateMemoryStream()
{

	return manager.GetStream("MM");
}

public override MemoryStream CreateMemoryStream(int capacity)
{
	return manager.GetStream("MM", capacity);
}

}

@vicage

Class MM has two methods. You will provide memory stream object in those two methods by yourself. RecyclableMemoryStreamManager is an implementation of memory stream object provided by some other party. You can provide your own implementation if you like.

Whenever Aspose.Cells needs MemoryStream object, it will get it from CellsHelper.CustomImplementationFactory which in turn is implemented by you.

Please read about it in this article.

I dont know where is the Gap between the solution provided and my concern. After doing some more research, let me share below points.

  1. The excel file is in XLXS format with size around 110 MB. When I suffixed .ZIP to the the end, the Worksheet within took around 1.3 GB.

  2. With ASPOSE, we are loading workbook, adding input records, calculating and adding output records. These entire process took approximately 70-80 mins and even sometime going out of Memory exception. So far solution does not help much.

Can you guide me

@vicage

Have you tried the RecyclableMemoryStreamManager? This is provided by Microsoft. The purpose of it that sometime system has memory available but the memory is not contiguous. In this case, the usual memory stream object does not work but the memory stream object provided by RecyclableMemoryStreamManager works.

Please give it a try and if it fails then provide us your Microsoft Visual Studio Project which we could run at our end and replicate this issue.

You can upload all of your files on DropBox or Google Drive etc. and provide us download links here.

I did use that. Please correct me if this is not the correct way. first I created the class as follow
class MemMgtF : CustomImplementationFactory
{

    RecyclableMemoryStreamManager manager = new RecyclableMemoryStreamManager();

    public override MemoryStream CreateMemoryStream()
    {

        return manager.GetStream("MemMgtF");
    }

    public override MemoryStream CreateMemoryStream(int capacity)
    {
        return manager.GetStream("MemMgtF", capacity);
    }
}

Then simply added one file of code in the method as shown below
public void CaclulateModelPL(DataSet dataModel)//, string CalcFileName
{

        if (dataModel != null && ExcelModel != null)
        {
            CellsHelper.CustomImplementationFactory = new MemMgtF();
            var workbook = this.ExcelModel.Workbook;
            if (workbook != null)
            {
                var ranges = ExcelModel.GetExcelModelRanges;
                if (ranges != null || !ranges.Any())
                {
                    base.ImportInputsTo(workbook, ranges, dataModel);
                    workbook.CalculateFormula(false);
                    base.ExportOutputsTo(workbook, ranges, dataModel);
                }
            }
        }

    }

I am confused with the way class is used. Is this correct way

@vicage

Code looks good. Please provide the sample console application project as well. It will be helpful for us to replicate and report the issue to product team.

Above code is the only we have. dataset contains here 2 table with approximately 20 columns in each table. Below is the sample formula present in each cells for output worksheets

=IFERROR(ROUND($W2*(1+INDEX(ABC!$C$5:$BX$26000,MATCH(ABC!$GS2,ABC!$D$5:$D$26000,0)+MATCH(XYZ!$D2,ABC!$C$5:$C$26000,0)-1,MATCH(“WI”,ABC!$C$2:$BX$2,0)+MATCH(XYZ!$E2,ABC!$BA$1:$BL$1,0)-1)),4),0)

@vicage

Thanks for considering Aspose APIs.

Please share one simplest Excel file which you could create manually. It should tell the relationship between your sheets. This will help us know the structure of your Excel file which may play a crucial role in resolving this issue. Thanks for your cooperation in this regard and have a good day.

I am using another version of xlsx file, this file (77MB) goods for loading excel worksheets, then calculate and finally when it is about to save the file in xlsb format, then it threw system out of memory exception.

The same piece of code is working fine when I am using another file bigger in size

@vicage

There is no good way to deal with Out of Memory Exception. However, Aspose.Cells have done some optimization for the Excel files that need lots of memory. Please see the following article for your more help.

You may also submit your sample project with your sample Excel file by zipping it, uploading it on Google Drive or Drop Box and then sharing the download link here.

Reference Link: