Problem with Double Precision (how to get the "raw" value of a cell?)

ernestskarlsonslayer · April 16, 2020, 1:24pm

Hello!

I’m trying to get the “raw value” as it is stored in the XLSX XML for a numeric cell that has digits beyond the 14 significant places. I understand that Aspose is trying to mimic Excel by using the “double” (IEEE 754) precision. And for 99% cases, it works well. However, not always that level of precision is satisfactory. Moreover, calculations performed by Excel still use full precision, only the representation is limited to IEEE 754.

Look at the code below. It uses an XLSX document (test.xlsx.zip) with two cells:

B1 contains numeric value 770.83200000000102
C1 contains a formula =B1*1000000000000000

You will see that, due to rounding, the values given by Aspose are not congruent.
The scientific representation of C1 is not a problem. However, the rounded value of B1 is.
For comparison, the output of POI below.

import com.aspose.cells.*
import org.apache.poi.ss.util.CellReference
import org.apache.poi.xssf.usermodel.XSSFCell
import org.apache.poi.xssf.usermodel.XSSFSheet
import org.apache.poi.xssf.usermodel.XSSFWorkbook

// Aspose

var opts = TxtLoadOptions(LoadFormat.XLSX)
opts.convertNumericData = false
opts.keepPrecision = true
var workbook = Workbook("test.xlsx", opts)
var worksheet: Worksheet = workbook.worksheets[0]

var asposeCell1: Cell = worksheet.cells["B1"]
asposeCell1.stringValueWithoutFormat // 770.832000000001

var asposeCell2: Cell = worksheet.cells["C1"]
asposeCell2.stringValueWithoutFormat // 7.7083200000000102E17

// POI

var wb = XSSFWorkbook("test.xlsx")
var sheet: XSSFSheet = wb.getSheetAt(0)

var cellReference1: CellReference = CellReference("B1")
var poiCell1: XSSFCell = sheet.getRow(cellReference1.row).getCell(cellReference1.col.toInt())
poiCell1.rawValue // 770.83200000000102

var cellReference2: CellReference = CellReference("C1")
var poiCell2: XSSFCell = sheet.getRow(cellReference2.row).getCell(cellReference2.col.toInt())
poiCell2.rawValue // 7.7083200000000102E+17

Despite the multiple posts on this issue, i haven’t been able to find a solution that would work for me.
I hope somebody can point me in the direction of getting the raw value for a numeric cell.

amjad.sahi · April 16, 2020, 2:23pm

@ernestskarlsonslayer,

Thanks for the template file, sample code and details.

After an initial test, I am able to reproduce the issue as you mentioned by using the following sample code with your template file. I found the issue with double precision numbers when trying to get raw value stored in XLSX source XML. It is not retrieved exactly as per the source XML data of the file:
e.g
Sample code:

LoadOptions opts = new LoadOptions(LoadFormat.XLSX);

        Workbook workbook = new Workbook("e:\\test2\\test.xlsx", opts);
        Worksheet worksheet = workbook.getWorksheets().get(0);

        Cell asposeCell1 = worksheet.getCells().get("B1");
        System.out.println(asposeCell1.getStringValueWithoutFormat()); // 770.832000000001

        Cell asposeCell2 = worksheet.getCells().get("C1");
        System.out.println(asposeCell2.getStringValueWithoutFormat());

I have logged a ticket with an id “CELLSJAVA-43167” for your issue. We will look into it soon.

Once we have an update on it, we will let you know.

ahsaniqbalsidiqui · April 17, 2020, 7:26am

@ernestskarlsonslayer,
Can you make a double variable with value “770.83200000000102” by any means in java program? By our test for java program such one value will be taken as 770.832000000001 always.

ernestskarlsonslayer · April 17, 2020, 4:14pm

Hi @ahsaniqbalsidiqui,

Yes, a simple putValue call sets the proper value in the cell as displayed by the program below.
You can see that the calculation is done right, even though stringValueWithoutFormat shows only the rounded value. It mimics the behaviour of Excel in working with full precision in calculations but using double precision in representation. However, for my use case where i want to make an exact replica of a part of the spreadsheet, it’s not enough.

var newWorkbook = Workbook()
var newWorksheet: Worksheet = newWorkbook.worksheets[0]

var newCell1: Cell = newWorksheet.cells["B1"]
newCell1.putValue(770.83200000000102)
newCell1.stringValueWithoutFormat // 770.832000000001

var newCell2: Cell = newWorksheet.cells["C1"]
newCell2.formula = "=B1*1000000000000000"

newWorkbook.calculateFormula()
newCell2.stringValueWithoutFormat // 7.7083200000000102E17

ahsaniqbalsidiqui · April 17, 2020, 5:54pm

@ernestskarlsonslayer,
Thank you for your feedback. We are analyzing this information and will share our feedback soon.

ahsaniqbalsidiqui · April 20, 2020, 1:55pm

@ernestskarlsonslayer,

Please note that a double variable with value 770.83200000000102 in java program does not means setting it to a cell by our component but means a variable in common java program, such as:

double v = 770.83200000000102;

You will find you cannot get the exact value 770.83200000000102 for “v”, its value becomes 770.832000000001 instead. And the value 7.7083200000000102E17 has nothing to do with the calculation either. Even if you assign 7.70832000000001E17 to v, you will get 7.7083200000000102E17 for it after the assignment.

ernestskarlsonslayer · April 21, 2020, 7:14am

Ok, thank you for the explanation. It’s an unpleasant surprise to me, but i indeed was able to verify it is like you say.

Regardless, the question of how to get the actual string stored in the Excel file XML is open.

amjad.sahi · April 21, 2020, 9:26am

@ernestskarlsonslayer,

You are welcome. And good to know that you have verified it on your end.

ernestskarlsonslayer · April 23, 2020, 7:25am

@Amjad_Sahi, will you still investigate how to get the value stored inthe XML file? For us it’s a final missing factor before we can complete migration from POI to Aspose in our application.

amjad.sahi · April 23, 2020, 10:01am

@ernestskarlsonslayer,

We evaluated it a bit. I am afraid we might not support the original precision by other type of data, such as string. As you know, we have many users and reported cases that require better performance (i.e., decreased memory cost). If we store such kind of data for cells in Aspose.Cells model, more trouble will be brought to our component for the memory performance. Performance is the key which should not be compromised.

Anyways, we will do further investigation and update you once any new info is available.

ernestskarlsonslayer · April 23, 2020, 10:15am

Thank you for explaining the implications. Aspose for us is a huge improvement in memory consumption over POI (we observe on average 10x less memory required on most use cases). However, it would be very painful for us not to have access ot the original string, as then the only way would be to open the XML diretly and read it from there.

amjad.sahi · April 23, 2020, 11:26am

@ernestskarlsonslayer,

We understand your situation but we do not compromise performance in most cases. Anyways, we will still check (when we find time) and evaluate your scenario if we could find some solution or workaround for you.

Once we have an update on it, we will let you know.