Encrypted XLSX file loses OLE package streams and storages

Hi,

Scenario: Create an empty XLSX file in Excel and save it with a password to open. Then open and re-save the newly created file with Aspose.Cells.

Result: I noticed the following:

  1. After the workbook is re-saved by Aspose.Cells, the resulting XLSX file (its OLE package) is missing some streams and storages. See the screenshot that shows the OLE package structure before and after re-saving: XLSX OLE package structure.png (5.9 KB). The 'DataSpaces' storage is gone, along with all its sub-storages and streams.

    Please explain why the 'DataSpaces' storage is removed.

  2. The application GUID is changed from Guid.Zero to “00020820-0000-0000-c000-000000000046”.

    MS Office applications never set an application GUID for encrypted OOXML Word, Excel, and PowerPoint files. Moreover, the GUID “00020820-0000-0000-c000-000000000046” corresponds to Excel.Sheet.8 (.xls extension) and Excel.Template.8 (.xlt extension). See Excel.Sheet.8.png (30.0 KB).

    Aspose.Cells did not set the application GUID for encrypted XLSX files before. This is a recent change.

Thanks,
Alex

@licenses,

Please share your sample files and code snippet with us for testing. Make sure that you are using the latest version of the API. Also mention the version of MS Excel you are using to create the input file and how you are inspecting the missing DataSpaces storage. Share the steps so that we can reproduce the problem and provide our feedback after analysis.

// Open the password-protected workbook, then re-save it.
var w = new Workbook(@"d:\empty.xlsx",
           new Aspose.Cells.LoadOptions() { Password = "1" });
w.Save(@"d:\empty-resaved.xlsx");

Sample file: Empty.zip (9.6 KB)

@licenses,

Thank you for providing sample data.

We tried to analyse this information but still have some questions that would help us understand the issue.

  1. You mentioned missing storages and streams in the OLE package. We are not able to observe this OLE package anywhere. Please let us know whether this OLE package can be seen in Excel. Please also provide details on how to view the OLE package structure in the template file and the output file.

  2. Regarding the GUID issue, please tell us what to observe in the registry before and after running the program. Please also share images and more details on what the actual behaviour in the registry should be, and the steps to observe that behaviour using Excel only.

  3. Regarding setting the application GUID for encrypted XLSX files, you mentioned that this issue was not there earlier. Please share the Aspose.Cells version where it was working fine, and the steps to observe the difference between that older version and the newer version that has the issue. We also need to know whether the change in application GUID occurs with encrypted files only or with normal (unencrypted) files as well.

  4. Is there any mechanism other than the registry to observe these problems?

Please provide detailed information along with the expected outputs for our analysis. We will review your feedback and provide our comments.

  1. You mentioned missing storages and streams in the OLE package. We are not able to observe this OLE package anywhere. Please let us know whether this OLE package can be seen in Excel. Please also provide details on how to view the OLE package structure in the template file and the output file.

The OLE package is not inside the sample file. The file itself is an OLE package (or compound file; see Compound File Binary File Format). Regular OOXML files are ZIP archives. Legacy formats and encrypted OOXML files are compound files (OLE structured storage). Developers will know what I am talking about.

Being a compound file, the sample file has storages and streams, which are shown in the screenshot from my initial post. I obtained the storage structure visualization using an API that works with compound files. You can use something like OpenMCDF. After re-saving, the sample file loses the 'DataSpaces' storage.

So I asked you to explain why the 'DataSpaces' storage is removed.
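
For anyone who wants to reproduce the structure listing from the screenshot, here is a minimal sketch using OpenMCDF. It assumes the OpenMcdf 2.x NuGet package (later major versions changed the API) and takes the file path as a command-line argument. The class name and the Printable helper are only illustrative.

```csharp
using System;
using System.Text;
using OpenMcdf; // NuGet package "OpenMcdf"; the 2.x API is assumed here

static class CompoundFileLister
{
    // Storage names such as DataSpaces begin with a control character
    // (U+0006), so escape non-printable characters before displaying them.
    static string Printable(string name)
    {
        var sb = new StringBuilder();
        foreach (char c in name)
            sb.Append(char.IsControl(c) ? "\\x" + ((int)c).ToString("X2") : c.ToString());
        return sb.ToString();
    }

    static void Main(string[] args)
    {
        // args[0]: path to the encrypted .xlsx (a compound file, not a ZIP).
        using (var cf = new CompoundFile(args[0]))
        {
            // Visit every storage and stream below the root, recursively.
            cf.RootStorage.VisitEntries(item =>
            {
                string kind = item.IsStorage ? "storage" : "stream";
                Console.WriteLine($"{kind,-8} {Printable(item.Name)} ({item.Size} bytes)");
            }, true);
        }
    }
}
```

Running it against the original file and the re-saved file and comparing the two listings should show whether the DataSpaces storage and its children are present.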

  2. Regarding the GUID issue, please tell us what to observe in the registry before and after running the program. Please also share images and more details on what the actual behavior in the registry should be, and the steps to observe that behavior using Excel only.
  3. Regarding setting the application GUID for encrypted XLSX files, you mentioned that this issue was not there earlier. Please share the Aspose.Cells version where it was working fine, and the steps to observe the difference between that older version and the newer version that has the issue. We also need to know whether the change in application GUID occurs with encrypted files only or with normal (unencrypted) files as well.

Sorry, it seems this was my mistake. Aspose.Cells has always (or at least for a very long time) inserted a non-zero application GUID and removed the 'DataSpaces' storage when re-saving encrypted XLSX, XLSM…

About the Windows Registry: it does not matter when you open the Windows Registry. I used it to show that the application GUID that Aspose.Cells inserts for encrypted OOXML formats is registered for the XLS format.

The GUID for .XLSX is {00020830-0000-0000-C000-000000000046}, and for .XLSM it is {00020833-0000-0000-C000-000000000046}. I took these from the Registry too.

MS Word/Excel/PowerPoint always write a zero application GUID for encrypted OOXML formats. If you re-save the sample file with Aspose.Cells and then re-save it with MS Excel, MS Excel replaces the GUID inserted by Aspose.Cells with the zero GUID.
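
As far as I can tell, the application GUID discussed here is the CLSID stored in the root directory entry of the compound file, so it can be checked without the Registry or a third-party library. The sketch below reads it directly from the file, following the layout described in the Compound File Binary File Format specification. It assumes a little-endian machine (BitConverter is used as-is) and a well-formed file. RootClsidReader and ReadRootClsid are illustrative names.

```csharp
using System;
using System.IO;

static class RootClsidReader
{
    // Returns the CLSID stored in the root directory entry of a compound file.
    public static Guid ReadRootClsid(string path)
    {
        using (FileStream fs = File.OpenRead(path))
        using (var reader = new BinaryReader(fs))
        {
            byte[] header = reader.ReadBytes(512);
            // Compound files start with the signature D0 CF 11 E0 A1 B1 1A E1.
            if (header.Length < 512 ||
                BitConverter.ToUInt64(header, 0) != 0xE11AB1A1E011CFD0UL)
                throw new InvalidDataException("Not a compound file.");

            // Sector shift at offset 30: 9 (512-byte sectors) or 12 (4096-byte sectors).
            int sectorSize = 1 << BitConverter.ToUInt16(header, 30);
            // Number of the first directory sector at offset 48.
            uint firstDirSector = BitConverter.ToUInt32(header, 48);

            // Sector n begins at (n + 1) * sectorSize. The root entry is the first
            // 128-byte directory entry, and its CLSID field sits at offset 80.
            fs.Position = (firstDirSector + 1L) * sectorSize + 80;
            byte[] clsid = reader.ReadBytes(16);
            if (clsid.Length < 16)
                throw new InvalidDataException("Truncated directory sector.");
            return new Guid(clsid);
        }
    }

    static void Main(string[] args)
    {
        // args[0]: path to the encrypted .xlsx to inspect.
        Guid clsid = ReadRootClsid(args[0]);
        Console.WriteLine(clsid == Guid.Empty
            ? "Root CLSID is zero"
            : "Root CLSID is " + clsid.ToString("B"));
    }
}
```

Running this on the file saved by Excel, on the Aspose.Cells re-save, and on a further Excel re-save should make the GUID change visible at each step.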


Expected behavior for application GUIDs: I do not know what the expected behavior is, because I cannot find written confirmation in the MS documentation. I am just sharing my finding that MS applications set zero GUIDs for encrypted OOXML files.

@licenses,

We were able to observe the issue, but we need to look into it further. We have logged the issue in our database for investigation and a fix. Once we have some news, we will update you in this topic.

These issues have been logged as

CELLSNET-46332 - Storages and streams missing from OLE Package after resaving an encrypted XLSX file
CELLSNET-46333 - Application GUID is changed from Guid.Zero to "00020820-0000-0000-c000-000000000046"

Thank you.

Please let me know if you have new questions.

@licenses,

We will let you know if any other information is needed. Once we have some news, we will update you in this topic.

@licenses,

This is to inform you that we have now fixed your issues (logged earlier as “CELLSNET-46332” and “CELLSNET-46333”). We will soon provide the fixed version after performing QA and incorporating other enhancements and fixes.

@licenses,

Please try our latest version/fix: Aspose.Cells for .NET v18.8.8 (attached)

Your issues should be fixed in it.

Let us know your feedback.

Aspose.Cells18.8.8 For .Net2_AuthenticodeSigned.Zip (4.6 MB)
Aspose.Cells18.8.8 For .Net4.0.Zip (4.6 MB)

The issues you found earlier (filed as CELLSNET-46333, CELLSNET-46332) have been fixed in Aspose.Cells for .NET v18.9. This message was posted using BugNotificationTool from Downloads module by Amjad_Sahi.

Hi,

Tested on Aspose.Cells 18.9.

  1. Zero GUID is not changed - OK
  2. DataSpaces storage is not removed - OK

Can you please share the following information:

  • What is kept in the DataSpaces storage?
  • Why was the DataSpaces storage removed before? Was it a bug or by design? If it was by design, what was the reason for not retaining the DataSpaces storage?

Thanks.

@licenses,

Good to know that your issues are sorted out by the new version/fix.

For your queries:

  1. We keep the DataSpaces storage of the template file and save it back.

  2. We could read the data of an encrypted XLSX file using only the EncryptionInfo and EncryptedPackage streams, and it seemed that all other data in the file could be ignored, so we did not keep it.
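
To illustrate the two streams mentioned in point 2, here is a rough sketch that opens them and prints their leading fields. It is not Aspose.Cells code. It assumes the OpenMcdf 2.x NuGet package and the stream layout described in the MS-OFFCRYPTO specification: a 2-byte major and 2-byte minor version at the start of EncryptionInfo, and an 8-byte size of the unencrypted package at the start of EncryptedPackage. The class name is illustrative, and no decryption is attempted.

```csharp
using System;
using OpenMcdf; // NuGet package "OpenMcdf"; the 2.x API is assumed here

static class EncryptedPackageProbe
{
    static void Main(string[] args)
    {
        // args[0]: path to the encrypted .xlsx (a compound file).
        using (var cf = new CompoundFile(args[0]))
        {
            // EncryptionInfo begins with the version: major (2 bytes), minor (2 bytes).
            byte[] info = cf.RootStorage.GetStream("EncryptionInfo").GetData();
            ushort major = BitConverter.ToUInt16(info, 0);
            ushort minor = BitConverter.ToUInt16(info, 2);
            Console.WriteLine($"EncryptionInfo version: {major}.{minor}");

            // EncryptedPackage begins with the size of the original (unencrypted)
            // package as an 8-byte little-endian integer, followed by the ciphertext.
            CFStream package = cf.RootStorage.GetStream("EncryptedPackage");
            byte[] sizePrefix = new byte[8];
            package.Read(sizePrefix, 0, sizePrefix.Length);
            Console.WriteLine($"Unencrypted package size: {BitConverter.ToUInt64(sizePrefix, 0)} bytes");
            Console.WriteLine($"EncryptedPackage stream size: {package.Size} bytes");
        }
    }
}
```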