I want to extract metadata from Office documents (docx, pptx, xlsx) based on the Inspection and Sanitization Guidance (OFFICE 2007.4.8, OFFICE 2007.4.9, OFFICE 2007.4.10).
I created a .NET 6 project to extract these properties, but some of the fields are not extracted. For example:
Aspose.Words:
ScaleCrop (bool)
LinksUpToDate (bool)
SharedDoc (bool)
HyperlinksChanged (bool)
AppVersion (float); maybe this is the “Version” field?
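For cross-checking what a file actually contains, the fields listed above can also be read straight from the package, independently of any Aspose API. The following is only a minimal sketch using the Python standard library. It assumes the extended properties sit in the conventional `docProps/app.xml` part and use the transitional OOXML namespace; the function name `read_extended_properties` is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the extended-properties part (transitional OOXML).
EXT_NS = "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties"
FIELDS = ("ScaleCrop", "LinksUpToDate", "SharedDoc", "HyperlinksChanged", "AppVersion")


def read_extended_properties(package_path, fields=FIELDS):
    """Return {field: raw text or None} read from docProps/app.xml.

    Hypothetical helper: works on a docx, pptx, or xlsx path (or file-like
    object) and returns None for any field the file does not store.
    """
    with zipfile.ZipFile(package_path) as package:
        # Assumption: the part is stored at its conventional location.
        root = ET.fromstring(package.read("docProps/app.xml"))
    result = {}
    for name in fields:
        element = root.find(f"{{{EXT_NS}}}{name}")
        result[name] = element.text if element is not None else None
    return result
```

Values come back as the raw strings stored in the XML (for example `"false"`), so converting them to bool or float is left to the caller. A field reported as `None` here is absent from the file itself rather than skipped by a library.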
For Aspose.Cells, I tested your scenario using your template XLSX file and sample code snippet with Aspose.Cells v24.6. Here is the console output I got:
--- Cells Builtin Properties e:\test2\extracting office documents\Metadata.xlsx ---
Title Title
Subject Subject
Author Windows User
Keywords Keywords
Comments Comments
LastSavedBy Windows User
CreateTime 6/21/2024 11:55:52 AM
LastSavedTime 7/1/2024 9:54:26 AM
Category Category
NameOfApplication Microsoft Excel
Security 0
ScaleCrop False
Manager Manager
Company Company
LinksUpToDate False
SharedDoc False
HyperlinkBase https://www.google.com/
HyperlinksChanged False
Version 16.0300
--- Cells Custom Properties e:\test2\extracting office documents\Metadata.xlsx ---
Text Text
Number 1234
Bool1 True
Bool2 False
Date 1/1/2024 10:00:00 AM
I also evaluated your template XLSX file by opening it in MS Excel 2010 and 2019, but I could not find the properties you mentioned (core, custom, etc.). See the attached screenshots for reference: sc_shot1.png (8.4 KB), sc_shot2.png (9.4 KB)
How can I view those missing properties manually in MS Excel?
@erdeiga,
As for Aspose.Slides, we have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): SLIDESNET-44626
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
For Aspose.Cells, we have opened the following new ticket(s) in our internal issue tracking system to evaluate and investigate the extraction of HeadingPairs (binary) and TitlesOfParts (string) properties. We will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): CELLSNET-56107
Once we have an update on the ticket, we will inform you here.
HeadingPairs (binary)
TitlesOfParts (string)
These two properties cache the number and names of the worksheets in the file; they duplicate information already held in the worksheet settings.
To avoid maintaining two sets of data, we do not read these two properties.
You can obtain the equivalent values as follows:
HeadingPairs (binary): use WorksheetCollection.Count.
TitlesOfParts (string): iterate over all sheets in the WorksheetCollection and collect their names.
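To illustrate the idea of rebuilding both values from the sheet list rather than from the cached properties, here is a rough sketch in standard-library Python that does not use Aspose.Cells. It assumes the sheet list lives in the conventional `xl/workbook.xml` part with the transitional SpreadsheetML namespace, and it covers worksheets only (a real file may also list named ranges or chart sheets). The names `sheet_names` and `derived_parts` are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main SpreadsheetML part (transitional OOXML).
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def sheet_names(xlsx_path):
    """Return sheet names in workbook order, read from xl/workbook.xml."""
    with zipfile.ZipFile(xlsx_path) as package:
        # Assumption: the workbook part is stored at its conventional location.
        root = ET.fromstring(package.read("xl/workbook.xml"))
    return [sheet.attrib["name"] for sheet in root.iter(f"{{{MAIN_NS}}}sheet")]


def derived_parts(names):
    """Rebuild HeadingPairs/TitlesOfParts-style values from sheet names.

    Simplification: only a single "Worksheets" heading is produced.
    """
    heading_pairs = [("Worksheets", len(names))]  # heading label plus count
    titles_of_parts = list(names)                 # one title per sheet
    return heading_pairs, titles_of_parts
```

With Aspose.Cells the same two steps correspond to reading the count of the worksheet collection and looping over it to collect each sheet's name, as described above.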