Cells.FileFormatUtil.DetectFileFormat Error

Hello,
DetectFileFormat does not detect XHtml files. Test files:
xhtml.zip (2.0 KB)

Visio vsd files are detected as unknown:
visio.zip (271.9 KB)

Apple contacts exported as Cells.FileFormatType.Numbers/Numbers09/Numbers35 are never detected by Cells:
Apple.zip (975.0 KB)

Also, OneNote packages with the .onepkg extension are not in the list of Cells.FileFormatType enums. It's kind of a zip under the hood. Will you add it to DetectFileFormat? :slight_smile:

@australian.dev.nerds,
We were able to reproduce the issues: in our tests, file type detection failed for the attached files.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): 
CELLSNET-53566: Unable to detect apple file type correctly
CELLSNET-53567: Unable to detect xhtml file type correctly
CELLSNET-53568: Support visio file type detection
CELLSNET-53571: Support OneNote file type detection

@australian.dev.nerds
The xht files in the zip are the same as the xhtml files; only the extensions differ.
We can only detect all of them as xhtml.

@australian.dev.nerds,

We are pleased to inform you that the above tickets/issues are resolved. The enhancements/fixes will be included in the next release (Aspose.Cells 23.7), scheduled for the first half of July 2023. You will be notified when the next version is released.

Hello, yep just like Mht and Mhtml, DetectFileFormat is not extension based, so all are Xhtml :slight_smile:

Oops, surprisingly fast :slight_smile:

Another issue with Cells.FileFormatUtil.DetectFileFormat is that it does not detect plain text files.
See my samples:
txt.zip (814 Bytes)
Aspose.Words detects text files perfectly; I'd share the code.

More cases:
xml.xml, which is XML, is not detected as XML
RTF.rtf, which is RTF, is detected as Json
txt.txt, which is plain text, is detected as CSV (to be honest, I think it just searches for commas to detect CSV, nothing more, lol)
xsn.xsn, which is binary, is detected as CSV
pls.pls, a text playlist, is detected as Json
Markdown.md / md.md is not detected
files.zip (47.8 KB)
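
Some of the text formats listed above start with a fixed signature that could be checked before any JSON or CSV heuristic runs. The following is only a minimal Python sketch of that idea, not Aspose.Cells code: `sniff_signature` is a hypothetical helper, and it assumes RTF files begin with `{\rtf` and PLS playlists begin with a `[playlist]` section header.

```python
UTF8_BOM = b"\xef\xbb\xbf"

def sniff_signature(data: bytes) -> str:
    """Return a format name when a known leading signature is present.

    Hypothetical helper for illustration; only two signatures are covered.
    """
    # Skip an optional UTF-8 byte order mark, then leading whitespace.
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    head = data.lstrip(b" \t\r\n")[:16]

    # RTF documents open with "{\rtf"; the brace alone could be
    # mistaken for the start of a JSON object.
    if head.startswith(b"{\\rtf"):
        return "rtf"
    # PLS playlists are INI-style and open with a "[playlist]" section;
    # the bracket alone could be mistaken for a JSON array.
    if head.lower().startswith(b"[playlist]"):
        return "pls"
    return "unknown"
```

A caller would run this check first and only fall through to looser heuristics when it returns "unknown".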

Final list:
3 Excel files
1 css
4 PowerPoint files
All made using latest Office 2021
All detected wrong!
Custom Office Templates.zip (1.7 MB)

@australian.dev.nerds
The 3 Excel files (dif, slk, prn) and the 1 css file in your attached zip are text files, so we cannot detect the file format type from their content. We have no plans to support this.
We will try to support detecting those PowerPoint files: CELLSNET-53580

We will check xsn.xsn.
All the other files are text files, and we cannot get the correct type from their content. If you can pass the file name with the correct extension, we can check the extension to detect the file type.
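
The approach described here, content first and file extension as a fallback for plain-text formats, might look roughly like the Python sketch below. This is an illustration under assumptions, not the Aspose.Cells implementation: `detect_with_fallback` and `EXTENSION_MAP` are hypothetical names, and only two well-known binary signatures (ZIP and OLE compound file) are checked.

```python
import os

# Leading bytes of two common binary containers.
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP local file header
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"    # OLE compound file

# Hypothetical extension table for text formats that carry no
# reliable signature in their content.
EXTENSION_MAP = {
    ".dif": "dif",
    ".slk": "sylk",
    ".prn": "prn",
    ".css": "css",
    ".md": "markdown",
    ".txt": "text",
    ".csv": "csv",
}

def detect_with_fallback(file_name: str, data: bytes) -> str:
    """Prefer a content signature; fall back to the file extension."""
    if data.startswith(ZIP_MAGIC):
        return "zip-container"
    if data.startswith(OLE_MAGIC):
        return "ole-compound"
    ext = os.path.splitext(file_name)[1].lower()
    return EXTENSION_MAP.get(ext, "unknown")
```

The extension is consulted only when the content gives no answer, so a misnamed binary file is still identified by its signature.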

1 Like

Hello, thanks :slight_smile:

First, I could not get a text result for any txt file at all. Words works fine, but Cells never detects real text files.

xml.xml is text? The best way I have found to detect xml files is to load the file into an XDocument: if there is no exception, it's xml. More advanced? Use XmlReader with validation options.
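
The parse-and-see idea can be sketched as follows. This is a Python analogue using the standard library's `xml.etree.ElementTree` rather than the .NET `XDocument` mentioned above, and `is_well_formed_xml` is a hypothetical helper name. Note that it parses the whole input, which is the cost a lightweight detector may want to avoid.

```python
import xml.etree.ElementTree as ET

def is_well_formed_xml(data: bytes) -> bool:
    """Return True when the bytes parse as a well-formed XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        # Raised for empty input, unclosed tags, non-XML text, and so on.
        return False
```

This only checks well-formedness; validating against a schema would need additional tooling.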

Anyway, detection based on the extension is not accurate, and other Aspose products don't rely on the extension. I compared with Words, and Words was correct. No idea why they don't share the same code base!

Finally, which extension is recommended for SpreadsheetML: xlsx, ods, or something else?

And which extension for GraphChart files? Thank you so much

@australian.dev.nerds
1. Xml.xml
Most xml files start with “<?xml”, but your attached xml file has no such header. We will look into whether we can detect it as an xml file.
As a lightweight detector, we do not want to parse the whole file to check the file format.

2. SpreadsheetML
Its extension is xml. It is the Excel 2003 XML data file format.

3. GraphChart
It's an internal file format for OleObject, not a public file format. You can simply treat its extension as bin.
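
Regarding point 1, a lightweight header check that looks only at the first bytes could be sketched like this in Python. It is an illustration, not the detector's actual code: `xml_header_hint` is a hypothetical helper, it assumes UTF-8 or ASCII input, and the "maybe" result for a file that starts with `<` but has no declaration is a guess that HTML and other markup would also match.

```python
UTF8_BOM = b"\xef\xbb\xbf"

def xml_header_hint(head: bytes) -> str:
    """Classify the first bytes of a file as 'declared', 'maybe', or 'no'."""
    # Skip an optional UTF-8 byte order mark, then leading whitespace.
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    head = head.lstrip(b" \t\r\n")

    if head.startswith(b"<?xml"):
        return "declared"   # explicit XML declaration
    if head.startswith(b"<"):
        return "maybe"      # markup without a declaration
    return "no"
```

Only a small prefix of the file needs to be read, for example `xml_header_hint(f.read(64))`.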

@australian.dev.nerds
We can only detect those PowerPoint files as pptx or ppt.
Whether a PowerPoint file is a ppt or pps depends on its content.
We have no plans to support this.
Please use Aspose.Slides to get more details about PowerPoint files.

@australian.dev.nerds
OneNote packages and xsn files are Microsoft Cabinet files.
A Microsoft Cabinet file cannot simply be unpacked with the .NET SDK,
so we cannot check whether the files are xsn or OneNote packages.
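
For reference, the Cabinet container itself can be recognised from its first four bytes, the ASCII signature `MSCF`. The Python sketch below (with the hypothetical helper name `is_cabinet`) shows only that check; consistent with the point above, it cannot tell an xsn file from a OneNote package, because that would require unpacking the cabinet.

```python
CAB_MAGIC = b"MSCF"  # signature at offset 0 of a Microsoft Cabinet file

def is_cabinet(data: bytes) -> bool:
    """Return True when the data starts with the Cabinet signature."""
    return data[:4] == CAB_MAGIC
```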

The issues you have found earlier (filed as CELLSNET-53568, CELLSNET-53571, CELLSNET-53566, CELLSNET-53567) have been fixed in this update. This message was posted using Bugs notification tool by johnson.shi