Words.FileFormatUtil.DetectFileFormat Error

Hello,
Cells.FileFormatUtil.DetectFileFormat detects the following files correctly, but Words.FileFormatUtil.DetectFileFormat returns wrong results for them:

samples.zip (2.3 MB)

filelist.xml is detected as Text, which is wrong: it is XML.

image.svg is detected as XML, which is wrong: it is SVG.

image.emz, image.wmz and image.svgz are detected as Text, which is wrong: they are Gzip-compressed files, anything but text.

oga.zip, which is a ZIP archive, is detected as Text.
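
For reference, the Gzip and ZIP cases above can be told apart from text by their leading bytes alone. This is a minimal sketch in Python, not the Aspose API; `sniff_container` is a hypothetical helper name. It checks the standard Gzip signature `1F 8B` and the `PK` signatures used by ZIP:

```python
# Leading-byte signatures of the two container formats.
GZIP_MAGIC = b"\x1f\x8b"
ZIP_MAGICS = (
    b"PK\x03\x04",  # local file header (normal archive)
    b"PK\x05\x06",  # end of central directory (empty archive)
    b"PK\x07\x08",  # spanned archive marker
)


def sniff_container(data: bytes) -> str:
    """Return "gzip", "zip" or "unknown" based on the first bytes only."""
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    if data.startswith(ZIP_MAGICS):
        return "zip"
    return "unknown"
```

Reading the first four bytes of a file is enough for this check, so it can run before any text heuristics are tried.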

Thanks

I also exported some numbers from my Apple ID on the web, and Words DetectFileFormat does not detect them at all:
Apple.zip (975.0 KB)

All JXL images are also detected as Text. Open them in Notepad: they cannot possibly be text.

images.zip (342.2 KB)

A ZIP file is also detected as Text:
2030 Task.zip (5.4 KB)

Finally, some more files with different binary contents and extensions, all detected as Text:
more.zip (7.3 MB)
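
A rough way to rule out "Text" for binary files like these is to require that the content decodes cleanly and contains no control characters. The sketch below is in Python and is only a heuristic under stated assumptions: it assumes UTF-8 (so UTF-16 text would be rejected), and `looks_like_text` is a hypothetical name, not part of Aspose.Words.

```python
def looks_like_text(data: bytes) -> bool:
    """Heuristic: True if data is valid UTF-8 without control characters."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Invalid UTF-8 sequences are typical of binary content.
        return False
    # Allow tab, line feed and carriage return; reject other C0 controls,
    # including NUL, which plain text files normally do not contain.
    return not any(ord(ch) < 0x20 and ch not in "\t\n\r" for ch in text)
```

A bare JPEG XL codestream starts with the bytes `FF 0A`, which is not valid UTF-8, so this check would already reject it.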

@australian.dev.nerds
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-25552,WORDSNET-25553,WORDSNET-25554,WORDSNET-25555,WORDSNET-25556,WORDSNET-25557

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Regarding SVG: since it is not supported as a load format by Aspose.Words, it is properly detected as XML, because SVG is actually an XML document.

Thanks. The last one is this XML file, which is detected as Text:
filelist.zip (255 Bytes)

The best way I have found to detect XML files is to load the file into an XDocument: if no exception is thrown, it is XML. For a more advanced check, use an XmlReader with validation options.
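
The same parse-and-catch idea can be sketched outside .NET. The example below uses Python's standard `xml.etree.ElementTree` as a stand-in for XDocument. It checks well-formedness only, not schema validity, and `is_well_formed_xml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if data parses as a well-formed XML document."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # Raised for malformed markup, plain text, or empty input.
        return False
    return True
```

Note that a file containing only `<xml/>` passes this check even though it has no XML declaration.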

@australian.dev.nerds Thank you for the additional information. I have added the file to the WORDSNET-25552 defect. We will keep you updated and let you know once the issues are resolved.

Hello,
Sorry, just some more samples: all are JS files, but they are detected as HTML and Markdown.
JS.zip (43.1 KB)

@australian.dev.nerds
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-25562

Hello
One more item, if you don't mind: many RAR archives are detected as Text.
I could not attach a .rar file, so I zipped it:
Logic Gates.zip (794 Bytes)
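
RAR archives also carry a fixed signature at the start of the file, so they can be separated from text without parsing the archive. This is a minimal Python sketch; `rar_flavor` is a hypothetical name, and self-extracting archives, where the signature sits after an executable stub, are not handled.

```python
# RAR 1.5 to 4.x archives start with "Rar!" 1A 07 00;
# RAR 5 archives start with "Rar!" 1A 07 01 00.
RAR4_MAGIC = b"Rar!\x1a\x07\x00"
RAR5_MAGIC = b"Rar!\x1a\x07\x01\x00"


def rar_flavor(data: bytes):
    """Return "rar5", "rar4" or None based on the leading signature."""
    if data.startswith(RAR5_MAGIC):
        return "rar5"
    if data.startswith(RAR4_MAGIC):
        return "rar4"
    return None
```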

@australian.dev.nerds
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-25614

The issues you found earlier (filed as WORDSNET-25552) have been fixed in this Aspose.Words for .NET 24.4 update, which is also available on NuGet.

@australian.dev.nerds The issue WORDSNET-25552 has been closed as Won't Fix. For the moment this is expected Aspose.Words behavior. The attached files contain a single <xml> element. The node type of this element is Element, not XmlDeclaration. MS Word detects such content as XML if the file extension is *.xml, and as Text otherwise.
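
The rule described above can be illustrated with a short Python sketch. It is not the Aspose.Words or MS Word implementation. The function names are hypothetical, only ASCII-compatible encodings are handled, and treating content that has an XML declaration as XML regardless of extension is an assumption made for the example.

```python
UTF8_BOM = b"\xef\xbb\xbf"
# "<?xml" must be followed by whitespace to be an XML declaration
# (this excludes processing instructions such as "<?xml-stylesheet").
XML_DECL_PREFIXES = (b"<?xml ", b"<?xml\t", b"<?xml\r", b"<?xml\n")


def has_xml_declaration(data: bytes) -> bool:
    """True if data starts with an XML declaration (after an optional BOM)."""
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data.startswith(XML_DECL_PREFIXES)


def classify(data: bytes, filename: str) -> str:
    """Model of the described rule: declaration or *.xml extension means XML."""
    if has_xml_declaration(data):
        return "xml"  # assumption: a declaration alone is enough
    if filename.lower().endswith(".xml"):
        return "xml"
    return "text"
```

Under this model, `classify(b"<xml/>", "filelist.xml")` gives `"xml"`, while the same bytes without the `.xml` extension give `"text"`.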