Many compressed files are detected as Text

Hello,
There’s a critical issue in Aspose.Words file format detection.
To demonstrate it, I have attached a few compressed file samples.
Run file format detection against them and every one of them is reported as Text!
Arj, Zip, bz2, lzh and xz are all detected as Text.
I packed them all in a zip file:

Samples.zip (7.1 MB)

Anyone who relies on Aspose.Words file format detection, whether for loading or just for identifying formats, will be in trouble!
I agree that detecting the Text file format is one of the hardest tasks in programming, but the internal logic for Text detection seems to be badly broken.
Could you please test and confirm?
Thanks.

@australian.dev.nerds
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-26201

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Hello
Since the issue is now in the Analysis Complete status, may I know the results? :slight_smile:

@australian.dev.nerds All of the provided files are detected as Encoded text by Word, except for 2030 Task.zip, which cannot be opened in Word at all.

So Aspose.Words mimics Word behavior. Most likely the issue will be closed as Not a Bug.

Hello
A few things to consider. First, is the only purpose of detect file format to pass its result to the load function?
Second, what happens if Word itself behaves unexpectedly or incorrectly?

Word detects many file types as encoded text, so won't detect file format lead developers into invalid situations by returning incorrect results?

In particular, 2030 Task.zip cannot be opened by Word at all, so should detect file format report a zip file as Text or as Unknown?!

Just my personal opinion :slight_smile:
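
To make the point concrete, here is a minimal sketch of the kind of pre-check a caller could run before trusting a Text result. It is not part of Aspose.Words and does not reflect how the library detects formats; the function names `sniff_archive` and `sniff_archive_file` are hypothetical, and the signatures are the commonly documented magic bytes for the formats in the samples. Note that a ZIP signature also matches DOCX and other ZIP-based documents, so a "zip" result only means "ZIP container", not "not a document".

```python
from typing import Optional

# (offset, signature bytes, label) for the archive types mentioned in this thread.
# These are commonly documented magic numbers, not anything taken from Aspose.Words.
_SIGNATURES = [
    (0, b"PK\x03\x04", "zip"),       # regular ZIP local file header
    (0, b"PK\x05\x06", "zip"),       # empty ZIP (end of central directory only)
    (0, b"BZh", "bz2"),              # bzip2 stream header
    (0, b"\xfd7zXZ\x00", "xz"),      # xz stream header: FD 37 7A 58 5A 00
    (0, b"\x60\xea", "arj"),         # ARJ header id (only 2 bytes, so a weak check)
]


def sniff_archive(header: bytes) -> Optional[str]:
    """Return an archive label if the header starts with a known signature, else None."""
    for offset, sig, label in _SIGNATURES:
        if header[offset:offset + len(sig)] == sig:
            return label
    # LHA/LZH stores a method id such as "-lh5-" starting at offset 2.
    if len(header) >= 7 and header[2:5] == b"-lh" and header[6:7] == b"-":
        return "lzh"
    return None


def sniff_archive_file(path: str) -> Optional[str]:
    """Hypothetical helper: read the first bytes of a file and sniff them."""
    with open(path, "rb") as f:
        return sniff_archive(f.read(16))
```

If `sniff_archive_file` returns a label for a file that detection reported as Text, the caller can treat the file as Unknown instead of passing it to the load function.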

@australian.dev.nerds Thank you for your feedback. I have forwarded the information to our development team.

Hello,
Any news? :slight_smile:

@australian.dev.nerds Unfortunately, there is no additional news regarding this issue yet.

The issues you found earlier (filed as WORDSNET-26201) have been fixed in the Aspose.Words for .NET 24.11 update, which is also available on NuGet.