Gzip vs Tar.Gz

Hello,
Aspose.Email has a FileFormat Tgz, which is Zimbra storage; it is really a tar.gz.
Office MAPI has a format .emz, which is Gzip with an .EMF file inside.

Aspose.Email does not distinguish between these two, tgz and emz.
Since Aspose.ZIP specializes in this field, can it detect the difference somehow? :slight_smile:

sample files:
samples.zip (63.3 KB)

It seems it can't. It is the same situation with *.docx and *.zip: a DOCX is a regular ZIP internally. Aspose.ZIP does not analyze a file based on both its content and its extension, but you can do that yourself. :slight_smile:

Thanks, never trust extension :smiley:

OK, anyway, please advise how to extract .emz files, both to disk and to a memory stream.
A sample of .emz file:
sample.zip (59.5 KB)

Dim MyMemoryStream = New MemoryStream
Using MyFileStream As New FileStream("D:\file.emz", FileMode.Open, FileAccess.Read)
    Using MyEMZArchive As Gzip.GzipArchive = New Gzip.GzipArchive(MyFileStream)
        MyEMZArchive.Open().CopyTo(MyMemoryStream)
        MsgBox(MyEMZArchive.Name)
        MyEMZArchive.Save("D:\out.zzz")
    End Using
End Using

Name is always an empty string, while it should be image.emf.
And Save throws InvalidOperationException: Source has not been supplied.
Using this file:
sample.zip (59.5 KB)

I could not find the difference between Extract and Save.
Also, how do I loop through the items inside a Gzip archive with For Each?
Like what we do with Zip:
For Each MyEntry As ArchiveEntry In MyArchive.Entries

A gzip archive has no entries. It just compresses a single stream. The name of the original file is not always kept in the header.
Extraction to a file in your case:

using System.IO;
using Aspose.Zip.Gzip;

using (FileStream gzipFile = File.Open(@"D:\Work\Agnostic1.emz", FileMode.Open))
{
    // The second argument (true) asks GzipArchive to parse the header so Name is filled in.
    using (var archive = new GzipArchive(gzipFile, true))
    {
        // Fall back to a fixed name when the header does not store the original file name.
        string outputName = string.IsNullOrEmpty(archive.Name) ? "output" : archive.Name;
        using (FileStream output = new FileStream(outputName, FileMode.Create))
        {
            archive.Extract(output);
        }
    }
}

Notice the second parameter in this constructor: it is parseHeader, and passing true is what makes Name available.

Hello and thanks. Yep, I found that Gzip can only have 1 entry, and that it is possible to extract it using the .NET 4.0 System.IO.Compression.GZipStream as well.
Anyway, any idea why WinRAR finds the original file name while Aspose doesn't?

Aspose.ZIP does too, if you set parseHeader = true in the GzipArchive constructor. The sourceStream must be seekable in this case.
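
For anyone curious where that name lives: per RFC 1952, the gzip header has a flags byte, and the optional original file name (FNAME) follows the fixed 10-byte part as a zero-terminated Latin-1 string. The sketch below reads it with plain .NET and no Aspose calls. It is only an illustration of the file format, not how Aspose.ZIP or WinRAR implement it, and GzipHeader / ReadOriginalName are hypothetical names made up for this example.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of Aspose.ZIP.
public static class GzipHeader
{
    // Returns the original file name from the gzip header (RFC 1952 FNAME field),
    // or an empty string when the data is not gzip or no name was stored.
    public static string ReadOriginalName(Stream gzip)
    {
        var fixedPart = new byte[10];
        if (ReadFully(gzip, fixedPart) < 10) return "";
        if (fixedPart[0] != 0x1F || fixedPart[1] != 0x8B) return ""; // not gzip

        int flags = fixedPart[3];
        const int FEXTRA = 0x04, FNAME = 0x08;

        if ((flags & FEXTRA) != 0)
        {
            // Optional extra field: 2-byte little-endian length, then that many bytes.
            var len = new byte[2];
            if (ReadFully(gzip, len) < 2) return "";
            int xlen = len[0] | (len[1] << 8);
            var extra = new byte[xlen];
            if (ReadFully(gzip, extra) < xlen) return "";
        }

        if ((flags & FNAME) == 0) return ""; // the compressor did not store a name

        // Zero-terminated Latin-1 string: each byte maps directly to one char.
        var name = new StringBuilder();
        int b;
        while ((b = gzip.ReadByte()) > 0) name.Append((char)b);
        return b == 0 ? name.ToString() : ""; // -1 means the header was truncated
    }

    // Reads until the buffer is full or the stream ends; returns bytes read.
    private static int ReadFully(Stream s, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = s.Read(buffer, total, buffer.Length - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

Calling GzipHeader.ReadOriginalName on an opened FileStream for the .emz should return the stored name if the file carries one, and an empty string for gzip files written without FNAME.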

Sorry, but .emz is Gzip, while TGZ is tar.gz, that is, a Tar archive compressed with Gzip.
So emz is plain Gzip and TGZ is a gzipped Tar, am I right? Still no way to distinguish? :slight_smile:

Do you mean distinguish *.tar.gz from *.gz?

Yep, possible? :slight_smile:

Currently no.
Aspose.ZIP detects the gzip format based on the signature bytes characteristic of that format. In order to detect whether it is a compressed tar, or cpio (which is similar to tar), or just any other file, we would have to extract the gzip first and then look into the extracted bytes. The format detection feature does not extract.
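
To make the limitation concrete, here is a rough sketch of what a signature-only check looks like, assuming the standard gzip magic bytes 1F 8B followed by compression method 8 (deflate) from RFC 1952. It is not Aspose.ZIP's actual detection code, and GzipSignature / LooksLikeGzip are hypothetical names. An .emz and a .tgz both pass this check, which is why the signature alone cannot tell them apart.

```csharp
using System.IO;

// Hypothetical helper for illustration; not Aspose.ZIP's detection code.
public static class GzipSignature
{
    // True when the stream starts with the gzip magic bytes and the deflate method id.
    public static bool LooksLikeGzip(Stream s)
    {
        int id1 = s.ReadByte();
        int id2 = s.ReadByte();
        int method = s.ReadByte();
        return id1 == 0x1F && id2 == 0x8B && method == 8;
    }
}
```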

My suggestion is to extract the gzip and then detect whether the extracted file is a tar archive or not.
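
A minimal sketch of that suggestion, using only System.IO.Compression.GZipStream from .NET rather than Aspose.ZIP: decompress just the first 512-byte block and look for the "ustar" magic that POSIX and GNU tar headers store at offset 257. TarSniffer and GzipContainsTar are hypothetical names for this example. Very old (pre-POSIX) tar files carry no magic, so this check would report them as non-tar; a header checksum test would be needed to cover those.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration; uses only the .NET standard library.
public static class TarSniffer
{
    // True when the gzip stream decompresses to data that starts with a ustar-style tar header.
    public static bool GzipContainsTar(Stream gzip)
    {
        var block = new byte[512]; // a tar header occupies one 512-byte block
        int total = 0;

        // leaveOpen: true so the caller keeps ownership of the source stream.
        using (var inflater = new GZipStream(gzip, CompressionMode.Decompress, leaveOpen: true))
        {
            while (total < block.Length)
            {
                int n = inflater.Read(block, total, block.Length - total);
                if (n == 0) break; // decompressed data ended early
                total += n;
            }
        }

        if (total < block.Length) return false; // too short to hold a tar header

        // POSIX tar stores "ustar\0", GNU tar stores "ustar  \0"; both begin with "ustar".
        return Encoding.ASCII.GetString(block, 257, 5) == "ustar";
    }
}
```

With this, a Zimbra .tgz should come back true and an .emz false, since the EMF inside does not start with a tar header. If the input is not gzip at all, GZipStream throws InvalidDataException, so callers may want to catch that.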