Detect file types from stream

I need to detect which kind of office file a stream contains: doc(x), pdf, xls(x), or ppt(x). Is this possible?

Hi Derek,

Thanks for your inquiry. Yes, you can detect the file format of MS Word documents with Aspose.Words. Please use the following code snippet.

// Load a document that has no file extension into a stream and use the DetectFileFormat method to detect its format.
// The file format of this document is actually ".doc".
using (FileStream docStream = File.OpenRead(MyDir + @"Document.FileWithoutExtension"))
{
    FileFormatInfo info = FileFormatUtil.DetectFileFormat(docStream);
    Console.WriteLine("The document format is: " + FileFormatUtil.LoadFormatToExtension(info.LoadFormat));
    Console.WriteLine("Document is encrypted: " + info.IsEncrypted);
    Console.WriteLine("Document has a digital signature: " + info.HasDigitalSignature);
}

With FileFormatInfo, you can detect only the file formats listed in the LoadFormat enumeration: http://www.aspose.com/docs/display/wordsnet/LoadFormat+Enumeration

I am moving this thread to the Aspose.Total forum. My colleagues from the Aspose.PDF, Aspose.Slides, and Aspose.Cells teams will reply to you shortly about the detection of file formats (pdf, xls(x), ppt(x)).

Hi,

I am from the Aspose.Cells team and would like to help you detect Excel spreadsheet file formats. You can use the static CellsHelper.DetectFileFormat() method to get the file format type. See the sample code below:

Sample code:
// Open the spreadsheet as a stream.
FileStream stream = File.OpenRead(@"e:\test2\Book1.xlsx");

// Detect the format, then rewind in case detection advanced the stream.
MessageBox.Show(CellsHelper.DetectFileFormat(stream).ToString());
stream.Position = 0;

// The same stream can then be loaded as a workbook.
Workbook wb = new Workbook(stream);

Thank you.

Hi Derek,

I represent Aspose.Slides.

Aspose.Slides can identify the format of a presentation. Please use the following sample code, and let me know if I can help you further.

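// PresentationStream is the stream that contains the presentation.
PresentationEx pres = new PresentationEx(PresentationStream);

// SourceFormat reports the format the presentation was loaded from.
SourceFormatEx format = pres.SourceFormat;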

Many Thanks,

Many thanks. It would be useful for a product like Aspose.Total to include a single mechanism to detect all file types from a stream. This would also be a differentiator driving decisions to purchase the total product rather than individual products. Can you put this on a “would like to have” list?

Hi Derek,

Thanks for contacting support.

I am a representative from the Aspose.Pdf team. The simplest way to determine if the source file in a stream is a PDF is to try initializing an Aspose.Pdf.Document object in a Try-Catch block, and if you do not encounter any exception while reading the source file, the input document is in PDF format. Please take a look at the following code snippet.

[C#]

// The sample path points at a .pptx file, so the catch branch is expected to run.
using (FileStream inputfile = new FileStream(@"C:\pdftest\New Microsoft PowerPoint Presentation.pptx", FileMode.Open))
{
    try
    {
        // Loading succeeds only if the stream holds a valid PDF.
        Document doc = new Document(inputfile);
        Console.WriteLine("Input document is a proper PDF");
    }
    catch (Aspose.Pdf.Exceptions.InvalidPdfFileFormatException ex)
    {
        Console.WriteLine("Source file is not a PDF. " + ex.Message);
    }
}

Besides this, you may consider determining the MIME type of the file. You may refer to the discussion over this link.
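
As a rough illustration of content-based detection that does not depend on a reported MIME type, the leading bytes of the stream can be compared against well-known file signatures. The sketch below is not part of any Aspose API: the class name SignatureSniffer, the method Sniff, and the returned labels are hypothetical. It assumes a seekable stream and checks only offset 0. Note that the OLE2 and ZIP signatures identify a container only, so they cannot by themselves tell doc from xls or ppt, or docx from xlsx or pptx.

```csharp
using System;
using System.IO;

// Hypothetical helper, not an Aspose type.
public static class SignatureSniffer
{
    // Well-known leading byte sequences ("magic numbers").
    private static readonly byte[] Pdf = { 0x25, 0x50, 0x44, 0x46, 0x2D };                      // "%PDF-"
    private static readonly byte[] Ole2 = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };  // legacy doc/xls/ppt container
    private static readonly byte[] Zip = { 0x50, 0x4B, 0x03, 0x04 };                           // "PK\x03\x04", container for docx/xlsx/pptx

    // Returns "pdf", "ole2", "zip", or "unknown" and restores the stream position.
    public static string Sniff(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("A seekable stream is required.", "stream");

        long start = stream.Position;
        byte[] header = new byte[8];
        int read = 0;

        // Stream.Read may return fewer bytes than requested, so loop until full or end of stream.
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) break;
            read += n;
        }
        stream.Position = start; // leave the stream where the caller had it

        if (StartsWith(header, read, Pdf)) return "pdf";
        if (StartsWith(header, read, Ole2)) return "ole2";
        if (StartsWith(header, read, Zip)) return "zip";
        return "unknown";
    }

    private static bool StartsWith(byte[] buffer, int length, byte[] signature)
    {
        if (length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (buffer[i] != signature[i]) return false;
        }
        return true;
    }
}
```

A call such as SignatureSniffer.Sniff(inputfile) could be used as a cheap pre-check before handing the stream to the matching product. It only narrows the candidates; the product-specific detection methods remain the authoritative check.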

dl129302:
Many thanks. It would be useful for a product like Aspose.Total to include a single mechanism to detect all file types from a stream. This would also be a differentiator driving decisions to purchase the total product rather than individual products. Can you put this on a "would like to have" list?

Hi Derek,

Aspose.Total is a suite of individual products targeted at a particular platform. Each file type is handled by a separate product (all of which are packaged inside Aspose.Total), so you need to use the specific product for each file format.
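
To decide which product a stream should be routed to, one possible approach for the newer docx, xlsx, and pptx formats is to open the stream as a ZIP archive and look for the main part each format conventionally contains. This is only a sketch using System.IO.Compression from the .NET Framework, not an Aspose feature: the class name OoxmlKind and the method Detect are hypothetical, a seekable stream is assumed, and the part names are the ones Office conventionally writes rather than a guarantee for every producer. Macro-enabled and template variants use the same part names and are not distinguished here.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, not an Aspose type.
public static class OoxmlKind
{
    // Returns "docx", "xlsx", "pptx", "zip" (some other archive), or "unknown".
    public static string Detect(Stream stream)
    {
        long start = stream.Position;
        try
        {
            // leaveOpen: true keeps the caller's stream usable afterwards.
            using (var zip = new ZipArchive(stream, ZipArchiveMode.Read, true))
            {
                // Assumption: these are the conventional main-part names.
                if (zip.GetEntry("word/document.xml") != null) return "docx";
                if (zip.GetEntry("xl/workbook.xml") != null) return "xlsx";
                if (zip.GetEntry("ppt/presentation.xml") != null) return "pptx";
                return "zip";
            }
        }
        catch (InvalidDataException)
        {
            // The stream does not contain a readable ZIP archive.
            return "unknown";
        }
        finally
        {
            stream.Position = start; // restore the caller's position
        }
    }
}
```

The returned label could then select the product to use, for example Aspose.Words for "docx" or Aspose.Cells for "xlsx", with that product's own detection method confirming the result.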

The MIME type for an Excel document is being returned as application/octet-stream, yet the document is correctly recognised by Aspose.Cells. I cannot rely on the MIME type being correct, even though the documents were created by the application provider using Aspose.

The MIME type for a Word document is being returned as application/octet-stream, yet the document is correctly recognised by Aspose.Words. I cannot rely on the MIME type being correct, even though the documents were created by the application provider using Aspose.

PresentationEx is not a type in the Aspose.Slides version I am working with.

Hi,

Please try the latest version, Aspose.Slides for .NET 6.9.0, and import the Aspose.Slides.Pptx namespace in your sample code. If the issue persists, please let us know.

Many Thanks,