Detect file types from stream

I need to be able to detect which kind of office file a stream contains: doc(x), pdf, xls(x) or ppt(x). Is this possible?

Hi Derek,


Thanks for your inquiry. Yes, you can detect the file format of MS Word documents by using Aspose.Words. Please use the following code snippet.


// Needs: using System; using System.IO; using Aspose.Words;
// Load a document that has no file extension into a stream and use the
// DetectFileFormat method to detect its format. This is useful when the
// format is not visible from the file name.
// The file format of this document is actually ".doc".
using (FileStream docStream = File.OpenRead(MyDir + "Document.FileWithoutExtension"))
{
    FileFormatInfo info = FileFormatUtil.DetectFileFormat(docStream);

    Console.WriteLine("The document format is: " + FileFormatUtil.LoadFormatToExtension(info.LoadFormat));
    Console.WriteLine("Document is encrypted: " + info.IsEncrypted);
    Console.WriteLine("Document has a digital signature: " + info.HasDigitalSignature);
}


With FileFormatInfo, you can detect only the file formats that Aspose.Words supports. Please see the following documentation links for reference:
http://www.aspose.com/docs/display/wordsnet/How+to++Detect+the+File+Format
http://www.aspose.com/docs/display/wordsnet/FileFormatUtil+Class
http://www.aspose.com/docs/display/wordsnet/FileFormatInfo+Class

I am moving this thread to the Aspose.Total forum. My colleagues from the Aspose.Pdf, Aspose.Slides and Aspose.Cells teams will reply to you shortly about detecting the other file formats (pdf, xls(x), ppt(x)).


Hi,

I am from the Aspose.Cells team and would like to help you with detecting Excel spreadsheet file formats. You may use the static CellsHelper.DetectFileFormat() method to get the file format type. See the sample code below:

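Sample code:
// Needs: using System.IO; using System.Windows.Forms; using Aspose.Cells;
using (FileStream stream = File.OpenRead(@"e:\test2\Book1.xlsx"))
{
    // Detect the format directly from the stream; there is no need to copy it into a buffer first.
    MessageBox.Show(CellsHelper.DetectFileFormat(stream).ToString());

    // Rewind before loading, in case detection advanced the stream position.
    stream.Position = 0;
    Workbook wb = new Workbook(stream);
}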

Thank you.

Hi Derek,


I represent Aspose.Slides.

Aspose.Slides can identify the presentation format. Please use the following sample code for this purpose, and let me know if I can help you further.

PresentationEx pres = new PresentationEx(PresentationStream);

SourceFormatEx format = pres.SourceFormat;

Many Thanks,

Many thanks. It would be useful for a product like Aspose.Total to include a single mechanism to detect all file types from a stream. This would also be a differentiator, driving decisions to purchase the Total product rather than individual products. Can you put this on a “would like to have” list?

Hi Derek,


Thanks for contacting support.

I am a representative from the Aspose.Pdf team. The simplest way to determine whether the file in a stream is a PDF is to try initializing an Aspose.Pdf.Document object inside a try-catch block. If no exception is thrown while reading the source file, the input document is in PDF format. Please take a look at the following code snippet.

[C#]

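// Needs: using System; using System.IO; using Aspose.Pdf;
// The sample path points at a .pptx file, so the catch branch is expected to run here.
FileStream inputfile = new FileStream(@"C:\pdftest\New Microsoft PowerPoint Presentation.pptx", FileMode.Open);
try
{
    Document doc = new Document(inputfile);
    Console.WriteLine("Input document is a proper PDF");
}
catch (Aspose.Pdf.Exceptions.InvalidPdfFileFormatException ex)
{
    Console.WriteLine("Source file is not PDF. " + ex.Message);
}
finally
{
    // Close the stream whether or not the document loaded.
    inputfile.Close();
}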

Besides this, you may consider determining the MIME type of the file. You may refer to the discussion at this link.

Hi Derek,

Aspose.Total is a suite of individual products targeted at a particular platform. Each file type is handled by a separate product (all of them are packaged inside the Total package), so you need to use the specific product that deals with each file format.
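
For readers who want a single entry point in the meantime, one possible approach is to look at the leading bytes of the stream and then hand it to the matching product. The sketch below is not an Aspose API: `StreamSniffer`, `SniffedFormat` and `Sniff` are hypothetical names introduced for illustration. It assumes a seekable stream whose content starts at position 0, and it relies on well-known signatures: `%PDF-` for PDF, `PK\x03\x04` for ZIP-based files (docx, xlsx, pptx), and the OLE compound file header shared by legacy doc, xls and ppt files. Legacy files cannot be told apart by signature alone, so they would still need the product-specific detectors shown earlier in this thread.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical names for illustration only; not part of any Aspose product.
public enum SniffedFormat { Unknown, Pdf, OleCompound, Word, Excel, PowerPoint, OtherZip }

public static class StreamSniffer
{
    static readonly byte[] PdfSig = { 0x25, 0x50, 0x44, 0x46, 0x2D };                    // "%PDF-"
    static readonly byte[] OleSig = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };  // doc/xls/ppt container
    static readonly byte[] ZipSig = { 0x50, 0x4B, 0x03, 0x04 };                          // "PK\x03\x04"

    // Assumption: the stream is seekable and the file content starts at position 0.
    public static SniffedFormat Sniff(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("Stream must be seekable.", nameof(stream));

        try
        {
            stream.Position = 0;
            byte[] head = new byte[8];
            int read = 0, n;
            while (read < head.Length && (n = stream.Read(head, read, head.Length - read)) > 0)
                read += n;

            if (StartsWith(head, read, PdfSig)) return SniffedFormat.Pdf;
            if (StartsWith(head, read, OleSig)) return SniffedFormat.OleCompound;
            if (!StartsWith(head, read, ZipSig)) return SniffedFormat.Unknown;

            // OOXML packages keep their main parts under word/, xl/ or ppt/.
            stream.Position = 0;
            try
            {
                using (var zip = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true))
                {
                    foreach (ZipArchiveEntry entry in zip.Entries)
                    {
                        string name = entry.FullName;
                        if (name.StartsWith("word/", StringComparison.OrdinalIgnoreCase)) return SniffedFormat.Word;
                        if (name.StartsWith("xl/", StringComparison.OrdinalIgnoreCase)) return SniffedFormat.Excel;
                        if (name.StartsWith("ppt/", StringComparison.OrdinalIgnoreCase)) return SniffedFormat.PowerPoint;
                    }
                }
            }
            catch (InvalidDataException)
            {
                // ZIP signature present but the archive could not be read.
            }
            return SniffedFormat.OtherZip;
        }
        finally
        {
            // Rewind so the caller can pass the same stream to the matching product.
            stream.Position = 0;
        }
    }

    static bool StartsWith(byte[] head, int count, byte[] sig)
    {
        if (count < sig.Length) return false;
        for (int i = 0; i < sig.Length; i++)
            if (head[i] != sig[i]) return false;
        return true;
    }
}
```

A caller could switch on the result, for example passing `OleCompound` and `Word` streams to the Aspose.Words detector and `Excel` streams to the Aspose.Cells one. A signature match only shows that the header looks right, not that the file is valid, so the product-specific check remains the authority. On .NET Framework this needs a reference to the System.IO.Compression assembly.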

The MIME type for an Excel document is being returned as application/octet-stream, yet the document is correctly recognised by Aspose.Cells. I cannot rely on the MIME type being correct, even though the documents were created by the application provider using Aspose.

The MIME type for an Excel document is being returned as application/octet-stream, yet the document is correctly recognised by Aspose.Words. I cannot rely on the MIME type being correct, even though the documents were created by the application provider using Aspose.

PresentationEx is not a type in the Aspose.Slides version I am working with.

Hi,


Please try the latest version, Aspose.Slides for .NET 6.9.0, and import the Aspose.Slides.Pptx namespace in your sample code. If there is still an issue, please share the details with us.

Many Thanks,