I cannot extract messages from some mbox files

mbox.zip (18.2 KB)

Can you figure out the problem?
I used the same sample code that your website provides.

@HM_Company,

Can you please share the source code you used, along with details of the issue? You have also reported an issue related to OST file extraction. Before sharing, please verify whether the MBOX issue is of the same nature as the OST one.

Here is my code.

private void button5_Click(object sender, EventArgs e)
    {
        OpenFileDialog docBrowse1 = new OpenFileDialog();
        if (docBrowse1.ShowDialog() == DialogResult.OK)
        {
            string path = docBrowse1.FileName;
            string folderpath = "\\\\?\\" + Path.GetDirectoryName(path) + "\\Extracted→" + Path.GetFileName(path);
            Directory.CreateDirectory(folderpath);

            MboxrdStorageReader reader = new MboxrdStorageReader(path, true);
            // This file does contain messages, but the count is reported as 0
            MessageBox.Show("Total number of messages in Mbox file: " + reader.GetTotalItemsCount(), "dgSearch");

            // Start reading messages
            Aspose.Email.MailMessage message = reader.ReadNextMessage();

            // Read all messages in a loop
            while (message != null)
            {

                // Save this message in EML or MSG format
                message.Save(folderpath + "\\" + GetFileName(message.Subject, message.Date) + ".eml", Aspose.Email.SaveOptions.DefaultEml);

                // Get the next message
                message = reader.ReadNextMessage();
            }
            // Close the streams
            reader.Dispose();



        }
        MessageBox.Show("Complete", "dgSearch");

    }
    private static string GetFileName(string subject, DateTime time)
    {
        Random r = new Random();
        string fileName = "";

        if (subject == null || subject.Length == 0)
        {
             fileName = "NoSubject";
            return fileName + "_" + r.Next(1, 1000); 
        }
        else
        {
            // DateTime is a value type and can never be null, so no null check is needed
            fileName = time.ToString("yyyy-MM-dd HHmmss") + "_";

            for (int i = 0; i < subject.Length; i++)
            {
                if (subject[i] > 31 && subject[i] < 127)
                {
                    fileName += subject[i];
                }
            }
            
            fileName = fileName.Replace("\\", "_");
            fileName = fileName.Replace("/", "_");
            fileName = fileName.Replace(":", "_");
            fileName = fileName.Replace("*", "_");
            fileName = fileName.Replace("?", "_");
            fileName = fileName.Replace("\"", "_");
            fileName = fileName.Replace("<", "_");
            fileName = fileName.Replace(">", "_");
            fileName = fileName.Replace("|", "_");
            fileName = fileName.Replace("\n", "");
            fileName = fileName.Replace("\r", "");
            fileName = fileName.Replace("\t", "");
            fileName = fileName.Replace("\u000e", "");
            fileName = Regex.Replace(fileName, "[\uFFFD*?<>/:@,\\.\";'\\\\🔴]", "_"); // Your code added

            return fileName + "_" + r.Next(1, 100000);
        }
    }

The point is that reader.GetTotalItemsCount() reports 0 messages for the mbox file.

I think it could not read the mbox file properly.

I will wait for your answer. Thank you.

@HM_Company,

I have worked with the sample files you shared, and there appears to be an issue with reading the MBOX file contents. A ticket with ID EMAILNET-39858 has been created in our issue tracking system to investigate and resolve it. This thread has been linked to the ticket so that you will be notified once the issue is fixed.

The issues you have found earlier (filed as EMAILNET-39858) have been fixed in this update.

I tested the same file in version 20.6.0, but the problem was not fixed at all. Are you sure it’s been fixed in this update?

@HM_Company,

Actually, the sample mbox files that you provided were created by the Eudora mail client.
This format, called MBOXO, is a variant of the MBOX format and was not supported by Aspose.Email.

So, we have added support for the MBOXO file format used by the Eudora email client.
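
For background, the commonly documented difference between the mboxo and mboxrd variants is how they quote body lines that start with "From ", which would otherwise look like a message separator. The sketch below only illustrates those two quoting rules. The class and method names are hypothetical, it is not Aspose.Email code, and this thread does not say why the Eudora files were not recognized by MboxrdStorageReader.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of Aspose.Email.
public static class MboxQuoting
{
    // Matches a body line made of zero or more '>' followed by "From ".
    private static readonly Regex QuotableLine = new Regex("^>*From ");

    // Matches a body line that carries at least one quoting '>'.
    private static readonly Regex QuotedLine = new Regex("^>+From ");

    // mboxo: only a line that starts with "From " gets a leading '>'.
    // A line that already reads ">From " is left alone, so the quoting
    // cannot always be reversed.
    public static string QuoteMboxo(string line)
    {
        return line.StartsWith("From ", StringComparison.Ordinal) ? ">" + line : line;
    }

    // mboxrd: any line matching ">*From " gets one more '>'.
    public static string QuoteMboxrd(string line)
    {
        return QuotableLine.IsMatch(line) ? ">" + line : line;
    }

    // mboxrd: reading removes exactly one '>' from lines matching ">+From ".
    public static string UnquoteMboxrd(string line)
    {
        return QuotedLine.IsMatch(line) ? line.Substring(1) : line;
    }
}
```

With these rules, a body line ">From me" stays unchanged under mboxo but becomes ">>From me" under mboxrd, which is why a reader has to know which variant it is dealing with.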

An MboxoStorageReader class has been added to the API:

MboxoStorageReader reader = new MboxoStorageReader(fileName, true);

An MboxStorageReader.CreateReader factory method has also been added for convenience.
It automatically detects the mbox variant and creates the corresponding reader instance.

To solve the issue described, the code sample should be as follows:

// Use the factory method to get the right reader instance for the file's mbox variant.
MboxStorageReader reader = MboxStorageReader.CreateReader(path, true);

Console.WriteLine("Total number of messages in Mbox file: " + reader.GetTotalItemsCount());

// Start reading messages
Aspose.Email.MailMessage message = reader.ReadNextMessage();

// Read all messages in a loop
while (message != null)
{
    // Save this message in EML or MSG format
    message.Save(folderpath + "\\" + GetFileName(message.Subject, message.Date) + ".eml", Aspose.Email.SaveOptions.DefaultEml);

    // Get the next message
    message = reader.ReadNextMessage();
}

// Close the streams
reader.Dispose();
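
One side note on the GetFileName helper used in the loop: it relies on a random suffix to keep names unique, so two messages with the same subject and date could still collide and overwrite each other. A minimal sketch of an alternative is shown below, assuming it is acceptable to replace unsupported characters with "_" (the original helper drops non-printable characters instead). SafeFileNames, Sanitize and MakeUnique are hypothetical names and use only standard .NET calls.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of Aspose.Email.
public static class SafeFileNames
{
    // Replaces every character that is not printable ASCII, or that the
    // current platform rejects in file names, with an underscore.
    public static string Sanitize(string subject)
    {
        if (string.IsNullOrWhiteSpace(subject))
        {
            return "NoSubject";
        }

        char[] invalid = Path.GetInvalidFileNameChars();
        var sb = new StringBuilder(subject.Length);
        foreach (char c in subject)
        {
            bool printableAscii = c > 31 && c < 127;
            sb.Append(printableAscii && !invalid.Contains(c) ? c : '_');
        }
        return sb.ToString();
    }

    // Appends _1, _2, ... until the path does not exist yet, so an
    // existing file is never overwritten.
    public static string MakeUnique(string folder, string baseName, string extension)
    {
        string candidate = Path.Combine(folder, baseName + extension);
        for (int i = 1; File.Exists(candidate); i++)
        {
            candidate = Path.Combine(folder, baseName + "_" + i + extension);
        }
        return candidate;
    }
}
```

In the loop above, the path passed to message.Save could then be built as SafeFileNames.MakeUnique(folderpath, SafeFileNames.Sanitize(message.Subject), ".eml").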