Parsing the details via .NET in an efficient manner

Hello @MaazHussain,

To optimize your file parsing, we recommend using the EnumerateMessagesEntryId() method. This approach retrieves only the message entry IDs first, reducing unnecessary processing, and then extracts messages as needed:

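// Namespaces assumed by the snippets in this reply:
// System, System.Collections.Generic, Aspose.Email.Mapi,
// Aspose.Email.Storage.Pst, Newtonsoft.Json (for JsonConvert).
public static string GetMailListAsJson(string emailFilePath, string folderEntryId)
{
    // Dispose the storage when done so the file handle is released.
    using var pst = PersonalStorage.FromFile(emailFilePath);
    var targetFolder = pst.GetFolderById(folderEntryId);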

    var mailDetailsList = new List<MailListInfo>();

    foreach (var entryId in targetFolder.EnumerateMessagesEntryId())
    {
        try
        {
            var msg = pst.ExtractMessage(entryId);
            var mailDetail = new MailListInfo
            {
                MessageId = entryId.ToString(),
                Subject = msg.Subject ?? "No Subject",
                Sender = msg.SenderEmailAddress ?? msg.SenderName ?? "Unknown Sender",
                Content = (msg.Body?.Length > 100 ? msg.Body.Substring(0, 100) + "..." : msg.Body) ?? "No Content",
                HasAttachments = msg.Attachments?.Count > 0
            };

            mailDetailsList.Add(mailDetail);
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error processing message: {ex.Message}");
        }
    }

    return JsonConvert.SerializeObject(mailDetailsList);
}

private static MailDetailInfo ProcessFolderForMessage(PersonalStorage pst, FolderInfo folder, string messageId)
{
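    foreach (var entryId in folder.EnumerateMessagesEntryId())
    {
        // Skip every entry except the requested one, so only that message is extracted.
        if (entryId != messageId)
            continue;

        try
        {
            var msg = pst.ExtractMessage(entryId);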

            return new MailDetailInfo
            {
                Subject = msg.Subject ?? "No Subject",
                Sender = msg.SenderEmailAddress ?? "Unknown Sender",
                ContentHtml = msg.BodyHtml ?? msg.Body ?? "No Content",
                Recipients = GetRecipients(msg),
                Attachments = GetAttachments(msg)
            };
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error processing message: {ex.Message}");
        }
    }

    return null;
}

Alternatively, you can use EnumerateMapiMessages(), which eliminates the need for separate message extraction:

public static string GetMailListAsJson(string emailFilePath, string folderEntryId)
{
    // Dispose the storage when done so the file handle is released.
    using var pst = PersonalStorage.FromFile(emailFilePath);
    var targetFolder = pst.GetFolderById(folderEntryId);

    var mailDetailsList = new List<MailListInfo>();

    foreach (var msg in targetFolder.EnumerateMapiMessages())
    {
        try
        {
            var mailDetail = new MailListInfo
            {
                MessageId = msg.EntryIdString,
                Subject = msg.Subject ?? "No Subject",
                Sender = msg.SenderEmailAddress ?? msg.SenderName ?? "Unknown Sender",
                Content = (msg.Body?.Length > 100 ? msg.Body.Substring(0, 100) + "..." : msg.Body) ?? "No Content",
                HasAttachments = msg.Attachments?.Count > 0
            };

            mailDetailsList.Add(mailDetail);
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error processing message: {ex.Message}");
        }
    }

    return JsonConvert.SerializeObject(mailDetailsList);
}

private static MailDetailInfo ProcessFolderForMessage(PersonalStorage pst, FolderInfo folder, string messageId)
{
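    foreach (var msg in folder.EnumerateMapiMessages())
    {
        // Skip every message except the requested one
        // (EntryIdString is used here the same way as in GetMailListAsJson).
        if (msg.EntryIdString != messageId)
            continue;

        try
        {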
            return new MailDetailInfo
            {
                Subject = msg.Subject ?? "No Subject",
                Sender = msg.SenderEmailAddress ?? "Unknown Sender",
                ContentHtml = msg.BodyHtml ?? msg.Body ?? "No Content",
                Recipients = GetRecipients(msg),
                Attachments = GetAttachments(msg)
            };
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error processing message: {ex.Message}");
        }
    }

    return null;
}

Additionally, please review our article on working with large PST files.

It’s important to note that performance largely depends on the structure of your messages, such as the size of the HTML body, the presence and size of attachments, and other factors. These optimizations help minimize unnecessary processing and improve efficiency.

Hi @margarita.samodurova, I checked the logs and found that this line is the slow part:

Content = (msg.Body?.Length > 100 ? msg.Body.Substring(0, 100) + "..." : msg.Body) ?? "No Content",

Extracting the message body is the only step that takes significant time for me. Can you suggest any alternative way to fetch the following?

  • Message Body
  • Message Body HTML

@MaazHussain,

Unfortunately, there are no alternative methods for faster extraction of Body / BodyHtml.
You can try range ([..]) slicing instead of Substring:

string content = msg.BodyType == BodyContentType.PlainText ? msg.Body : msg.BodyHtml;
content = content is { Length: > 100 } ? $"{content[..100]}..." : content ?? "No Content";

This improves readability and may be slightly more performant.

Also, are you using a trial or a licensed version?
The trial version may work slightly slower because the message body is modified to insert a watermark text.

Hi @margarita.samodurova, I have a few queries:

  1. Do we support searching mails across folders?
  2. We noticed that the first time this block of code runs it takes some time, but when it runs again it returns the data instantly, irrespective of the file path. Is this expected?
    E.g., we first load
    PersonalStorage.FromFile(emailFilePath);
    which takes 100 milliseconds. We then load
    PersonalStorage.FromFile(emailFilePath); (or) PersonalStorage.FromFile(emailFilePath2);
    which takes 0 milliseconds.

Can you explain technically why this occurs?

Hello @MaazHussain,

  1. Yes, Aspose.Email allows searching emails across folders in a PST file.
    You can iterate through the folders and use MapiQueryBuilder to filter the messages in each one by your criteria.

  2. The behavior you observed is not related to Aspose.Email but is likely due to OS or development environment mechanisms.
    It could be some form of caching, such as the operating system’s file system cache, which speeds up subsequent file accesses.
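
One way to check this on your side is to time the same call several times within one process. The sketch below is a generic Stopwatch helper; LoadTiming and Measure are hypothetical names and not part of Aspose.Email. If only the first run is slow, the cost is most likely one-time work such as the caching mentioned above rather than PST parsing itself.

```csharp
using System;
using System.Diagnostics;

public static class LoadTiming
{
    // Runs 'action' the given number of times and returns the
    // elapsed time of each run in milliseconds, in call order.
    public static double[] Measure(Action action, int runs)
    {
        var timings = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            timings[i] = stopwatch.Elapsed.TotalMilliseconds;
        }
        return timings;
    }
}
```

For example, LoadTiming.Measure(() => PersonalStorage.FromFile(emailFilePath).Dispose(), 3) would time three consecutive loads of the same file.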

Hi @margarita.samodurova, you mentioned that we can search across folders. Does this mean I can pass a MailQuery, as in RootFolder.EnumerateMessages(query), and get consolidated search results from all the subfolders, or do we need to iterate through each subfolder manually to consolidate the results?

Hello @Devishree,

You need to iterate through each subfolder to get consolidated search results. RootFolder.EnumerateMessages(query) only returns messages from the specified folder, so a search across all subfolders has to visit each of them recursively.
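
A minimal sketch of that recursion is shown below. Note that it uses PersonalStorageQueryBuilder with FolderInfo.GetContents(query) rather than the EnumerateMessages(query) call from the question. The PstSearch class, the SearchAllFolders and Demo methods, the "sample.pst" path, and the "invoice" subject filter are all illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Email.Storage.Pst;
using Aspose.Email.Tools.Search;

public static class PstSearch
{
    // Hypothetical helper: collects matches from 'folder' and all of its descendants.
    public static List<MessageInfo> SearchAllFolders(FolderInfo folder, MailQuery query)
    {
        var results = new List<MessageInfo>();

        // GetContents(query) only returns matches from this one folder.
        foreach (MessageInfo info in folder.GetContents(query))
        {
            results.Add(info);
        }

        // Recurse into each subfolder and merge its matches.
        foreach (FolderInfo subFolder in folder.GetSubFolders())
        {
            results.AddRange(SearchAllFolders(subFolder, query));
        }

        return results;
    }

    // Example call; the file name and the filter are placeholders.
    public static void Demo()
    {
        using var pst = PersonalStorage.FromFile("sample.pst");

        var builder = new PersonalStorageQueryBuilder();
        builder.Subject.Contains("invoice");

        foreach (MessageInfo hit in SearchAllFolders(pst.RootFolder, builder.GetQuery()))
        {
            Console.WriteLine($"{hit.EntryIdString}: {hit.Subject}");
        }
    }
}
```

Each folder is queried once, so the total cost grows with the number of folders in the PST as well as the number of matching messages.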

Thank you.

Hi @margarita.samodurova, do we restrict parsing of password-protected PST files?

Hello @Devishree,

Password protection in PST files is essentially an Outlook-specific feature, and the data itself is not encrypted. This allows emails to be extracted without the password.
You can find more details about working with password-protected PST files on our documentation page.
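
If you still want to detect protected files or verify a password before processing, a sketch along these lines may help. It relies on the Store.IsPasswordProtected property and the Store.IsPasswordValid method; the PstPasswordCheck class, the Report method, and its parameters are hypothetical names used only for illustration.

```csharp
using System;
using Aspose.Email.Storage.Pst;

public static class PstPasswordCheck
{
    // Hypothetical helper: reports the protection state of a PST file.
    public static void Report(string pstPath, string candidatePassword)
    {
        // The file opens without the password, as described above.
        using var pst = PersonalStorage.FromFile(pstPath);

        if (!pst.Store.IsPasswordProtected)
        {
            Console.WriteLine("The storage is not password protected.");
            return;
        }

        // Checks the candidate against the password stored in the PST.
        bool isValid = pst.Store.IsPasswordValid(candidatePassword);
        Console.WriteLine($"Password protected; candidate password valid: {isValid}");
    }
}
```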

Hi @margarita.samodurova, thanks for the clarification. Is there a method to get a list of MessageInfo objects for a given list of entry IDs? Also, can I get the list of messages in the current folder based on a conversation ID?

Hello @Devishree,

Aspose.Email does not provide a direct method to retrieve a list of MessageInfo objects for a given list of entry IDs. However, you can achieve this by iterating over the folder’s message collection and filtering the messages by their IDs:

// Requires System.Collections.Generic and System.Linq.
FolderInfo folderInfo = pst.RootFolder.GetSubFolder("Inbox");
MessageInfoCollection messages = folderInfo.GetContents();

// A HashSet keeps each lookup cheap when the ID list is long.
var entryIds = new HashSet<string> { "id1", "id2" }; // Replace with actual entry IDs
List<MessageInfo> filteredMessages = messages.Where(m => entryIds.Contains(m.EntryIdString)).ToList();

For retrieving messages in the current folder based on a conversation ID, you can refer to our blog article Group Messages from PST by Conversation Threads using C# .NET. This article explains how to use MAPI properties such as PidTagConversationIndex to identify and group messages into conversations.
Additionally, you can check out the ConversationThread sample app in our GitHub repository. This project provides a practical implementation of grouping messages by conversation.
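
As a rough illustration of the grouping idea: per the MAPI specification, a PidTagConversationIndex value starts with a 22-byte header that is shared by every message in a thread, and each reply appends 5 more bytes. The sketch below groups items by that header. ConversationGrouping, GetThreadKey, and GroupByThread are hypothetical names, and reading the raw index bytes from each message (for instance from its MAPI properties) is left to the caller as an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConversationGrouping
{
    // Length of the conversation index header (1 reserved byte,
    // 5 bytes of FILETIME, 16 bytes of GUID).
    private const int HeaderLength = 22;

    // Returns a key identifying the thread, or null when the
    // index is missing or shorter than the header.
    public static string GetThreadKey(byte[] conversationIndex)
    {
        if (conversationIndex == null || conversationIndex.Length < HeaderLength)
        {
            return null;
        }
        return BitConverter.ToString(conversationIndex, 0, HeaderLength);
    }

    // Groups items whose conversation index shares the same header.
    // Items without a usable index are left out of the result.
    public static Dictionary<string, List<T>> GroupByThread<T>(
        IEnumerable<T> items, Func<T, byte[]> getConversationIndex)
    {
        return items
            .Select(item => (Key: GetThreadKey(getConversationIndex(item)), Item: item))
            .Where(pair => pair.Key != null)
            .GroupBy(pair => pair.Key)
            .ToDictionary(g => g.Key, g => g.Select(pair => pair.Item).ToList());
    }
}
```

The Count of each list in the result is the number of messages found for that thread within the items you passed in.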

Hi @margarita.samodurova, for folder parsing I need to extract only the mail-related folders. Can I rely on the ContainerClass property of the PersonalStorage object for this? If not, how can I achieve it?

Hello @Devishree,

Yes, you can rely on the ContainerClass property of the FolderInfo object. Mail-related folders typically have the ContainerClass set to "IPF.Note". You can use this value to filter out mail folders when parsing the PST.

But I can see empty values for certain folders. Is this expected?

Yes, some folders can have an empty ContainerClass value.

Usually, an empty ContainerClass implies “IPF.Note”, meaning the folder is likely a mail folder. However, for greater reliability, you can extract one item from the folder and check its message class (MapiMessage.MessageClass). If it is “IPM.Note”, then the folder contains mail items.
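
The check described above could be sketched as follows. MailFolderFilter, LooksLikeMailFolder, and IsMailFolder are hypothetical names. Two choices here are assumptions rather than library behavior: subclasses such as "IPM.Note.SMIME" are accepted as mail, and an empty folder with an empty ContainerClass is treated as a mail folder.

```csharp
using System;
using System.Linq;
using Aspose.Email.Storage.Pst;

public static class MailFolderFilter
{
    // True when 'value' equals 'baseClass' or is a dotted subclass of it.
    private static bool IsClassOrSubclass(string value, string baseClass) =>
        value.Equals(baseClass, StringComparison.OrdinalIgnoreCase) ||
        value.StartsWith(baseClass + ".", StringComparison.OrdinalIgnoreCase);

    // Decision rule kept free of PST access so it can be checked in isolation.
    public static bool LooksLikeMailFolder(string containerClass, string firstItemMessageClass)
    {
        if (!string.IsNullOrEmpty(containerClass))
        {
            return IsClassOrSubclass(containerClass, "IPF.Note");
        }

        // Empty ContainerClass: with no items, assume a mail folder;
        // otherwise decide by the first item's message class.
        return string.IsNullOrEmpty(firstItemMessageClass)
            || IsClassOrSubclass(firstItemMessageClass, "IPM.Note");
    }

    public static bool IsMailFolder(FolderInfo folder)
    {
        string containerClass = folder.ContainerClass;

        // Only load an item when the container class gives no answer.
        string firstItemClass = string.IsNullOrEmpty(containerClass)
            ? folder.EnumerateMapiMessages().FirstOrDefault()?.MessageClass
            : null;

        return LooksLikeMailFolder(containerClass, firstItemClass);
    }
}
```

Extracting a message only when ContainerClass is empty keeps the extra cost limited to the folders that actually need the fallback.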

Hi @margarita.samodurova, how do I differentiate inline images from regular attachments in MapiAttachment? I noticed that, for a specific file, the IsInlineImage property always returns false, even for inline images.

Hello @Devishree,

Could you please provide a sample MSG file where an inline image exists, but the IsInline property returns false? This would help us analyze the issue.

Outlook (2).zip (655.3 KB)
The Inbox folder has a mail with the subject “HTML body”. This mail has around 12 inline images, and all of them are treated as attachments.

Hi @margarita.samodurova, is there any property that gives the total number of messages in a conversation thread?

Thank you for providing the PST file. We are investigating this issue and will update you on our findings.