Hi,
We are exporting more than 1 million emails from a number of PST files (shared folders that are being migrated).
After the first migration, the customer found that some messages were missing because special permissions had been set on these folders.
Hence we now have to extract again.
The issue is that we only want to migrate the new mails, not the ones we already processed.
What we have done so far with Aspose is to open the PersonalStorage and then iterate the folder contents recursively, starting from the root folder.
Digging into the hierarchy, we extract the folder info, message info and MapiMessage. But we thought this was only supposed to run once, and since we found out that MessageId+ConversationIndex is not unique, we are not sure how to determine which new subfolders and mails were added since the last extract of the same PST files.
Is there an absolutely unique key that can be used for this?
Thanks
Anders
@AndersRask
Can you please clarify what unique identifiers you have used so far to track processed emails, and if you have any specific requirements for identifying new emails?
We used MessageId+ConversationIndex, and saved each .msg file under a new GUID.
Hello @AndersRask ,
Can you please provide a code sample for investigation?
Thank you.
Yes, sure. The code below is just for analyzing the available properties. The extraction code itself is built on the same idea, but it also saves the .msg file to disk and stores the metadata in a database.
The goal is to uniquely identify the existing mails in the database so they can be marked as already migrated. Analysing MessageId and ConversationIndex shows that they are not unique. One obvious reason is that a mail can be copied to different folders for archiving. But we also see cases where ClientSubmitTime or DeliveryTime differ, so it looks like senders sometimes use “templates” that reuse the same MessageId several times.
So, to boil it down: do we have other tools than ConversationIndex+MessageId for comparing PST files to find out which mails were added since the last export?
using System;
using System.IO;
using System.Text;
using Aspose.Email.Mapi;
using Aspose.Email.Storage.Pst;
using CommandLine;
using Serilog;

class Program
{
    public class Options
    {
        [Option('f', "file", Required = false, HelpText = "Path to PST file to analyze")]
        public string? File { get; set; }

        [Option('m', "mode", Required = false, HelpText = "If All - analyse all PST files")]
        public string? Mode { get; set; }
    }

    // Running count of messages seen across all analyzed PST files
    private static int totalCount = 0;

    static void Main(string[] args)
    {
        string pathToPST = string.Empty;
        bool all = false;

        Parser.Default.ParseArguments<Options>(args)
            .WithParsed<Options>(o =>
            {
                if (!string.IsNullOrEmpty(o.File))
                {
                    pathToPST = o.File;
                }
                if (string.Equals(o.Mode, "all", StringComparison.OrdinalIgnoreCase))
                {
                    all = true;
                }
            });

        string loglocation = "Z:\\Logs";
        string filename = Path.GetFileName(pathToPST);
        string logFilename = all ? "PST_All.log" : "PST_" + filename + ".log";
        string logFile = Path.Combine(loglocation, logFilename);

        Log.Logger = new LoggerConfiguration()
            .MinimumLevel.Debug()
            .WriteTo.Console()
            .WriteTo.File(logFile)
            .CreateLogger();

        Aspose.Email.License license = new Aspose.Email.License();
        license.SetLicense("Aspose.Total.NET.lic");

        try
        {
            // Either every PST file on the share, or only the one given on the command line
            string[] files = all ? Directory.GetFiles(@"Z:\", "*.pst") : new[] { pathToPST };
            foreach (string file in files)
            {
                // Load the Outlook PST file and release the file handle when done
                using (PersonalStorage personalStorage = PersonalStorage.FromFile(file))
                {
                    // Walk the folder hierarchy starting at the root folder
                    AnalyzeFolder(personalStorage, personalStorage.RootFolder);
                }
            }
            Log.Information("Total messages in PST: {TotalCount}", totalCount);
        }
        catch (Exception ex)
        {
            // Write failures to the same sinks as the rest of the output
            Log.Error(ex, "PST analysis failed");
        }
        finally
        {
            Log.CloseAndFlush();
        }
    }

    /// <summary>
    /// Recursively displays the contents of a folder and its subfolders.
    /// </summary>
    /// <param name="personalStorage">The opened PST file.</param>
    /// <param name="folderInfo">The folder to analyze.</param>
    private static void AnalyzeFolder(PersonalStorage personalStorage, Aspose.Email.Storage.Pst.FolderInfo folderInfo)
    {
        Aspose.Email.Storage.Pst.MessageInfoCollection messageInfoCollection = folderInfo.GetContents();
        foreach (Aspose.Email.Storage.Pst.MessageInfo messageInfo in messageInfoCollection)
        {
            totalCount++;
            GetMailDetails(personalStorage, messageInfo, folderInfo);
        }

        // Call this method recursively for each subfolder
        if (folderInfo.HasSubFolders)
        {
            foreach (Aspose.Email.Storage.Pst.FolderInfo subfolderInfo in folderInfo.GetSubFolders())
            {
                AnalyzeFolder(personalStorage, subfolderInfo);
            }
        }
    }

    private static void GetMailDetails(PersonalStorage personalStorage, Aspose.Email.Storage.Pst.MessageInfo messageInfo, Aspose.Email.Storage.Pst.FolderInfo folderInfo)
    {
        // Extract the MapiMessage from the PST message
        MapiMessage mapiMessage = personalStorage.ExtractMessage(messageInfo);

        // Get the conversation index (binary property, shown as hex)
        MapiProperty conversationIdProperty = mapiMessage.Properties[KnownPropertyList.ConversationIndex];
        string conversationId = string.Empty;
        if (conversationIdProperty != null)
        {
            conversationId = GetConversationIndexAsString(conversationIdProperty.Data);
        }

        // SenderEntryId is a binary property and is not always present, so guard against null
        MapiProperty senderEntryIdProperty = mapiMessage.Properties[KnownPropertyList.SenderEntryId];
        string senderEntryId = string.Empty;
        if (senderEntryIdProperty != null)
        {
            senderEntryId = GetConversationIndexAsString(senderEntryIdProperty.Data);
        }

        // Extract the mail details from the MapiMessage
        string subject = mapiMessage.Subject;
        string sender = mapiMessage.SenderEmailAddress;
        string itemId = mapiMessage.ItemId;
        string messageId = mapiMessage.InternetMessageId;
        string transportMessageHeaders = mapiMessage.TransportMessageHeaders;
        DateTime clientSubmitTime = mapiMessage.ClientSubmitTime;
        DateTime deliveryTime = mapiMessage.DeliveryTime;

        // Print the mail details
        Console.WriteLine($"Subject: {subject}");
        Console.WriteLine($"Sender: {sender}");
        Console.WriteLine($"Delivery Time: {deliveryTime}");
        Console.WriteLine($"Submit Time: {clientSubmitTime}");
        Console.WriteLine($"Item ID: {itemId}");
        Console.WriteLine($"Message ID: {messageId}");
        Console.WriteLine($"Conversation ID: {conversationId}");
        Console.WriteLine($"Transport Message Headers: {transportMessageHeaders}");
        Console.WriteLine($"Sender Entry ID: {senderEntryId}");
    }

    // Formats a binary property value as an uppercase hex string
    private static string GetConversationIndexAsString(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder();
        foreach (byte b in bytes)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
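The non-uniqueness of MessageId+ConversationIndex described above can be checked directly on the stored metadata. The following is a minimal sketch in plain C#, with no Aspose calls. The MailRow class and its fields are hypothetical stand-ins for whatever the database actually stores, and KeyCollisions is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one metadata row as stored in the database (an assumption).
public sealed class MailRow
{
    public MailRow(string messageId, string conversationIndex, DateTime deliveryTime)
    {
        MessageId = messageId;
        ConversationIndex = conversationIndex;
        DeliveryTime = deliveryTime;
    }

    public string MessageId { get; }
    public string ConversationIndex { get; }
    public DateTime DeliveryTime { get; }
}

public static class KeyCollisions
{
    // Returns every MessageId+ConversationIndex key that is shared by more than one row.
    public static Dictionary<string, List<MailRow>> Find(IEnumerable<MailRow> rows)
    {
        return rows
            .GroupBy(r => r.MessageId + "|" + r.ConversationIndex, StringComparer.Ordinal)
            .Where(g => g.Count() > 1)
            .ToDictionary(g => g.Key, g => g.ToList());
    }
}
```

Rows that share a key but differ in DeliveryTime would correspond to the "template" case mentioned above, while rows that agree on everything are more likely copies of one mail in several folders.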
Hello @AndersRask,
You can use the MessageInfo.EntryIdString property (the string form of MessageInfo.EntryId) as a unique identifier for emails within the same PST file. The entry ID is unique among the items in a PST, because it is the internal identifier assigned by the PST structure.
foreach (Aspose.Email.Storage.Pst.MessageInfo messageInfo in messageInfoCollection)
{
    string entryId = messageInfo.EntryIdString; // Unique identifier within the PST file
    Console.WriteLine($"Entry ID: {entryId}");
    // Process the email
    MapiMessage mapiMessage = personalStorage.ExtractMessage(messageInfo);
}
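To turn the entry ID into a "what is new since the last run" check, one option is to persist the IDs seen in the first extract and filter against them in the second. The sketch below is plain C# and only shows the comparison step. IncrementalExport, MakeKey and SelectNew are hypothetical names, and the PST file name is prefixed on the assumption that an entry ID is only unique within one PST. It also assumes the second extract reads the same PST files as the first; if the PSTs were regenerated by a fresh export, the entry IDs would most likely differ and this comparison would not apply.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IncrementalExport
{
    // Combines the PST file name with the entry ID, since an entry ID is
    // only assumed to be unique inside a single PST file.
    public static string MakeKey(string pstFileName, string entryId)
    {
        return pstFileName + "|" + entryId;
    }

    // Returns the keys from the current run that were not processed before,
    // preserving order and dropping repeats.
    public static List<string> SelectNew(IEnumerable<string> currentKeys, ISet<string> processedKeys)
    {
        return currentKeys
            .Where(k => !processedKeys.Contains(k))
            .Distinct()
            .ToList();
    }
}
```

In the folder walk, the current keys would come from the string entry ID of each MessageInfo, and the processed keys from the database filled by the first migration. The same approach could be applied to folder entry IDs to spot new subfolders.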
Thanks, I’ll try that. I looked at the SenderEntryId property (KnownPropertyList.SenderEntryId), but that wasn’t always present.
@AndersRask ,
Please spare a minute to share your feedback.
Thank you.