MapiMessage.BodyHtml and MAPI property PR_HTML differences

occorled · June 7, 2012, 4:37pm

I noticed there is no MapiPropertyTag.PR_HTML, so we cannot access the binary MAPI property containing the HTML message data. So I had to use the MapiMessage.BodyHtml member property.

But I found some messages where the BodyHtml string has characters that are not properly escaped. Please see all the attachments. Notice the PR_HTML property has the correct data, but BodyHtml does not. I don’t know why these characters have been rendered to their non-escaped form when copied to BodyHtml.

This is a problem because when I save out an HTML version of the message, those characters are not displayed properly because they are not escaped.

A list of characters that need to be escaped for HTML can be found here:
http://tntluoma.com/sidebars/codes/

SHMalik · June 8, 2012, 11:48am

Hi,

I would like to inform you that the PR_HTML property tag is not available.

For further investigation, we need sample code and an MSG file that reproduces this issue. As you know, we have to provide evidence when logging a ticket in our bug tracking system.

Best Regards

occorled · June 11, 2012, 8:44am

Here you go, SH. /attached

The original message had some sensitive attachments, so I forwarded the message to myself and removed those attachments; the BodyHTML is the same though.

Like I showed in the screenshots, just load this message up in a MapiMessage and inspect the BodyHTML. You will see that it has unescaped HTML characters. Also, compare to PR_HTML, which has escaped characters.

SHMalik · June 12, 2012, 8:47am

Hi,

Please accept my sincere apology for the late reply.

I have tried many ways to reproduce the issue but failed. I tried the following:

Loaded this message in MapiMessage and inspected its HTML body in the text visualizer; it shows readable text.
Copied this HTML body into a .mht file and opened it in different browsers; again it shows readable characters.
Used OutlookSpy to inspect the properties; it shows ’, which is also equivalent to an apostrophe, so I could not reproduce the issue.
Converted the message to HTML and checked the target line; it is completely readable.
Opened the message directly in Outlook Express and found that all characters are readable.

Could you please tell me how to reproduce a scenario where garbage characters appear in the browser, so that I can raise a ticket?

I regret any inconvenience caused to you.

Best Regards

occorled · June 13, 2012, 8:58am

I think the bottom line is still this:
If we can access MapiPropertyTag.PR_HTML, I believe there is no problem at all, because the characters in that property are already correct and contain the escaped versions of characters (’ and — and all the others found here: http://tntluoma.com/sidebars/codes/). I don’t know why we can’t access that property directly like we can access most of the other properties.

Please look carefully at the MapiMessage.BodyHTML characters. If you copy and paste the value out to a text file, you will find this line of text inside: “Team – This one has a minor change, it wasn’t updated in the future”. That is bad, because if you write that out with standard encoding it will come out as “Team â€“ This one has a minor change, it wasnâ€™t updated in the future”. Now…

Please look carefully at the PR_HTML property in OutlookSpy. If you copy and paste the value out, you will find this line of text inside: “Team — This one has a minor change, it wasn’t updated in the future”. This is good, because you can write it out in standard encoding and it will look correct.

So, PR_HTML characters are properly escaped. MapiMessage.BodyHtml characters are not escaped properly. This data has been changed when someone copied the data from PR_HTML to BodyHtml; it has lost the escaping.

Here is some simple code to generate the problem:

// Needs System.IO and System.Text, plus the Aspose namespace that defines MapiMessage.

// Load the message that I provided.
MapiMessage msg = MapiMessage.FromFile(filepath);

// Write msg.BodyHtml out to a text file, and then view that .html file. You will see bad characters.
// (A StreamWriter created without an explicit encoding writes UTF-8 with no byte order mark.)
using (StreamWriter writer = new StreamWriter("test.html", false))
{
    writer.Write(msg.BodyHtml);
}

// Write msg.BodyHtml out to a file with Unicode (UTF-16) encoding, and you will see it looks OK,
// because the Unicode encoding can handle the special characters when they are not escaped.
using (FileStream writer = new FileStream("test_unicode.html", FileMode.Create, FileAccess.Write))
{
    byte[] bytes = Encoding.Unicode.GetBytes(msg.BodyHtml);
    writer.Write(bytes, 0, bytes.Length);
}

Attached you will find the two files that this code generates. We shouldn’t have to write out the entire BodyHtml in Unicode format (which takes more disk space). We should be able to write out the PR_HTML data in standard encoding with the escaped characters…
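As an illustration of what "escaped" output would mean here, the following is a minimal sketch that replaces every non-ASCII character in an HTML string with a decimal numeric character reference, so the result can be written in any ASCII-compatible encoding. It is an assumption about one possible workaround, not Aspose functionality: the names HtmlEntitySketch and EscapeNonAscii are hypothetical, and the sketch does not retrieve the original PR_HTML bytes.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of any Aspose API.
public static class HtmlEntitySketch
{
    // Replaces each non-ASCII character with a decimal numeric character
    // reference such as &#8217;. ASCII characters, including existing
    // markup like < and &, are copied through unchanged.
    public static string EscapeNonAscii(string html)
    {
        var sb = new StringBuilder(html.Length);
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (c < 128)
            {
                sb.Append(c);
                continue;
            }

            int codePoint = c;
            if (char.IsSurrogatePair(html, i))
            {
                // Combine the two UTF-16 code units into one code point
                // and skip the low surrogate on the next iteration.
                codePoint = char.ConvertToUtf32(html, i);
                i++;
            }
            // Note: a lone (unpaired) surrogate is emitted as-is by value,
            // which is not a valid reference; real input should not contain one.
            sb.Append("&#").Append(codePoint).Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // \u2013 is the en dash and \u2019 is the right single quotation mark.
        string sample = "Team \u2013 This one has a minor change, it wasn\u2019t updated";
        Console.WriteLine(EscapeNonAscii(sample));
    }
}
```

Under these assumptions, passing msg.BodyHtml through EscapeNonAscii before writing it with a StreamWriter would produce a file that contains only ASCII characters.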

occorled · June 13, 2012, 9:04am

Also, we are aware of this functionality, but we are not interested in it:
MailMessage mail = MapiMessageToMailMessage(msg);
if (mail != null)
{
    mail.Save(filename, MessageFormat.Mht);
}

We cannot use this because we want to save out the PR_HTML data plus some additional HTML that we will generate.

occorled · June 13, 2012, 10:01am

The title of this thread explains the problem the best: “MapiMessage.BodyHtml and MAPI property PR_HTML differences”.

There shouldn’t be any differences. If there are differences, then we should be able to get the original PR_HTML data.

SHMalik · June 14, 2012, 7:49am

Hi,

Thank you very much for sharing your knowledge and for the detailed explanation.

I have logged ticket NETWORKNET-33315 in our bug tracking system and asked the developers to analyze the possibility of providing access to the PR_HTML property. We will notify you as soon as we receive feedback.

Best Regards

occorled · June 15, 2012, 2:39pm

SH,

I figured out I can use the standard MAPI identifier code with the data type to generate the tag long value.

PR_HTML = 0x1013

PT_BINARY = 0x0102

Aspose.PR_HTML = 0x0000000010130102

Aspose.PR_FLAG_STATUS = 0x0000000010900003

Aspose.PR_LAST_VERB_EXECUTION_TIME = 0x0000000010820040

Sorry, I didn’t think of this earlier. Users can just make their own long constant until the Aspose.Email API has an equivalent. Even in OutlookSpy, this value is shown in the “Tag num” field in the top right…

PidTagFlagStatus Canonical Property | Microsoft Learn
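The arithmetic behind the constants above can be sketched as follows: the 16-bit property identifier goes in the upper half of a 32-bit tag and the 16-bit property type in the lower half. The names MapiTagSketch and MakeTag are hypothetical, and the PT_LONG (0x0003) and PT_SYSTIME (0x0040) type codes are inferred from the low 16 bits of the constants listed above.

```csharp
using System;

// Hypothetical helper, not part of any Aspose API.
public static class MapiTagSketch
{
    // Combines a 16-bit property identifier and a 16-bit property type
    // into one tag value: identifier in bits 16..31, type in bits 0..15.
    public static long MakeTag(int propertyId, int propertyType)
    {
        return ((long)(propertyId & 0xFFFF) << 16) | (long)(propertyType & 0xFFFF);
    }

    public static void Main()
    {
        const int PT_LONG = 0x0003;
        const int PT_SYSTIME = 0x0040;
        const int PT_BINARY = 0x0102;

        Console.WriteLine("PR_HTML                     = 0x{0:X16}", MakeTag(0x1013, PT_BINARY));
        Console.WriteLine("PR_FLAG_STATUS              = 0x{0:X16}", MakeTag(0x1090, PT_LONG));
        Console.WriteLine("PR_LAST_VERB_EXECUTION_TIME = 0x{0:X16}", MakeTag(0x1082, PT_SYSTIME));
    }
}
```

The three values printed match the constants listed above, so the same helper could build a tag for any other property whose identifier and type are known.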

occorled · June 15, 2012, 4:00pm

Well… that works for the properties that are in MapiMessage with no equivalent MapiPropertyTag.

But PR_HTML (0x0000000010130102) is simply not stored inside the MapiMessage property collection when we call PersonalStorage.ExtractMessage() to get the MapiMessage. So I can’t get it.

SHMalik · June 16, 2012, 2:43pm

Hi,

Thank you for your valuable comments.

I have also confirmed that we can access the PR_LAST_VERB_EXECUTION_TIME, PR_FLAG_STATUS and PR_FLAG_ICON values. And, as you noted, I could not find the PR_BODY_HTML property in the MapiMessage properties collection.

I have passed all the comments about the above properties and PR_BODY_HTML on to the developer. We will let you know as soon as he responds.

Best Regards

aspose.notifier · July 30, 2012, 5:07pm

The issues you have found earlier (filed as NETWORKNET-33315) have been fixed in this update.
