Hello Aspose Support Team,
I am using Aspose.Email to read .msg files. I noticed that in some emails, the disclaimer text (inserted by our mail server) appears twice in the HtmlBody.
When I checked the raw MAPI property PR_HTML (via MFCMAPI), I found that the duplicate text is actually present inside MSO conditional comments (<!--[if mso]> ... <![endif]-->) as well as in the normal HTML body.
For example:
<!--[if mso]>
<p style="color:red;">Security Disclaimer</p>
<![endif]-->
<p style="color:red;">Security Disclaimer</p>
So, when I use Aspose to read the email, I also see the duplicated disclaimer.
Here is the code I used:
string body = mailMessage
    .GetHtmlBodyText(false)
    .Replace("<", "&lt;")
    .Replace(">", "&gt;")
    .Replace("\n", "<br />")
    .Replace("\r", "");
My questions are:
- Does Aspose.Email provide any built-in functionality to normalize or clean up the HtmlBody, for example by removing MSO conditional comments?
- If not, what is the recommended approach to handle this scenario? Should I post-process the HTML myself (e.g., with regex or an HTML parser), or does Aspose have a utility for this?
- Is this considered expected behavior (Aspose returning raw PR_HTML), or should I report it as a bug/feature request?
Thank you for your guidance.
Best regards,
Nghia
Hi Support Team,
Thank you very much for your feedback.
I understand your suggested solution, but I still have a problem:
I am using the following code to get the body:
body = mailMessage.GetHtmlBodyText(false)
    .Replace("<", "&lt;")
    .Replace(">", "&gt;")
    .Replace("\n", "<br />")
    .Replace("\r", "");
When I call mailMessage.GetHtmlBodyText(false), the returned body already contains the duplicated disclaimer text.
My question is:
Is there a way to remove the <!--[if mso]> ... <![endif]--> blocks before mailMessage.GetHtmlBodyText(false) parses the content?
The reason I am asking is that I really want to continue using mailMessage.GetHtmlBodyText(false) because it already fits our requirements. If I change the logic to extract the body differently, there is a risk that the output will not match the current behavior of GetHtmlBodyText(false).
Do you have any solution or recommended approach for this?
I hope this makes sense, and I look forward to your guidance.
Best regards,
Nghia
Hello @nguyen.xuan.nghia,
Thank you for clarifying your scenario.
Currently, Aspose.Email doesn’t provide a built-in way to remove <!--[if mso]> ... <![endif]--> conditional comments before mailMessage.GetHtmlBodyText(false) processes the content. This method works directly on the raw PR_HTML property, so MSO conditional comments are included in the result.
The recommended approach is to clean the HTML body before calling GetHtmlBodyText(false). For example:
// Requires: using System.Text.RegularExpressions;
// Remove MSO conditional comments from the HTML body
mailMessage.HtmlBody = Regex.Replace(
    mailMessage.HtmlBody,
    @"<!--\[if mso\].*?<!\[endif\]-->",
    string.Empty,
    RegexOptions.Singleline);
// Now call GetHtmlBodyText(false) as usual
string body = mailMessage.GetHtmlBodyText(false)
    .Replace("<", "&lt;")
    .Replace(">", "&gt;")
    .Replace("\n", "<br />")
    .Replace("\r", "");
This way, you preserve the behavior of GetHtmlBodyText(false) while removing duplicate disclaimers caused by MSO conditional blocks.
Please note that this behavior is expected. If you consider this a common scenario, you may also submit a feature request so our team can evaluate adding an option to filter out MSO conditional comments directly within the library.
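The pattern above matches only the exact form <!--[if mso]>. Mail servers and Outlook sometimes emit variants such as <!--[if gte mso 9]>, so a slightly broader pattern may be useful. The following is a minimal sketch, not part of Aspose.Email: the class name MsoCommentCleaner and the method Strip are hypothetical helpers, and the pattern assumes conditions written in the usual <!--[if ... mso ...]> ... <![endif]--> form. It deliberately leaves <!--[if !mso]> blocks alone, since those target non-Outlook clients.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not an Aspose.Email API): strips Outlook-only
// conditional comment blocks from an HTML string.
public static class MsoCommentCleaner
{
    // Matches <!--[if mso]>, <!--[if gte mso 9]>, <!--[if (mso 12)|(mso 14)]>, etc.
    // (?!\s*!) skips conditions that start with a negation, such as [if !mso].
    // Singleline lets ".*?" span line breaks; the lazy quantifier stops at the
    // first <![endif]--> so separate blocks are removed independently.
    private static readonly Regex MsoBlock = new Regex(
        @"<!--\[if\s+(?!\s*!)[^\]]*\bmso\b[^\]]*\]>.*?<!\[endif\]-->",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    public static string Strip(string html)
    {
        // Return null or empty input unchanged instead of throwing.
        return string.IsNullOrEmpty(html) ? html : MsoBlock.Replace(html, string.Empty);
    }
}
```

Under these assumptions, the assignment in the example above could become mailMessage.HtmlBody = MsoCommentCleaner.Strip(mailMessage.HtmlBody); before calling GetHtmlBodyText(false). A regex remains a heuristic here; conditions that mix negations (for example, an expression combining IE and !mso) are not handled and would need an HTML parser or a more careful pattern.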
Hi Support Team,
Thank you very much for your response. It really helped me solve my problem.
Best regards,
Nghia