How does MapiMessage pick the code page to use to decode string properties?
Specifically for .NET, will it use an Encoding instance that throws an exception when it cannot decode a character, or one that just replaces characters that cannot be decoded?
In .NET, the MapiMessage class decodes string properties using the code page specified in the message's MAPI properties. If no code page is explicitly defined, it defaults to the system's ANSI code page.
When decoding, the Encoding instance typically replaces characters that cannot be decoded with a replacement character, such as a question mark. This behavior prevents exceptions during the decoding process, facilitating smoother handling of string properties that may contain invalid characters.
If you need more control over decoding, for example to fail on invalid bytes instead of replacing them, consider decoding the data yourself with an Encoding instance configured for your requirements.
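As a rough illustration of that kind of control, the sketch below builds a strict .NET decoder that throws instead of substituting characters. It uses only System.Text. The StrictDecoder class and its TryDecode method are hypothetical names, and it assumes you already have the raw bytes of a string property and know its code page. It is not a description of how MapiMessage decodes internally.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of Aspose.Email.
public static class StrictDecoder
{
    // Returns true and the decoded text if every byte is valid for the code page.
    // Returns false if the decoder meets a byte sequence it cannot map.
    public static bool TryDecode(int codePage, byte[] data, out string text)
    {
        // ExceptionFallback makes the encoding throw rather than substitute characters.
        Encoding strict = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            text = strict.GetString(data);
            return true;
        }
        catch (DecoderFallbackException)
        {
            text = string.Empty;
            return false;
        }
    }
}
```

For example, StrictDecoder.TryDecode(65001, bytes, out var text) reports failure for UTF-8 input that contains an invalid byte such as 0xFF. On .NET Core and later, legacy code pages such as 1252 are only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).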
The MapiMessage class determines the code page for decoding string properties primarily from the message itself.
The primary source for the code page is the PR_INTERNET_CPID MAPI property, which stores the Internet code page identifier used by the client that created the message. This property maps directly to a specific encoding.
If this property is missing or invalid, Aspose.Email falls back to other common encodings, typically the ANSI code page or UTF-8, which is common in messages.
The library's logic is designed to infer the most likely correct encoding from the metadata available in the MAPI message structure.
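A fallback chain of this general shape could be sketched in plain .NET as follows. This is only an approximation under stated assumptions: CodePageResolver and its Resolve method are hypothetical, the order (Internet code page, then an ANSI code page, then UTF-8) is taken from the description above rather than from the library's source, and the two code page values are assumed to have been read from the message already.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not Aspose.Email's actual logic.
public static class CodePageResolver
{
    // internetCpid: value of PR_INTERNET_CPID, or null if the property is missing.
    // ansiCodePage: an ANSI code page to try next, or null if unknown.
    public static Encoding Resolve(int? internetCpid, int? ansiCodePage)
    {
        foreach (int? candidate in new[] { internetCpid, ansiCodePage })
        {
            if (candidate is int cp && cp > 0)
            {
                try
                {
                    return Encoding.GetEncoding(cp);
                }
                catch (ArgumentException)
                {
                    // Not a valid code page identifier; try the next candidate.
                }
                catch (NotSupportedException)
                {
                    // Code page not available on this platform; try the next candidate.
                }
            }
        }
        // Last resort when no candidate is usable.
        return Encoding.UTF8;
    }
}
```

With these assumptions, Resolve(null, 28591) returns the ISO-8859-1 encoding, and Resolve(null, null) returns UTF-8.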
Aspose.Email uses the default .NET decoding behavior, which substitutes the Unicode replacement character ('�', U+FFFD) for input that cannot be decoded.
This means:
Any byte sequence that cannot be mapped to a character in the target code page will be silently replaced with the replacement character.
The decoding process will not halt, and no Exception will be thrown.
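To see what silent replacement looks like in plain .NET, here is a small sketch using only System.Text. It shows the framework's default decoders, not Aspose.Email output, and the ReplacementDemo class and its members are hypothetical names. Note that the substituted character depends on the encoding: the default UTF-8 decoder inserts U+FFFD, while the default ASCII decoder inserts a question mark.

```csharp
using System;
using System.Text;

// Hypothetical demo class; it exercises .NET's built-in decoders only.
public static class ReplacementDemo
{
    // 'A', then a byte that is invalid in both UTF-8 and 7-bit ASCII, then 'B'.
    public static readonly byte[] Sample = { 0x41, 0xFF, 0x42 };

    // Default UTF-8 decoding replaces the invalid byte with U+FFFD and does not throw.
    public static string DecodeUtf8() => Encoding.UTF8.GetString(Sample);

    // Default ASCII decoding replaces the invalid byte with '?' and does not throw.
    public static string DecodeAscii() => Encoding.ASCII.GetString(Sample);

    // One way to detect that a replacement happened after the fact.
    public static bool HasReplacementChar(string s) => s.IndexOf('\uFFFD') >= 0;
}
```

Checking a decoded string with HasReplacementChar is a cheap way to flag messages that lost data during decoding. It cannot catch the question-mark case, since a substituted '?' is indistinguishable from a genuine one.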