Hello Aspose-Team,
in MS Word’s character formatting settings it’s possible to choose “smallcaps” or “all caps” (don’t know if that’s the exact English description in the dialogs).
Before 2017, German orthography required “ß” in an all-caps word to be written as “SS”, e.g. “Straße” would become “STRASSE”.
Unicode 5.1.0 already introduced a capital ß (U+1E9E “ẞ” LATIN CAPITAL LETTER SHARP S) in 2008.
Reference: ß - Wikipedia
MS Word respects this “capital letter sharp S” when text is formatted as small caps or all caps. Aspose.Words, on the other hand, replaces “ß” with “SS”, which was perfectly fine before 2008, or even before 2017 if you go by its adoption into standard German orthography.
As I couldn’t find any way to change the output conversion in Aspose.Words, here’s a suggestion:
- Adopt the capital sharp S (“ẞ”) in Aspose.Words for small caps / all caps.
- Provide a SaveOptions setting that lets the user choose between the old and the new form (default like current Word versions?).
- To maintain maximum compatibility with older fonts that might be in use: check whether the font in use actually includes a glyph for U+1E9E “ẞ” LATIN CAPITAL LETTER SHARP S, and fall back to the previous “SS” version if it doesn’t.
- I don’t know if it’s relevant, but the conversion might need to depend on the language settings of the text (I don’t know about Austria, for example …).
- I specifically require this for saving as PDF, but I guess other formats will be affected, too.
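To illustrate the suggestion above, here is a minimal sketch in plain Java. It is not Aspose.Words code and says nothing about how Aspose.Words works internally; the class SharpSCaps, the method toAllCaps, the preferCapitalSharpS flag and the fontHasGlyph check are all hypothetical names made up for this example. It only shows the intended decision: use “ẞ” when the option is on and the font has the glyph, otherwise fall back to “SS”.

```java
import java.util.Locale;
import java.util.function.IntPredicate;

// Hypothetical helper, for illustration only (not part of any library).
final class SharpSCaps {
    static final char SMALL_SHARP_S = '\u00DF';   // ß
    static final char CAPITAL_SHARP_S = '\u1E9E'; // ẞ

    /**
     * Upper-cases text for all-caps / small-caps rendering.
     *
     * @param preferCapitalSharpS assumed user option: true = new form (ẞ), false = old form (SS)
     * @param fontHasGlyph        assumed callback that reports whether the font covers a code point
     */
    static String toAllCaps(String text, boolean preferCapitalSharpS, IntPredicate fontHasGlyph) {
        boolean useCapitalSharpS = preferCapitalSharpS && fontHasGlyph.test(CAPITAL_SHARP_S);
        if (useCapitalSharpS) {
            // Swap ß for ẞ first; toUpperCase leaves ẞ untouched.
            text = text.replace(SMALL_SHARP_S, CAPITAL_SHARP_S);
        }
        // Any remaining ß is expanded to "SS" by Java's default case mapping.
        return text.toUpperCase(Locale.ROOT);
    }
}
```

With a real font, the glyph check could for example be backed by java.awt.Font.canDisplay(int), e.g. `cp -> font.canDisplay(cp)`. That is just one possible way to do it.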
Why is this important:
- it reflects MS Word’s behaviour,
- it has been part of standard German orthography since 2017,
- the conversion matters especially when text that might contain “ß” is replaced dynamically by a program. (That’s how we stumbled upon it in our rendering software.)
Sample-Code:
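import com.aspose.words.Document;
import com.aspose.words.SaveFormat;

void testCapitalsWithSZ() throws Exception {
    // Load the sample document and save it as PDF without any special options.
    Document doc = new Document("C:\\Temp\\in.docx");
    doc.save("C:\\Temp\\out-actual.pdf", SaveFormat.PDF);
}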
Sample-Documents:
in.docx (11.7 KB)
out-actual.pdf (28.1 KB)
out-expected.pdf (53.7 KB)
Thank you for considering!
Kind regards
Dirk Steinkamp