PDF: Conversion of sharp s ("ß") to double SS is different from MS Word conversion to capital letter sharp s

Hello Aspose-Team,

In MS Word’s character formatting settings it’s possible to choose “small caps” or “all caps” (I don’t know if those are the exact English labels in the dialogs).

Before 2017, “ß” in an all-caps word was written as “SS” in Germany, e.g. “Straße” would become “STRASSE”.
Unicode 5.1.0 had already introduced a capital ß (U+1E9E “ẞ” LATIN CAPITAL LETTER SHARP S) in 2008.
Reference: ß - Wikipedia

MS Word respects this “capital letter sharp S” when text is formatted as small caps or all caps. Aspose.Words, on the other hand, replaces “ß” with “SS”, which was perfectly fine before 2008, or even before 2017 (if you consider the adoption into standard German orthography).
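To illustrate the difference, here is a minimal sketch in plain Java (not Aspose.Words code; the class and method names are made up for this example). Java's built-in `String.toUpperCase` expands “ß” to “SS”, while mapping “ß” to U+1E9E first gives the form MS Word shows. How either product implements this internally is not something I know.

```java
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class SharpSUppercase {
    static final char SMALL_SHARP_S = '\u00DF';   // ß
    static final char CAPITAL_SHARP_S = '\u1E9E'; // ẞ

    /** Older convention: Java's built-in case mapping expands ß to SS. */
    static String upperLegacy(String text) {
        return text.toUpperCase(Locale.GERMAN);
    }

    /** Newer convention: map ß to ẞ first, then uppercase everything else. */
    static String upperWithCapitalSharpS(String text) {
        return text.replace(SMALL_SHARP_S, CAPITAL_SHARP_S).toUpperCase(Locale.GERMAN);
    }

    public static void main(String[] args) {
        String word = "Stra" + SMALL_SHARP_S + "e"; // "Straße"
        System.out.println(upperLegacy(word));            // STRASSE
        System.out.println(upperWithCapitalSharpS(word)); // STRA + U+1E9E + E
    }
}
```

The second variant only renders correctly if the font in use contains a glyph for U+1E9E.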

As I couldn’t find any way to change the output conversion in Aspose.Words, here are some suggestions :slight_smile::

  • Adopt the capital sharp S (“ẞ”) in Aspose.Words for small caps / all caps,
  • Provide a SaveOptions setting that lets the user choose between the old and the new form (defaulting to what current Word versions do?),
  • To maintain maximum compatibility with older fonts that might be in use: check whether the font in use actually includes the character U+1E9E “ẞ” LATIN CAPITAL LETTER SHARP S, and fall back to the previous “SS” version if it doesn’t,
  • I don’t know if it’s relevant, but the conversion might need to depend on the language settings of the text (I don’t know about Austria, for example …),
  • I specifically require this for saving as PDF, but I guess other formats will be affected, too.
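The font fallback suggested above could look roughly like the following sketch. It is plain Java with hypothetical names (`CapsFallback`, `toCaps`), and the glyph check is passed in as a predicate because I don't know how Aspose.Words looks up glyphs internally.

```java
import java.util.Locale;
import java.util.function.IntPredicate;

// Hypothetical helper, for illustration only; not part of Aspose.Words.
final class CapsFallback {
    static final int CAPITAL_SHARP_S = 0x1E9E; // code point of ẞ

    /**
     * Uppercases text for all-caps / small-caps rendering.
     * Uses ẞ when the font reports a glyph for U+1E9E, otherwise
     * keeps the previous behaviour (ß expands to SS).
     */
    static String toCaps(String text, IntPredicate fontHasGlyph) {
        String prepared = fontHasGlyph.test(CAPITAL_SHARP_S)
                ? text.replace('\u00DF', '\u1E9E') // ß -> ẞ
                : text;                            // leave ß for the SS expansion
        return prepared.toUpperCase(Locale.GERMAN);
    }

    public static void main(String[] args) {
        String word = "Stra\u00DFe"; // "Straße"
        // Pretend the font has the glyph, then pretend it does not.
        System.out.println(toCaps(word, cp -> true));
        System.out.println(toCaps(word, cp -> false)); // STRASSE
    }
}
```

With a `java.awt.Font` instance at hand, its `canDisplay(int)` method could serve as the predicate. A user-facing switch (old form vs. new form) would simply be another condition next to the glyph check.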

Why is this important:

  • it reflects MS Word’s behaviour,
  • it’s the standard German orthography since 2017,
  • the conversion matters especially when text that might contain “ß” is replaced dynamically in code. (That’s how we stumbled upon it in our rendering software.)

Sample-Code:

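	import com.aspose.words.Document;
	import com.aspose.words.SaveFormat;

	// Loads the attached in.docx (text with "ß" formatted as small caps / all caps) and saves it as PDF.
	void testCapitalsWithSZ() throws Exception {
		Document doc = new Document("C:\\Temp\\in.docx");
		doc.save("C:\\Temp\\out-actual.pdf", SaveFormat.PDF);
	}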

Sample-Documents:
in.docx (11.7 KB)
out-actual.pdf (28.1 KB)
out-expected.pdf (53.7 KB)

Thank you for considering!

Kind regards
Dirk Steinkamp

@DirkSteinkamp Thank you for reporting the problem to us and for your detailed description; this is much appreciated. For the sake of correction, the issue has been logged as WORDSJAVA-2744. We will keep you informed and let you know once it is resolved.
By the way, the problem is specific to the Java version of Aspose.Words and does not occur in the .NET version. Here is the output produced by Aspose.Words for .NET: out_net.pdf (27.7 KB)

@alexey.noskov: thanks for the prompt reply, and the positive outlook! :slight_smile:

By the way: I really appreciate all the constructive and prompt answers to my support requests so far! Great service! :+1:

Thanks for fixing this in 23.1

@DirkSteinkamp For some reason the automatic notification about the fix was not posted in the thread. Please accept our apologies for that.
