Hi,
We are translating Word documents into foreign languages. We use Aspose to read the document, put all sentences into an XML file, translate that, and then merge the translation back by iterating the original document and replacing the original sentences.
This works well for Western documents that use ANSI fonts and Western formatting rules, but it produces incorrect results when translating into, e.g., Japanese.
Word can switch on specific formatting options for non-Western documents through the RTF control word \stshfdbchN. In essence, it declares the document to be in a certain language and makes Word use the formatting rules of that language. Word derives the formatting options from the properties of the font \fN that the control word specifies.
The attached files “test-without-japanese-default.rtf” and “test-with-japanese-default.rtf” are identical with the exception of \stshfdbch.
In test-without-japanese-default.rtf, \stshfdbch0 points to \f0, which is Times New Roman. Western rules say that spaces between words (or Japanese characters) allow lines to be wrapped. This line wrapping is what we currently get with Aspose, and it is incorrect. Note especially the wrapping between the two Japanese sentences in the middle of the document.
In test-with-japanese-default.rtf, \stshfdbch14 points to \f14, which is the Japanese PMingLiU font. Word now uses the formatting options of Japanese, in which spaces between words/characters do not allow line wrapping to occur; all Japanese characters and embedded (“untranslated”) Western text flow on a line until the line is full. A line can be wrapped between any two Japanese characters or between Western words at space boundaries. This is the desired behavior.
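To make the difference between the two files concrete, here is a rough Python sketch that reads the \stshfdbchN value from RTF text and looks up the font name that \fN refers to. It is only an illustration: the function name east_asian_default_font and the two sample strings are made up for this post (they are not the attached files), and the font-table lookup assumes a very simple font table without nested groups.

```python
import re

# Hypothetical, heavily simplified stand-ins for the two attached files.
WITHOUT_JAPANESE_DEFAULT = (
    r"{\rtf1\ansi\deff0\stshfdbch0"
    r"{\fonttbl{\f0\froman Times New Roman;}{\f14\fnil PMingLiU;}}}"
)
WITH_JAPANESE_DEFAULT = (
    r"{\rtf1\ansi\deff0\stshfdbch14"
    r"{\fonttbl{\f0\froman Times New Roman;}{\f14\fnil PMingLiU;}}}"
)


def east_asian_default_font(rtf):
    # Return (N, font name) for the stshfdbchN control word, or None if absent.
    match = re.search(r"\\stshfdbch(\d+)", rtf)
    if match is None:
        return None
    index = int(match.group(1))
    # Simplified lookup of the "{\fN ... Name;}" font-table entry.
    # Assumes no nested groups inside the entry.
    entry = re.search(r"\{\\f%d(?!\d)[^{}]*? ([^;{}]+);" % index, rtf)
    name = entry.group(1).strip() if entry else None
    return index, name


if __name__ == "__main__":
    print(east_asian_default_font(WITHOUT_JAPANESE_DEFAULT))
    print(east_asian_default_font(WITH_JAPANESE_DEFAULT))
```

Run against the real files, a check like this should report font 0 for the first file and font 14 for the second, which is the only difference between them.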
I have searched the API and the forums, to no avail, for a way to achieve what \stshfdbchN does. Please add support for it.
Thanks