Feature request: allow setting the document default language

Hi,

We are translating Word documents into foreign languages. We use Aspose to read the document, put all sentences into an XML file, translate that, and then merge the translation back by iterating the original document and replacing the original sentences.

This works well for Western documents that use ANSI fonts and Western formatting rules, but it produces incorrect results when the target language is, for example, Japanese.

Word can switch on specific formatting options for non-Western documents by means of the RTF control word \stshfdbchN. In essence, it declares the document to be in a certain language and to use the formatting rules of that language. Word uses the properties of the specified font \fN to derive the formatting options.
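To make the lookup concrete, here is a minimal sketch in plain Java (no Aspose API involved) that finds \stshfdbchN in raw RTF text and resolves N against the font table. The class and method names are hypothetical, and the regular expression assumes simple font-table entries of the form {\fN... Name;} with no nested groups, so it is an illustration rather than a real RTF parser.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class RtfFarEastDefault {
    // Matches the control word \stshfdbchN and captures N.
    private static final Pattern STSHFDBCH = Pattern.compile("\\\\stshfdbch(\\d+)");

    // Returns N from the first \stshfdbchN, or empty if the control word is absent.
    static OptionalInt farEastFontIndex(String rtf) {
        Matcher m = STSHFDBCH.matcher(rtf);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    // Returns the name of font \f<index> from a simple font-table entry such as
    // {\f14\fnil\fcharset136 PMingLiU;}, or null if no such entry is found.
    // Assumption: the entry contains no nested groups (e.g. no {\*\panose ...}).
    static String fontName(String rtf, int index) {
        Pattern entry = Pattern.compile(
                "\\{\\\\f" + index + "(?!\\d)[^{};]*? ([^{};]+);\\}");
        Matcher m = entry.matcher(rtf);
        return m.find() ? m.group(1).trim() : null;
    }
}
```

Calling farEastFontIndex on the document text and passing the result to fontName gives the font whose properties Word would consult, under the assumptions above.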

The attached files “test-without-japanese-default.rtf” and “test-with-japanese-default.rtf” are identical with the exception of \stshfdbch.

In test-without-japanese-default.rtf, \stshfdbch0 points to \f0, which is Times New Roman. Western rules say that spaces between words (or Japanese characters) allow lines to be wrapped. This line wrapping is what we currently get with Aspose, and it is incorrect. Note especially the wrapping between the two Japanese sentences in the middle of the document.

In test-with-japanese-default.rtf, \stshfdbch14 points to \f14, which is the Japanese PMingLiU font. Word now uses the formatting options of Japanese, in which spaces between words/characters do not allow line wrapping to occur; all Japanese characters and embedded (“untranslated”) Western text flow along a line until the line is full. A line can be wrapped between any two Japanese characters or between Western words at space boundaries. This is the desired behavior.
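As a rough illustration of the last rule (wrap between any two Japanese characters, or between Western words at a space), here is a toy sketch in plain Java. It is not how Word or Aspose implements layout. The class and method names are hypothetical, only BMP characters are handled, and real Japanese typesetting has further restrictions (for example on punctuation) that this ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, for illustration only.
class FarEastBreakSketch {
    // Treats Han, Hiragana and Katakana characters as "Japanese characters".
    static boolean isCjk(char c) {
        Character.UnicodeScript s = Character.UnicodeScript.of(c);
        return s == Character.UnicodeScript.HAN
                || s == Character.UnicodeScript.HIRAGANA
                || s == Character.UnicodeScript.KATAKANA;
    }

    // Returns every index i at which a line may start, i.e. a break is
    // allowed between text.charAt(i - 1) and text.charAt(i).
    static List<Integer> breakOpportunities(String text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i < text.length(); i++) {
            char prev = text.charAt(i - 1);
            char next = text.charAt(i);
            boolean afterSpace = prev == ' ' && next != ' '; // Western word boundary
            boolean betweenCjk = isCjk(prev) && isCjk(next);  // any two Japanese characters
            if (afterSpace || betweenCjk) {
                result.add(i);
            }
        }
        return result;
    }
}
```

In this model a run of Japanese characters can wrap anywhere inside the run, while embedded Western text only wraps after a space.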

I have looked in the API and forums for a way to achieve what \stshfdbchN does to no avail. Please add support for it.

Thanks

Hi Nils,

Thanks for your request. What you are looking for is CompatibilityOptions.UseFELayout. You can set this option using Aspose.Words. Please see the following simple code:

Document doc = new Document("C:\\Temp\\test-without-japanese-default.rtf");
doc.getCompatibilityOptions().setUseFELayout(true);
doc.save("C:\\Temp\\out.docx");

Unfortunately, this option currently works only for the DOCX format. Your request has been linked to the appropriate issue, and you will be notified as soon as the option also works for other formats.
Best regards,

A small correction: UseFELayout is just an option that allows setting this layout in DOCX. In other formats, it seems the setting is stored differently (not in CompatibilityOptions). We will investigate the issue further and provide you with more information.

Best regards,

The issues you have found earlier (filed as WORDSNET-4079) have been fixed in this .NET update and this Java update.


Hi Nils,

Thanks for being patient. As an update, we have publicly exposed document-wide defaults as part of WORDSNET-4079:

// Newly exposed properties:
//   Font Document.Styles.DefaultFont;
//   ParagraphFormat Document.Styles.DefaultParagraphFormat;

// Example: set the default Far East font and the default paragraph spacing.
Document doc = new Document();
doc.Styles.DefaultFont.NameFarEast = "PMingLiU";
doc.Styles.DefaultParagraphFormat.SpaceAfter = 20;

Note that document-wide defaults were introduced in Microsoft Word 2007 and are fully supported in OOXML formats only. Earlier document formats have limited support for default text formatting (only font names can be stored) and no support for default paragraph formatting. When the target document format does not support it, default paragraph formatting is copied to all top-level styles.

Best regards,