Recognize document language(s)

I am looking for a way to recognize/classify the language code(s) of a (Word) document: nl, en, fr, …
Is this possible with Aspose?
The only thing I found is OCR in different languages.

@wfbe

Currently, there is no straightforward method to detect the language of a Word document using the Aspose.Words API. This functionality has been noted as a requirement and is being tracked for future consideration, but as of now, it is not implemented.

However, you can utilize the Font.LocaleId property to get the locale identifier for text runs within the document. Here’s a simple code example demonstrating how to retrieve the locale ID for each run in a Word document:

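using System;
using Aspose.Words;

// Print the locale identifier (LCID) stored on the font of every run.
Document doc = new Document("C:\\Temp\\in1.docx");
foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
{
    Console.WriteLine("Run LocaleId = " + run.Font.LocaleId);
}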

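Please note that the locale ID is an optional attribute in MS Word, so it may not be set explicitly on every text run. It also records only the proofing language assigned to the text, which is not necessarily the language the text is actually written in. For a comprehensive list of locale identifiers, refer to Microsoft's locale identifier (LCID) documentation.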
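Because the locale ID is stored per run, one rough way to approximate a document's language(s) is to total the characters tagged with each locale. The following is only a sketch under that assumption, not a built-in Aspose.Words feature: LocaleTally and Summarize are hypothetical names, and the result reflects whatever proofing language Word assigned, not the detected content. The helper itself uses only the .NET base library; the commented lines show how it could be fed from Aspose.Words.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class LocaleTally
{
    // Sums character counts per culture name (for example "nl-NL") over a
    // sequence of (localeId, text) pairs. Whitespace-only text is skipped;
    // IDs that .NET cannot map to a culture are grouped under "unknown".
    public static Dictionary<string, int> Summarize(
        IEnumerable<(int LocaleId, string Text)> runs)
    {
        var totals = new Dictionary<string, int>();
        foreach (var (localeId, text) in runs)
        {
            if (string.IsNullOrWhiteSpace(text))
                continue;

            string name;
            try
            {
                name = CultureInfo.GetCultureInfo(localeId).Name;
            }
            catch (ArgumentException)
            {
                // Covers CultureNotFoundException and negative IDs.
                name = "unknown";
            }

            totals[name] = totals.TryGetValue(name, out int n)
                ? n + text.Length
                : text.Length;
        }
        return totals;
    }
}

// Possible usage with Aspose.Words (also requires using System.Linq;
// not compiled as part of this sketch):
//   var doc = new Aspose.Words.Document("C:\\Temp\\in1.docx");
//   var runs = doc.GetChildNodes(Aspose.Words.NodeType.Run, true)
//                 .Cast<Aspose.Words.Run>()
//                 .Select(r => (r.Font.LocaleId, r.Text));
//   foreach (var pair in LocaleTally.Summarize(runs))
//       Console.WriteLine(pair.Key + ": " + pair.Value + " characters");
```

Weighting by character count rather than by number of runs keeps a few long paragraphs from being outvoted by many short runs. Documents whose text was never assigned the right proofing language will still be misreported.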

If you are looking for OCR capabilities to recognize text in various languages, Aspose.OCR supports multiple recognition languages [1] and can automatically detect the language of the text in images. However, this is a separate product and is not part of Aspose.Words.

For updates on language detection in Word documents, follow the ticket that tracks this feature in the issue tracking system.

Sources:
[1]: Recognition languages - Aspose Documentation

@alexey.noskov, is the bot's answer correct?
Is language classification on the roadmap?

@wfbe
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-28219

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.