Document language is changed from Hebrew to Arabic after DOCX>HTML>DOCX using .NET

Hi,
When we use Aspose to convert HTML to MS Word, Word detects the document language incorrectly: it reports Arabic instead of Hebrew.
Can you please fix this issue?

@omri-1

To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input HTML document.
  • Please attach the output Word file that shows the undesired behavior.
  • Please create a simple application ( source code without compilation errors ) that helps us to reproduce your problem on our end and attach it here for testing.

As soon as you get these pieces of information ready, we will start investigation into your issue and provide you more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Attached 2 docs:
1 - the original doc
2 - the doc after converting doc 1 to HTML and converting it back to doc (with stds)
aspose 29.zip (58.3 KB)

@omri-1

Thanks for sharing the detail. Please create a simple application ( source code without compilation errors ) that helps us to reproduce your problem on our end and attach it here for testing.

Thanks for your cooperation.

Code:
new Aspose.Words.Document(@"E:\WordTest\31.docx")
    .Save(@"E:\WordTest\31.htm", new HtmlSaveOptions
    {
        HtmlVersion = Aspose.Words.Saving.HtmlVersion.Html5,
        ExportImagesAsBase64 = true,
        ExportHeadersFootersMode = Aspose.Words.Saving.ExportHeadersFootersMode.None,
        ExportListLabels = Aspose.Words.Saving.ExportListLabels.AsInlineText,
    });
new Aspose.Words.Document(@"E:\WordTest\31.htm")
    .Save(@"E:\WordTest\31_2.docx", SaveFormat.Docx);

31.docx language: Hebrew
31_2.docx language: Arabic

@omri-1

We have tested the scenario and managed to reproduce the same issue on our side for both documents. We have logged the problems in our issue tracking system as WORDSNET-18871 and WORDSNET-18872. You will be notified via this forum thread once these issues are resolved.

We apologize for your inconvenience.

Can you please update with a fix or ETA for this one?

@omri-1,

Unfortunately, there are no estimates available at the moment. We will keep you posted on any further updates on these issues.

This really messes up the spell checking in Word, and it is very important to us.

@omri-1,

I am afraid, there is no further news on these issues so far. These issues have 'Normal Priority' in our issue tracking system. If these issues are important to you, and for the fast resolution of these issues, please have a look at the paid support options - e.g. purchasing 'Paid Support' will allow you to post your issues in our Paid Support Helpdesk and raise the priority level of these issues. Many 'Paid Support' customers find that this leads to their issues being fixed in the next release of the software.

If you would like to take advantage of the 'Paid Support' then please request a quote in our purchase forum - Aspose.Purchase - Free Support Forum - aspose.com.

If you already have a Paid Support account, then please use it to directly post your important queries in Paid Support Helpdesk .

We apologize for any inconvenience and thank you for your understanding.

@awais.hafeez Based on promises like these, we purchased Paid Support, but none of the issues we opened via Paid Support have been resolved either. We have issues in Paid Support that have been waiting for over 5 months!

So I don't think that Paid Support is the answer (I wish it was that simple). From our point of view, this is a "false promise" that got us to pay more for Paid Support while our issues did not get resolved.

@omri-1,

Please communicate your concerns in our Paid Support Helpdesk by commenting directly against your tickets as this is the quickest option to get updates about your high priority issues. Thanks for your cooperation.

@awais.hafeez I'm doing that every month…

@omri-1,

Thank you for reporting this situation to us and we are very sorry for the inconvenience caused to you due to these issues. We are going to escalate this situation regarding your issues, so we could investigate and find out what went wrong here. We will keep you posted on further updates in your paid support threads.

Thank you @awais.hafeez

@omri-1

The status and progress of your tickets have been updated in our paid support system, so please check back in the helpdesk and also discuss with paid support team there for any further concerns.

The issues you have found earlier (filed as WORDSNET-18872) have been fixed in this Aspose.Words for .NET 20.3 update and this Aspose.Words for Java 20.3 update.

It has been partially fixed. If you try the files I've attached, you will see that now some of the words are recognized as Hebrew, some as English, and some as Arabic (while they are all Hebrew and English).

@omri-1

Please set HtmlSaveOptions.ExportLanguageInformation to true as shown below to get the correct output.

new Aspose.Words.Document(MyDir + @"2.docx").Save(MyDir + @"31.htm", new HtmlSaveOptions
{
    HtmlVersion = Aspose.Words.Saving.HtmlVersion.Html5,
    ExportImagesAsBase64 = true,
    ExportHeadersFootersMode = Aspose.Words.Saving.ExportHeadersFootersMode.None,
    ExportListLabels = Aspose.Words.Saving.ExportListLabels.AsInlineText,
    ExportLanguageInformation = true
});

new Aspose.Words.Document(MyDir + @"31.htm")
.Save(MyDir + @"31_2.docx", SaveFormat.Docx);
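
To check which language each piece of text ends up tagged with after the round trip, one option is to dump the locale IDs stored on every run of the output document. The following is only a rough sketch, not an official diagnostic: the class name LocaleDump, the helper Describe, and the default file path are illustrative assumptions, and it relies only on the Font.LocaleId, Font.LocaleIdBi and Font.Bidi properties of Aspose.Words.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using Aspose.Words;

class LocaleDump
{
    // Hypothetical helper: turn an LCID into a culture name,
    // falling back to the raw number if the runtime cannot resolve it.
    static string Describe(int lcid)
    {
        try { return CultureInfo.GetCultureInfo(lcid).Name; }
        catch (Exception) { return lcid.ToString(); }
    }

    static void Main(string[] args)
    {
        // Assumed default path, taken from the repro code earlier in this thread.
        string path = args.Length > 0 ? args[0] : @"E:\WordTest\31_2.docx";
        Document doc = new Document(path);

        // Count runs per (LocaleId, LocaleIdBi, Bidi) combination.
        var counts = new SortedDictionary<string, int>();
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
        {
            string key = "LocaleId=" + Describe(run.Font.LocaleId)
                       + ", LocaleIdBi=" + Describe(run.Font.LocaleIdBi)
                       + ", Bidi=" + run.Font.Bidi;
            counts[key] = counts.TryGetValue(key, out int n) ? n + 1 : 1;
        }

        foreach (var pair in counts)
            Console.WriteLine(pair.Key + ": " + pair.Value + " run(s)");
    }
}
```

Running this against both the original and the round-tripped file and comparing the two summaries should show whether Hebrew runs come back tagged with an Arabic locale.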

@omri-1

After further investigation, we have noticed that WORDSNET-18872 is only partially fixed. We have logged a new issue for this case as WORDSNET-20108. We apologize for your inconvenience.
