Aspose Words SaveAsPdf not honouring proofing language

Good morning,

We have found that when we set the proofing language in a Word document and convert it to PDF using the latest version of Aspose.Words, the language property of the PDF document is not set to the proofing language used in Word.

Please can you advise if this is expected behavior and how we can set the language (if necessary). This is required to meet EU laws on accessibility.

Thank you,

Stan

@modern.gov

Could you please ZIP and attach your input Word document, along with the problematic and expected output PDF files, here for testing? We will investigate the issue and provide you with more information on it.

Tahir,

Please find attached my test program, select button 3 ‘Welsh Proofing Test’.

If you check the Language setting for the produced PDF, it says en-GB, while the text in the Word document is in Welsh and has its proofing language set as such. I would have expected Aspose to set the language of the PDF’s tagged content based on that of the text in the source Word document.
AsposeDOCProperty.zip (21.6 KB)

@modern.gov

We have tested the scenario using the latest version of Aspose.Words for .NET 21.1 and have not found any issue with proofing. Please note that Aspose.Words mimics the behavior of MS Word. If you convert your document to PDF using MS Word, you will get the same output.

Hi Tahir,

Please can you advise on how you checked the proofing language in the pdf document, as I will need to verify this before we proceed.

Thank you,

Stan

@modern.gov

Please note that Aspose.Words mimics the behavior of MS Word. If you convert your document to PDF using MS Word, you will get the language setting as en-GB.

Hi Tahir,

Suppose I have a Word document containing a paragraph whose proofing language is set to cy-GB. If I then convert it to PDF, I assume that the equivalent paragraph in the PDF is also marked as being Welsh.

What I am asking is how, with Aspose.PDF, I can see what the proofing language for that paragraph in the PDF is.

Thank you,

Stan

@modern.gov

We have logged this problem in our issue tracking system as WORDSNET-21621. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

Hi Tahir,

I’m confused as to why you have raised a bug. Let me restate my understanding and what I need and hopefully you may be able to assist without raising a ticket.

Here is a Word document: WelshAndEnglish.zip (10.2 KB)

The first paragraph is in Welsh and has the proofing language set as cy-GB.
The second paragraph is in English and has the proofing language set as en-GB.

I export this to PDF using Aspose.Words, ensuring that I export the document structure so that these two paragraphs are in the TaggedContent. In an earlier message you said you had “not found any issue with proofing”. As such, I would expect the produced PDF to have the first paragraph set with a language of Welsh and the second paragraph set with a language of English.

I am not saying that this is not the case. What I am asking is how, using Aspose.PDF, I can open that PDF document and confirm that the first paragraph’s language in TaggedContent is still cy-GB and that the second one is still en-GB. It’s a code question: I need to be able to check, to satisfy myself that the correct language has been copied across into the tagged content.

I hope that clarifies,

Thanks,

Stan

@modern.gov

As shared in my earlier post, Aspose.Words mimics the behavior of MS Word. So, PDF files generated by Aspose.Words and MS Word have the same output.

In your code, you are using the TaggedContent.SetLanguage method to set the language of the document using Aspose.PDF. Aspose.Words does not have this feature. So, we logged a new feature request as WORDSNET-21621 for your requirement.

Please note that this feature is related to the reading option under the File > Properties > Advanced tab. See the attached image for details. Reading Option - Language.png (60.1 KB)

If your requirement is different, please let us know.

It would be great if you could ZIP and attach your expected output PDF document. We will then check the implementation of this feature and provide you with more information on it.

Moreover, please share the complete steps that you are using to detect the language of paragraphs in PDF document. Thanks for your cooperation.

Hi Tahir,

I have been able to obtain a third-party tool that allows me to inspect the tagged content of the document, and in doing so I have been able to confirm what you say (that the tagged content does carry the proofing language).

Because that meets the requirements I am working towards and I have been able to demonstrate that the functionality works, I am satisfied that nothing more is presently required. Please feel free to close WORDSNET-21621.

Please also accept my thanks for your assistance with this,

Thank you and all the best,

Stan
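
For anyone who wants to repeat this check without a dedicated inspection tool, here is a rough sketch in Python (standard library only, no Aspose API involved). It scans the raw bytes of a PDF for `/Lang` entries, which is the key the PDF format uses for the language of the document catalog and of structure elements or marked content. The function name `find_lang_entries` is made up for this illustration, and the approach only sees entries that are stored uncompressed; if the structure tree sits inside compressed object streams, a proper PDF library or inspection tool is still needed.

```python
import re

# Matches "/Lang (cy-GB)" (literal string) and "/Lang <63792D4742>" (hex string).
LANG_ENTRY = re.compile(rb"/Lang\s*(?:\(([^)]*)\)|<([0-9A-Fa-f\s]*)>)")


def find_lang_entries(pdf_bytes: bytes) -> list:
    """Return every /Lang value found in uncompressed parts of a PDF, in file order.

    Hypothetical helper for illustration only; it does not parse the PDF
    object structure and cannot see entries inside compressed streams.
    """
    found = []
    for match in LANG_ENTRY.finditer(pdf_bytes):
        literal, hex_digits = match.group(1), match.group(2)
        if literal is not None:
            raw = literal
        else:
            digits = b"".join(hex_digits.split()).decode("ascii")
            if len(digits) % 2:
                digits += "0"  # PDF pads an odd-length hex string with a trailing 0
            raw = bytes.fromhex(digits)
        if raw.startswith(b"\xfe\xff"):
            # Text strings with a byte order mark are UTF-16BE.
            found.append(raw[2:].decode("utf-16-be"))
        else:
            # Language tags are plain ASCII, so Latin-1 is a safe approximation here.
            found.append(raw.decode("latin-1"))
    return found
```

Calling it on the bytes of the converted file, for example `find_lang_entries(open("WelshAndEnglish.pdf", "rb").read())` (the file name is assumed), lists the language values in the order they appear in the file. It does not say which paragraph each value belongs to, so treat it as a quick sanity check rather than a full verification.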

@modern.gov

Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

@modern.gov

We have closed the issue (WORDSNET-21621) as ‘Not a Bug’.

Please note that Aspose.Words mimics the behavior of MS Word. If you perform the same scenario using MS Word, you will get the same output.

When saving a document to PDF, MS Word sets the default document language according to the default editing language in the MS Word settings. If a particular run node’s language differs from the default value, it is exported via logical structure/marked content.

Aspose.Words does not have access to the default editing language in the MS Word settings, so it sets the default PDF document language according to the default locale. Aspose.Words also marks run nodes whose language differs from that default via logical structure/marked content.
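
The rule described above can be sketched in a few lines of Python. This is only a model of the described behaviour, not Aspose.Words or MS Word code; `Run` and `plan_language_tags` are hypothetical names introduced for the illustration. Each run either inherits the document default (no explicit language is written) or carries its own language override.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Run:
    """A piece of text with its proofing language (hypothetical model type)."""
    text: str
    language: str  # e.g. "cy-GB"


def plan_language_tags(
    default_language: str, runs: List[Run]
) -> List[Tuple[str, Optional[str]]]:
    """Pair each run's text with the language override to export.

    None means the run matches the document default and needs no
    explicit language in the logical structure/marked content.
    """
    plan = []
    for run in runs:
        same_as_default = run.language.casefold() == default_language.casefold()
        plan.append((run.text, None if same_as_default else run.language))
    return plan
```

With a default of en-GB, a Welsh run such as `Run("Bore da", "cy-GB")` gets an explicit cy-GB override while an English run gets none. That is consistent with what was reported in this thread: the document-level Language property showed en-GB even though the Welsh paragraph was tagged correctly.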