Register Hyphenation Dictionary & Convert Word Document with German Text to PDF (C# .NET)

We use Aspose.Words to convert Word files to PDF in a solution for a customer.
The customer reports that the formatting in the PDF often differs from the Word file, e.g. footnotes are formatted differently and text that sits on one page in Word is moved to the next page in the PDF.
We use version 18.5.0 for the conversion, but I also tried the newest versions, 21.6.0 and 21.7.0, and the problem still exists.
The customer used Word 2013 until now and has now upgraded to Word 2019. I also tried setting LoadOptions.MswVersion and LoadOptions.LoadFormat to optimize for docx and Word 2013 or 2019. This didn’t resolve the problem either.
I have an example file that demonstrates the problem, but sadly I can’t see an option to attach files here.
Please contact me so I can send you the example Word file and the PDFs produced by the different versions.

@bernerMaettu,

Please first compress your simplified Word document and PDF file to ZIP format and then upload the ZIP file as this forum allows .zip file extensions. You may also upload the .zip file to Dropbox or any other file hosting service and share the download link here for testing.

Example.zip (1.2 MB)
Thanks for your reply. You can find the example Word file and the different resulting PDFs in the attachment above.
The problem is on page 3, where the content of the page differs between the Word file and the PDFs. There is also a problem with footnote no. 3, which isn’t formatted correctly in the PDFs.

@bernerMaettu,

We have converted your “Dokumentation_Pflege_Phoenix.docx” file to PDF in two ways, with the licensed latest (21.7) version of Aspose.Words for .NET and with the Save As command of MS Word 2019, and attached both PDF files here for your reference.

Third page contents in both these PDF files look identical on our end. We used the following simple code on our end:

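using Aspose.Words;

// Apply the license so the output is not produced in evaluation mode.
License lic = new License();
lic.SetLicense("C:\\Temp\\Aspose.Words.lic");
// Load the DOCX and save it as PDF; the format is inferred from the extension.
Document doc = new Document("C:\\Temp\\Dokumentation_Pflege_Phoenix.docx");
doc.Save("C:\\Temp\\awnet-21.7.pdf");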

Do you see the same problems in these PDF files on your end? If yes, can you please provide a comparison screenshot that highlights (circles) the problematic areas in the Aspose.Words output and attach it here for our reference? Please also manually convert “Dokumentation_Pflege_Phoenix.docx” to PDF with MS Word and attach the MS Word generated PDF as well. We will then investigate the issue further and provide you with more information.

Thanks for the quick response

I have attached two screenshots. Word.JPG shows how the document looks when opened in Word, and Pdf.JPG shows how the document looks after conversion to PDF by Aspose.Words. As you can see, text is missing on the page in the PDF, and the formatting of footnote no. 3 differs between Word and PDF. I have also attached the PDF that is generated by Word directly. Its formatting shows no difference from the original Word file.
DokumentationPflegePhoenix.pdf (206.8 KB)
Word.JPG (94.2 KB)
Pdf.JPG (82.6 KB)

@bernerMaettu,

To track fixes for these problems, we have logged the following issues in our issue tracking system.

  • WORDSNET-22483: German Text Paragraphs Move to Next Page during Word to PDF Conversion
  • WORDSNET-22484: Footnote Text Formatting not Preserved during Word to PDF Conversion

We will look further into the details of these problems and will keep you updated on the status of the linked issues. We apologize for the inconvenience.

@bernerMaettu,

Regarding WORDSNET-22483, we have completed our work on this issue and decided to close it with “not a bug” status. Your code does not specify a hyphenation dictionary for the document language (German-Switzerland). Because of that, some words are not hyphenated in the Aspose.Words output. Specifically, “krankheitsbezogene” in paragraph 5 on page 2 is not hyphenated, so the containing paragraph takes one extra line in the Aspose.Words layout. That extra line shifts the following content and leads to the issues you reported here.

So, please register a hyphenation dictionary (hyph_de_CH.zip (24.8 KB)) for the appropriate language to get correct hyphenation.

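using Aspose.Words;

// Register the de-CH dictionary before loading, so layout can hyphenate Swiss German text.
Hyphenation.RegisterDictionary("de-CH", "C:\\Temp\\hyph_de_CH.dic");
Document doc = new Document("C:\\Temp\\Docs\\Dokumentation_Pflege_Phoenix.docx");
doc.Save("C:\\Temp\\Docs\\awnet-21.7.pdf");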
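
If a document mixes several languages, it can be hard to know up front which dictionaries are needed. The following is a minimal sketch, not the method used in this thread, that relies on Hyphenation.Callback and IHyphenationCallback to log every language the layout engine asks for and to register a matching dictionary on demand. The class name OnDemandHyphenation, the dictionary folder, the output file name, and the "hyph_xx_YY.dic" file naming convention are assumptions made for illustration.

```csharp
using System;
using System.IO;
using Aspose.Words;

// Hypothetical helper: registers a dictionary the first time a language is requested.
class OnDemandHyphenation : IHyphenationCallback
{
    // Assumed folder that holds files such as hyph_de_CH.dic.
    private readonly string _dictionaryFolder;

    public OnDemandHyphenation(string dictionaryFolder)
    {
        _dictionaryFolder = dictionaryFolder;
    }

    // Called by Aspose.Words when it needs a dictionary that is not registered yet.
    public void RequestDictionary(string language)
    {
        Console.WriteLine("Hyphenation dictionary requested for: " + language);

        // Assumed naming convention: "de-CH" maps to "hyph_de_CH.dic".
        string fileName = "hyph_" + language.Replace('-', '_') + ".dic";
        string path = Path.Combine(_dictionaryFolder, fileName);

        if (File.Exists(path))
            Hyphenation.RegisterDictionary(language, path);
        else
            Console.WriteLine("No dictionary file found at: " + path);
    }
}

class Program
{
    static void Main()
    {
        // Install the callback before the document is laid out.
        Hyphenation.Callback = new OnDemandHyphenation("C:\\Temp");

        Document doc = new Document("C:\\Temp\\Docs\\Dokumentation_Pflege_Phoenix.docx");
        // Saving to PDF builds the page layout, which is when dictionaries are requested.
        doc.Save("C:\\Temp\\Docs\\out-on-demand.pdf");
    }
}
```

The console output lists each requested language, which shows whether a tag such as de-CH is being hyphenated without a dictionary.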

However, even after the dictionary is registered, the document is still not rendered properly, because there is another problem related to paragraph pagination rules (widow/orphan control). See out-with-correct-hyphenation.aw.pdf (162.2 KB); as you can see, there is a problem on 7/8 pages. To address this problem, we have logged a separate issue with ID WORDSNET-22495 and linked your thread to it.
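
To see which paragraphs are subject to such pagination rules, the settings can be read from the document model. This is only an inspection sketch under assumptions: it reports the WidowControl, KeepWithNext, and KeepTogether flags of each paragraph and does not work around WORDSNET-22495. The class name PaginationRuleReport is hypothetical, and the input path is the one used in the snippet above.

```csharp
using System;
using Aspose.Words;

// Hypothetical diagnostic: lists paragraphs whose pagination flags are set.
class PaginationRuleReport
{
    static void Main()
    {
        Document doc = new Document("C:\\Temp\\Docs\\Dokumentation_Pflege_Phoenix.docx");

        int index = 0;
        // Walk every paragraph in the document, including those nested in tables.
        foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            ParagraphFormat format = para.ParagraphFormat;

            if (format.WidowControl || format.KeepWithNext || format.KeepTogether)
            {
                Console.WriteLine(
                    "Paragraph {0}: WidowControl={1}, KeepWithNext={2}, KeepTogether={3}",
                    index, format.WidowControl, format.KeepWithNext, format.KeepTogether);
            }

            index++;
        }
    }
}
```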

@awais.hafeez
I have implemented the hyphenation dictionary and it works as expected.
Is there any news regarding issue WORDSNET-22495?

@bernerMaettu,

I have checked the status of WORDSNET-22495 in our issue tracking system and regret to tell you that work on this issue has been postponed to a later date, and no estimate (ETA) is available at the moment. We will inform you here as soon as this issue is resolved. We apologize for the inconvenience.

@bernerMaettu,

Regarding WORDSNET-22484, we have completed our work on this issue and decided to close it with “not a bug” status, because the footnote text is hyphenated correctly with the dictionary and code snippet provided earlier.

The issues you found earlier (filed as WORDSNET-22495) have been fixed in Aspose.Words for .NET 24.4, which is also available on NuGet.