Word vs pdf output differences

Hi,
We create many documents in Words. We then output them in either docx or pdf form. We have noticed differences in the outputs. The attached pptx shows one such example.
You will notice that the (53) heading has a different tab spacing in the generated docx vs the pdf file. There are also differences in the wrapping of the paragraph.
I’m also attaching the two output routines, which both take the same generated Document as an argument.
Is there something else I need to do in those routines to make the docx and pdf outputs look the same?
Thanks
Brad

Hi Brad,

Thanks for your inquiry. Please share the following details so that we can investigate.

  • Please attach your input Word document.
  • Please attach the output Word/Pdf file that shows the undesired behavior.

As soon as you get these pieces of information to us, we’ll start our investigation into your issue.

Hi Tahir,
I’ve included four files in the zip:
PDF conversion issue.pptx - shows the wrapping issue on paragraph (53)
ROC-31R-B003_LP.docx - the source file
ROC-31R-B003_LP.pdf - our output file
DocxToPdfRoutine.txt - the routine used to convert docx to pdf
Note the last line in paragraph (53) in the docx file just has ‘activated’, while the pdf has ‘function is activated’.
Thanks for your help,
Brad

Hi Brad,

Thanks for sharing the details. I have tested the scenario and reproduced the issue on my side. I have logged this problem in our issue tracking system as WORDSNET-11003 and linked this forum thread to it, so you will be notified via this thread once the issue is resolved.

We apologize for the inconvenience.

Tahir,
Thank you for your quick response.
Looking forward to the resolution.
Brad

Hi Tahir,
Attached is another file with a pdf transform problem. This time, the table at the end of the document gets really messed up. I have had other files with a similar table that do not have this problem, but this one is bad.
Brad
PS. After sending the file, I cleaned up the styles in the word doc and then it transformed correctly. So, maybe some oddity with the styles is causing the problem.

Hi Brad,

Thanks for sharing the details. I have tested the scenario and reproduced the issue on my side. I have logged this problem in our issue tracking system as WORDSNET-11007 and linked this forum thread to it, so you will be notified via this thread once the issue is resolved. We apologize for the inconvenience.

Please share any other documents that show these problems. We will investigate them and give you more information.

Tahir,
Attached is another docx/pdf pair with a pdf generation problem.
Notice the second and third paragraphs in the box on page 2 are indented in the pdf, but not in the word file.
Brad
P.S. These pdf problems are causing our reputations (Aspose’s and mine) to be tarnished. Our users are now just creating the docx and using Word to make the pdfs. We want to generate them automatically (and have the code ready to do that when the errors are fixed).

Hi Brad,

Thanks for your inquiry. I have tested the scenario using the latest version, Aspose.Words for .NET 14.10.0, and could not reproduce the issue. Please use Aspose.Words for .NET 14.10.0. I have attached the output document to this post for your reference.

Please let us know if you have any more queries.

Hi Tahir,
I did download the 14.10.0 version and found that the last problem is not a problem.
Is there a way to track the previous issues (WORDSNET-11003 and 11007)? They were not solved in the 14.10.0 download.
Thanks,
Brad

Hi Brad,

Thanks for your inquiry. You can ask for an update on the reported issues via the Aspose.Words forum. We will check their status in our issue tracking system and update you here. We have already linked this forum thread to WORDSNET-11003 and WORDSNET-11007, so you will be notified via this thread once these issues are resolved.

Please let us know if you have any more queries.

The issues you have found earlier (filed as WORDSNET-11003) have been fixed in this .NET update and this Java update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.

I downloaded the latest Total package (15.3.0) and copied the files to my bin directory.
The results are shown in the attached file.
It does not appear that WORDSNET-11003 is fixed on my computer.
Do you see anything that looks incorrect in my bin folder or anything else?
Thanks

Hi Brad,

Thanks for your inquiry. This issue is related to hyphenation. If hyphenation is turned off for the problematic paragraphs, the layout in MS Word becomes similar to the Aspose.Words output (because Aspose.Words does not hyphenate by default).

To turn hyphenation on in Aspose.Words, register an appropriate dictionary via Hyphenation.RegisterDictionary. Please see the following documentation link for reference.
https://docs.aspose.com/words/net/working-with-hyphenation/

Hi Tahir,
I looked at the test document (ROC-31R-B003) and find that all the heading paragraphs do have hyphenation turned off. That is also true of the template file on which they are based. If that is not true of the file I sent, please let me know and I will send another copy.
If there is something else I can do, please let me know. As for now, the pdf output still does not match the word output and WORDSNET-11003 is not solved.
Brad

Hi Brad,

Thanks for your inquiry. Please open your input document in MS Word and share the hyphenation setting from the Page Layout tab. See the attached image for details. Then set the hyphenation setting to ‘none’ and convert the document to PDF using Aspose.Words.

Hi Tahir,
I opened the file and it was Automatic as shown in your picture.
I changed it to None as shown in the attached picture.
When the pdf is generated, it still does not match the docx file.
When this ultimately gets fixed, I may still need to change or view the hyphenation mode in code. How is that done?
Thanks,
Brad

Hi Brad,

Thanks for your inquiry. I have set the hyphenation to ‘none’ on my side and Aspose.Words generates the correct output. Please check the attached input and output documents.

However, with hyphenation set to ‘Automatic’, the output PDF differs from the input Word document. I have logged this problem in our issue tracking system as WORDSNET-11743. You will be notified via this forum thread once this issue is resolved. We apologize for the inconvenience.

To turn hyphenation on in Aspose.Words, register an appropriate dictionary via Hyphenation.RegisterDictionary. Please see the following documentation link for reference.
https://docs.aspose.com/words/net/working-with-hyphenation/

I used the following code example to test the hyphenation issue.

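// MyDir is the folder that holds the test files.
Document doc = new Document(MyDir + "ROC-31R-B003_LP.docx");
// Register an en-US hyphenation dictionary so that Aspose.Words can hyphenate en-US text.
Hyphenation.RegisterDictionary("en-US", MyDir + @"hyph_en_US.dic");
doc.Save(MyDir + "Out.pdf");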

Hi Tahir,
Thank you for the response.
An interesting thing is happening. The docx file that you returned displays differently in Word than the one I sent (even after I had turned off hyphenation, too)! It does match the output that we get.
Any idea why your file displays differently in Word? Did you do anything besides turn off the hyphenation? I compared the two files in Word and it found no difference.
Thanks for the info on turning on hyphenation. I was looking for how to turn it off in code. Is it just automatically off if I don’t add a dictionary? Doesn’t the docx have that encoded somehow if hyphenation is on in the document?
Thanks for your help.
Brad

Hi Brad,

Thanks for your inquiry.

*brad75552:

Any idea why your file displays differently in Word? Did you do anything besides turn off the hyphenation? I compared the two files in Word and it found no difference.*

I did not change your document except turning off the hyphenation.

*brad75552:

I was looking for how to turn it off in code. Is it just automatically off if I don’t add a dictionary? Doesn’t the docx have that encoded somehow if hyphenation is on in the document?*

Yes, if you do not register a dictionary, hyphenation is off. The Hyphenation class provides methods for working with hyphenation dictionaries. These dictionaries prescribe where words of a specific language can be hyphenated.

The Hyphenation.IsDictionaryRegistered method returns False if no dictionary is registered for the specified language or if the registered dictionary is a Null dictionary, and True otherwise.

Hyphenation.UnregisterDictionary method unregisters a hyphenation dictionary for the specified language.
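
As a rough sketch of how these three methods could be combined to switch hyphenation on and off in code: the file paths, the class name and the SaveAsPdf helper below are hypothetical and only for illustration, and the sketch has not been run against your document. It reloads the document before each save so that each conversion starts from a fresh Document.

```csharp
using System;
using Aspose.Words;

class HyphenationToggleSketch
{
    // Hypothetical locations; replace with your own folder and files.
    const string Dir = @"C:\Temp\";
    const string Language = "en-US";

    // Hypothetical helper: loads the source file and saves it as PDF.
    static void SaveAsPdf(string outputName)
    {
        Document doc = new Document(Dir + "ROC-31R-B003_LP.docx");
        doc.Save(Dir + outputName);
    }

    static void Main()
    {
        // Report whether a dictionary is currently registered for the language.
        Console.WriteLine(Hyphenation.IsDictionaryRegistered(Language));

        // Hyphenation on: register a dictionary for the language, then convert.
        Hyphenation.RegisterDictionary(Language, Dir + "hyph_en_US.dic");
        SaveAsPdf("WithDictionary.pdf");

        // Hyphenation off again: unregister the dictionary, then convert.
        Hyphenation.UnregisterDictionary(Language);
        SaveAsPdf("WithoutDictionary.pdf");
    }
}
```

Dictionaries are registered per language through static methods rather than per document, so a registered dictionary applies to every later conversion until it is unregistered.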