Invisible (zero-width) space characters not causing line breaks in PDF converted from DOCX file

We have a DOCX file that contains, among other content, a long string of numbers and characters separated at various locations by a specific character. We decided to place the Unicode zero-width space character (U+200B, \u200B) directly after each of these separating characters.
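
As an illustration only, the insertion step described above could be sketched in Python as follows. The function name `add_break_opportunities`, the separator set, and the sample string are assumptions made for the example; the actual separator character and the code that builds the DOCX are not stated here.

```python
ZERO_WIDTH_SPACE = "\u200b"  # U+200B ZERO WIDTH SPACE


def add_break_opportunities(text: str, separators: str = "/_.") -> str:
    """Return text with U+200B inserted directly after each separator.

    The default separators are hypothetical placeholders.
    """
    out = []
    for ch in text:
        out.append(ch)
        if ch in separators:
            # Invisible character that permits a line break at this point.
            out.append(ZERO_WIDTH_SPACE)
    return "".join(out)


sample = "ABC123/DEF456_GHI789"  # made-up data for illustration
marked = add_break_opportunities(sample)
```

The result looks identical to the input when displayed, but contains one extra invisible character after each separator, which a layout engine may treat as a permitted break point.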

In our DOCX file, this produces the expected result, which is nicely formatted line breaks.

But, when converting this DOCX file to a PDF document using Aspose.Words, the characters no longer offer the lines a chance to break. Is this a known issue, and if so, is there an alternate character we can use that will satisfy the PDF converter?

I have attached a screenshot of the DOCX (properly formatted) on the top, with the converted PDF on the bottom. Note how the PDF only breaks on actual spaces and hyphens, while the DOCX breaks on our Unicode character.

Hi Craig,

Thanks for your inquiry. Could you please attach your input Word document (.docx file) here for testing? I will investigate the issue on my side and provide you with more information.

Best regards,

Thank you very much for the quick response. Attached is the Word document that we’re converting.

Please let me know if there’s any additional information/data you need.

Hi Craig,

Thanks for your inquiry. I have attached a PDF file that is produced on my side here for your reference. Could you please also create and attach here a screenshot that highlights the problematic areas in this PDF file? I will investigate the issue further and provide you with more information.

Best regards,

Thank you for your response, and I apologize for the holiday delay.

Attached is a screenshot of a comparison of (1) the PDF output you provided and (2) my original Word document.

Please note how the content in the Word document properly fills the width of the table, breaking on the invisible spaces, while the PDF document seems to only break on the dashes, and on actual spaces in the word-based content. Also note that this premature line breaking in the PDF output causes the PDF version to require 4 lines of text to print the contents, while the Word document only uses 3.

Thanks again. Let me know if you need any additional information.

Hi Craig,

Thanks for the additional information.

Using the latest version of Aspose.Words (11.11.0), I managed to reproduce this issue on my side. I have logged it in our bug tracking system as WORDSNET-7585. Your request has also been linked to this issue, and you will be notified as soon as it is resolved.

Sorry for the inconvenience.

Best regards,

The issues you have found earlier (filed as WORDSNET-7585) have been fixed in this .NET update and this Java update.
