Free Support Forum - aspose.com

Issues coming up while converting Word doc to Pdf

Hi,

We have been using Aspose Total to generate reports in our application. We recently added a feature that converts Word documents to PDF, and we are running into some issues with it.

Please find attached a Word document created with Aspose.Words that shows problems when converted to PDF with Aspose.Pdf. These are the issues we found:

1) Some text and lines become bold even though they are not so in the original word file.

2) The diagonal line in the form (別紙様式3), Page 3, last table is lost after conversion to Pdf.

3) The left most vertical line of the table in Form (別紙様式4) is lost on some pages in the Pdf.

4) An extra row gets added to the end of each table in the Pdf.

5) In form (別紙様式第7) the orientation for the vertical text, 医薬品の名称, changes and it gets rotated.

I would have attached the converted PDF file too, but I could not find a way to attach two files to this message. Please reproduce it on your side by converting the document to PDF.

The attached report has to be sent to Regulatory Authorities, and the template is provided by the Authorities themselves, so we need a way to preserve the template design across conversions.

It would be really helpful if you could look into these issues and come up with a solution. If they cannot be resolved immediately and you track them under an issue number, please let me know that number so that we can follow the same issue.

Thanks in Advance,

Gaurav

Hi,

When I try to convert the doc to PDF, I get an Out of Memory exception. Can you please tell us which versions of Aspose.Pdf and Aspose.Words you are using? Can you also provide the PDF that you got and the code that you are using?

Thanks.

The out of memory exception is thrown by Aspose.Words, so I am moving this post to the Aspose.Words forum. We will investigate the formatting issues once this bug is fixed.

Hello!

Thank you for your inquiry.

I have reproduced the exception with the latest version of Aspose.Words and logged it as #4899 in our defect database. We’ll notify you here in the thread when it is fixed.

Even without looking at the conversion results, I can say that Aspose.Pdf doesn’t support diagonal borders (your issue 2). We have discussed this with the Aspose.Pdf team, and they plan to implement diagonal borders in the future. The remaining issues should be investigated once #4899 is fixed. It seems that some of them have already been fixed.

Regards,

Hi,

The version of Aspose.Words.dll is 4.3.0.0 and that of Aspose.Pdf is 3.6.1.0.

We never got an out of memory exception while converting the file. It may be a bug introduced in a newer version of the DLL.

I am attaching the PDF output file for your reference.

Thanks for looking into this issue. I look forward to any updates.

Hello!

Thank you for supplying the additional materials.

I’m sorry. This appears to be a regression in Aspose.Words: one of the previous versions opens this document, but 5.1.0 throws an exception. The library changes constantly during development, and regressions like this can slip in. We’ll do our best to resolve this defect. I’ll comment on your points in the original order.

1) Some text and lines become bold even though they are not so in the original word file.

I don’t see any text that becomes bold accidentally. What is bold in the DOC also comes out bold in the PDF. Please clarify which pieces of text you mean. (I don’t know this language, so it could be difficult for me to find them.)

Some table borders really do come out bolder than they are in the source. This is a known issue, and it is fixed in the latest versions of Aspose.Words and Aspose.Pdf. Once #4899 is fixed and you can use those versions, the borders should be okay.

2) The diagonal line in the form (別紙様式3), Page 3, last table is lost after conversion to Pdf.

As I wrote, diagonal borders are not supported by Aspose.Pdf. In the Aspose.Words defect database this is #4907, and in Aspose.Pdf it is PDFNET-4684. Note that Aspose.Pdf has to implement this first.

3) The left most vertical line of the table in Form (別紙様式4) is lost on some pages in the Pdf.

This is an issue in Aspose.Pdf. Sorry, I don’t know its issue number, but the Aspose.Pdf team is working on it. The problem shows up when a table cell spans more than one page vertically.

4) An extra row gets added to the end of each table in the Pdf.

These rows are also present in the source document, so their appearance in the PDF is expected. They just look a little taller, and at the moment I cannot say exactly why. If you don’t want them there, you can remove those rows from the document. We support double borders natively. Are these rows intended to simulate double borders?

5) In form (別紙様式第7) the orientation for the vertical text, 医薬品の名称, changes and it gets rotated.

I see one cell in that table with vertical text direction. Both Aspose.Words and Aspose.Pdf are aware of this feature: we output the appropriate attribute, and Aspose.Pdf renders vertical text. I don’t know the rules for how these characters should be written vertically, but I can see that they are rotated compared with the original. Please discuss this with the Aspose.Pdf team. If this is a defect, it should be fixed in Aspose.Pdf.

Regards,

Hi,

Thanks for the reply. Following are my comments for the points in the same order:

1) On the first page of the Word file there is a table with text in all of its rows. That text is not bold at all, but the same text appears bold in the PDF file.

2) and 3) I will keep track of these issues here using the issue numbers you have mentioned. But how do I find out whether an issue has been fixed and is included in the latest release?

4) To show the extra row in the tables, I am attaching a picture in which the extra row is highlighted with a red box.

5) How do I contact the Aspose.Pdf team? Do I need to go to a different forum? I thought the issue would be addressed in this forum.

Thanks

Hello!

I checked this table again and can confirm that nothing is bold in the PDF. If anything looks bold, it may be specific to the font applied. Please note that a PDF will never be a 100% exact match of the source document, for many reasons. One of them is that fonts with the same names have different typefaces in Adobe. This is trivially reproduced with popular fonts (Arial, Times New Roman) and Latin letters. Fonts for Japanese characters can have more specific differences. I can suggest either experimenting with different fonts or consulting the Aspose.Pdf team, since they know more about PDF processing and Adobe applications.

I see the row highlighted in red. That’s exactly what I wrote about: these rows are present in the source document. To check this, open the document in MS Word, navigate to that place and set the view scale to 500%. Now you can snap to each border line separately with the mouse cursor. Do you see how the cursor changes for resizing them? You can also view or change the properties of these artificial rows. Please let us know what is wrong with them and what you’d like to change: remove them completely, replace them with a double border on the previous rows, or anything else.

You don’t need to remind the Aspose.Pdf team, since I already did this yesterday. If they don’t reply for a long time, you can post again in this thread. Even though the thread was moved to the Aspose.Words forum, notifications will still be sent to any Aspose.Pdf developers who have posted here at least once.

Regards,

Hi,

I am a developer on the Aspose.Pdf team.
First, we need the Aspose.Words fix for the out of memory problem before we can investigate the issues related to Aspose.Pdf. Viktor, please send us an update once it is ready. Thanks.
I will comment on the Aspose.Pdf-related issues below.

2) The diagonal line in the form (別紙様式3), Page 3, last table is lost after conversion to Pdf.

Diagonal borders are not supported by Aspose.Pdf at present. We plan to support them next month. The log ID is PDFNET-4684.


3) The left most vertical line of the table in Form (別紙様式4) is lost on some pages in the Pdf.

We will check this once the Aspose.Words fix is available.

5) In form (別紙様式第7) the orientation for the vertical text, 医薬品の名称, changes and it gets rotated.

This seems to be a known issue which has been fixed in Aspose.Pdf 3.6.2.2.

I think issues 3 and 5 can be fixed in a short time. Issue 2 is a new feature and may take about 20 days to implement. We will send you an update here as soon as we finish it. Thanks.


Best regards.



Hello!

#4899 has been fixed in the current codebase. Please expect correct behavior in the very next release. I asked Hans to check whether everything else is okay with this document.

Regards,

Hi,


I have got the fix from Viktor and was able to check those issues.

2) The diagonal line in the form (別紙様式3), Page 3, last table is lost after conversion to Pdf.

It has been logged as PDFNET-4684 and will be fixed next month.


3) The left most vertical line of the table in Form (別紙様式4) is lost on some pages in the Pdf.

I have tested it with our latest version and can't reproduce the error. It appears to have been fixed.

5) In form (別紙様式第7) the orientation for the vertical text, 医薬品の名称, changes and it gets rotated.

The problem is still there. I opened the doc with MS Word and found that the text is set to be vertical. The problem is that MS Word supports more kinds of rotation styles than Aspose.Words, and this style is not supported in Aspose.Pdf. As a workaround, please remove the rotation attribute and break the text manually, line by line (one word per line). Sorry for the inconvenience.
We will send you an update here once we have finished PDFNET-4684. Thanks.
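
The manual workaround for issue 5 above could also be scripted when many cells need it. The following is only a minimal sketch in plain Python and does not use any Aspose API. The helper name `stack_vertically` is hypothetical, and it assumes that for Japanese text "one word per line" means one character per line. The resulting string would still have to be written into the table cell by whatever means your application uses, with the rotation attribute removed.

```python
def stack_vertically(text: str) -> str:
    """Return `text` with one character per line, skipping whitespace.

    Assumption: each character is placed on its own line to imitate
    vertical text once the cell's rotation attribute has been removed.
    """
    return "\n".join(ch for ch in text if not ch.isspace())


if __name__ == "__main__":
    # Label taken from form 別紙様式第7 in this thread.
    print(stack_vertically("医薬品の名称"))
```

Note that stacking characters this way does not reproduce true vertical typesetting (for example, long-vowel marks and some punctuation are drawn differently in vertical Japanese text), so the output should be checked against the official template.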

Best regards.

The issues you have found earlier (filed as 4899) have been fixed in this update.

Thanks a lot for all your help.

Can you also post the link to the latest Aspose.Pdf that contains the fix for the "vertical line being lost during conversion" issue, so that I can download it and verify here that everything is working now?

Hello!

You can try using the DLL posted here:

This version (3.6.2.8) is preliminary, but it was uploaded for another customer by the Aspose.Pdf for .NET project manager and can be trusted. At the moment it is the most recent version of Aspose.Pdf.

Regards,