Free Support Forum - aspose.com

PDF Convert loses information that is in Word document

After posting this on the Aspose.Pdf forum, I was told to post it here. So here is the problem, along with their information regarding what they found.

Problem: We have converted some documents to PDF using Aspose.Words and Aspose.Pdf, and the PDF seems to be missing some of the text that is in the Word document. When using another PDF writer, the output comes out the same as the Word document.

What they found: I have tested the code and was able to reproduce the error. This is caused by incorrect values for First Line Indent and Left Indent in the XML generated by Aspose.Words. Please post this problem on the Aspose.Words forum so that they can look into it. To avoid this for now, please set the indents as I have done in the attached file.

I will attach the start file for you.

I can send the output from Aspose.Words/Aspose.Pdf and the output from another PDF writer so that you can see the difference. Just give me your email and I will send it, since I can only attach one file here.

Basically you will notice that in the Aspose output the Market Area on the top right is missing.

Hello Michael.

Thank you for your inquiry.

I have tried converting this document and reproduced the issue. We’ll take it into consideration. But I would like to note that the header of page 1 could be formatted using a better approach. In particular, we can use table-based formatting and put the elements such as images and text in separate table cells. This is usually very stable in any conversion. If you would like me to refactor this document, please give me your e-mail address so I can send the result back privately.

If a subject was discussed in another thread, including on other forums, you can just put a link here rather than re-posting what was found and suggested.

Regards,

I appreciate the note that it could be formatted much better. Believe me, I totally agree, and I have explained that to the department generating these documents. Going forward, we are working with them to try to do so. The challenge, though, is that we have over 20,000 documents that are already set up, and as we go through and import them into our system (ad hoc, as we need to), it will take far too long to reformat each one. In the end, the approach we are looking for with the PDF conversion is more of a WYSIWYG. I appreciate your thoughts. Any additional help would be very much appreciated.

Hello Michael.

These circumstances change the problem definition. The best fix cannot involve changing the documents by hand. Maybe we could change them programmatically in batch mode. The formatting is really “advanced”, so a full fix could be very difficult and will need cooperation with the Aspose.Pdf team.

I have found the corresponding thread in the Aspose.Pdf forum and asked the Aspose.Pdf team for their expertise. Here is the link to the original thread in case we need it: http://www.aspose.com/Community/Forums/thread/106435.aspx.

Are the 20,000 existing documents similar with regard to this issue? I mean, do they all have only this formatting problem in the first-page header, and are these headers similar enough to be described by a few parameters? If we go with automated refactoring, then we need to formalize the task: identify the potential differences, such as date, promo def, market area, etc. Then we can make a template of the first-page header, replace the existing headers with it, and substitute some elements programmatically for every document. Yes, that’s not easy, but I’ll help you. Also, let’s discuss your alternatives and see what the Aspose.Pdf team can advise.
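To make the template idea a bit more concrete, here is a minimal sketch in plain Python. It is not Aspose code and it does not touch Word files at all; it only shows the "one header template plus per-document parameters" approach. The field names (date, promo_def, market_area), the text layout of the header, and the helper names are all assumptions made up for illustration.

```python
from dataclasses import dataclass, asdict
from string import Template

# Hypothetical layout of the first-page header: one row, three cells.
# In a real document these would be table cells, not a text line.
HEADER_TEMPLATE = Template(
    "| $promo_def | Date: $date | Market Area: $market_area |"
)


@dataclass
class HeaderFields:
    """The values assumed to differ between documents."""
    date: str
    promo_def: str
    market_area: str


def render_header(fields: HeaderFields) -> str:
    """Fill the shared template with one document's values."""
    # substitute() raises KeyError if a placeholder has no value,
    # so an incomplete set of fields fails loudly instead of silently.
    return HEADER_TEMPLATE.substitute(asdict(fields))


def render_batch(documents: dict[str, HeaderFields]) -> dict[str, str]:
    """Render a header for every document, keyed by file name."""
    return {name: render_header(f) for name, f in documents.items()}
```

The point of the design is that the header layout lives in one place, so the per-document work is reduced to extracting a handful of values. Extracting those values from the existing headers and writing the new header back into each document would still have to be done with the document library and is not shown here.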

Have a nice day,

Hi,

I have logged it as PDFNET-4202. We will fix it and give you an update in about one week.

Best regards.

Excellent. I will look forward to hearing from you soon.

Anything new on this? It has been about two weeks (I know there were holidays in there, so I waited a little longer to ask).

Hello!

This will be fixed in Aspose.Pdf, as they promised. I’ll write them an e-mail to remind them.

Thank you,

Hi,

Sorry for the delay. We are still working on it, and it has top priority. I will ask our developer to give you an ETA soon.

Best regards.

Hi gardmica,
I hope we can fix the bugs in about one week. Sorry for any inconvenience caused by these problems.
Thanks.

What is the status on this issue? We have had to roll out to production already without the use of your product.

Hi,

Sorry for the delay on your issue. We are working hard on your problems. It seems that this problem relates to some other issues. We will get these resolved and give you a fix soon.

Best regards.

Hi,

The bug has been fixed. Please try the attachment before we publish the new hotfix. Thank you very much for your patience.

Best regards.