We have been evaluating your software for converting PDF to XML and have found some problems. We have a collection of about 1300 PDF files and wrote a console application that batch-converts all of them. Some files throw System.OutOfMemoryException or System.StackOverflowException, and others produce the following exceptions:
- Error: Value of '0' is not valid for 'emSize'. 'emSize' should be greater than 0 and less than or equal to System.Single.MaxValue. Parameter name: emSize
- Error: Index was out of range. Must be non-negative and less than the size of the collection. Parameter name: index
We plan to deploy this on a web server that will receive PDF files uploaded by our customers via browser and then convert them to XML for our internal consumption. We cannot, and do not want to, choose which PDF files are accepted, so the software will have to handle anything the users upload.
I have selected all the problematic PDF files and zipped them into a package for you to look at.
We have another problem: when the conversion finishes, we use the resulting files in another program to get the sections, but we always receive a message that the document is unreadable.
Please let us know if you are willing to take a look, and whether an update can be made to broaden support for these currently problematic files. If not, I am afraid we will have to look elsewhere, as this is a key factor in our evaluation.