I’m using Aspose.Pdf.dll 18.104.22.168 to extract text from the attached PDF document.
The extraction delivers a correct result, but it’s very slow and consumes a lot of memory. It seems that it doesn’t ignore the embedded pictures, which slows down the whole process.
I know the PDF document isn’t well built. It was created by a GIS system and could be constructed more efficiently. But our customers will produce a lot of documents in this style in the near future.
So maybe you can tune the PDF extraction routine a little bit.
Best regards, Martin
Hi Martin,
Thank you for sharing the template file.
I have tested your scenario and you are right: extracting the text from this PDF file takes a long time. I have logged the problem in our issue tracking system as PDFNEWNET-33809 so that our development team can investigate it further. I will post any updates in this forum thread.
Sorry for the inconvenience,
Hi, I have the same problem. Do you have any news?