Hi,
We are developing a product and are interested in buying the Aspose PDF toolkit and other packages (Word, presentation, etc.) to support analyzing those document types.
I downloaded the trial version and compared Aspose with Apache PDFBox (just extracting the plain text from the documents), but the results were not favourable for your product, in either processing time or robustness across different PDF versions.
On average, the Aspose PDF toolkit is about 20% slower than Apache PDFBox, and it throws an exception while parsing a normal PDF 1.6 document. Apache PDFBox handles all my test documents well.
As a commercial product, I believe your product should be better than open source libraries. I’d like to hear your response to our findings.
Thanks
Hi,
Thank you for considering Aspose and reporting the issues.
Although Aspose.Pdf.Kit aims to be one of the most versatile PDF toolkits, we are always glad to improve any existing function of our product. Could you please attach the PDF here so that we can investigate the issue in detail?
As for text extraction, we have just enhanced it in our .NET edition. Next, we will rewrite the text extraction code, as well as the image conversion and extraction features, for our Java edition, using the experience we have gained from .NET. This process will take several months.
Thanks,