We have purchased licenses for Aspose.Word and Aspose.Pdf, and we are in the process of evaluating Aspose.Pdf.Kit.
We used the evaluation version of Aspose.Pdf.Kit to extract text from a PDF and display it in an RTE (Rich Text Editor) and in a plain text box. We noticed that the extracted text does not retain its formatting.
Is it possible to retain formatting? We would need the HTML version of the extracted text.
The documentation for Aspose.Pdf.Kit makes no mention of retaining formatting when text is extracted.
I also noticed from the documentation for Aspose.Recognition (for .NET) that it converts a PDF to a Word document, which would then allow us to extract HTML from the Word document using Aspose.Word. However, this depends on a .NET environment, and our production environment is Unix/Java.
Is there any other Aspose product that we can use to get this functionality?