Extract PDF text section-wise / by bookmark

panthagani · January 6, 2009, 3:40am

I am trying to build an application that splits a PDF file into small parts, extracts the text of each part, and saves it in a database with versioning, so that only the required section needs to be edited instead of going through the whole file.

codewarior · January 6, 2009, 5:39am

Hi,

Thanks for considering Aspose.

To add to our earlier conversation: Aspose.Pdf.Kit extracts the text of a PDF file as a whole, so you need to identify the sections programmatically and then save each chunk of text into the database.
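As a rough illustration of the "identify the sections programmatically" step, here is a minimal Python sketch. It does not call Aspose.Pdf.Kit at all; it assumes the whole text has already been extracted into a string. The heading pattern (lines such as "1. Overview" or "2.3 Scope") and the names HEADING, split_into_sections and "(preamble)" are assumptions made for this example, so adjust them to whatever marks a section in your own documents.

```python
import re

# Assumption: a section heading is a line that starts with a number such as
# "1." or "2.3", followed by a title. Body lines that happen to start with a
# number (e.g. "2009 was ...") would also match, so tune this for real files.
HEADING = re.compile(r"^\d+(?:\.\d+)*\.?[ \t]+\S.*$", re.MULTILINE)


def split_into_sections(text):
    """Return a list of (heading, body) pairs found in already-extracted text."""
    matches = list(HEADING.finditer(text))
    sections = []

    # Text before the first heading is kept under a hypothetical "(preamble)" label.
    first_start = matches[0].start() if matches else len(text)
    preamble = text[:first_start].strip()
    if preamble:
        sections.append(("(preamble)", preamble))

    # Each section body runs from the end of its heading to the next heading.
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        sections.append((match.group(0).strip(), body))
    return sections
```

Each returned pair can then be treated as one chunk. If the document has no recognisable heading lines, the whole text comes back as a single "(preamble)" chunk.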

For information on how to extract text from a PDF file, please visit http://www.aspose.com/documentation/file-format-components/aspose.pdf.kit-for-.net-and-java/extract-text-from-pdf-document.html

For information on working with a database, please visit http://www.aspose.com/documentation/file-format-components/aspose.pdf.kit-for-.net-and-java/interoperate-with-database-net.html
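For the versioning part, one possible layout is a table that keeps one row per document, section heading and version. The sketch below uses Python's built-in sqlite3 module only to show the idea; it is not the approach described on the linked page, and the table name section_versions and the helpers open_store, save_section and latest_section are hypothetical names invented for this example. It assumes each section is already available as a heading and a body string.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the store; the schema here is an assumption for this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS section_versions (
               doc_id   TEXT    NOT NULL,
               heading  TEXT    NOT NULL,
               version  INTEGER NOT NULL,
               body     TEXT    NOT NULL,
               PRIMARY KEY (doc_id, heading, version)
           )"""
    )
    return conn


def save_section(conn, doc_id, heading, body):
    """Store body as a new version unless it equals the latest one; return the version."""
    row = conn.execute(
        "SELECT version, body FROM section_versions "
        "WHERE doc_id = ? AND heading = ? ORDER BY version DESC LIMIT 1",
        (doc_id, heading),
    ).fetchone()
    if row is not None and row[1] == body:
        return row[0]  # unchanged section: keep the existing version
    version = 1 if row is None else row[0] + 1
    conn.execute(
        "INSERT INTO section_versions (doc_id, heading, version, body) "
        "VALUES (?, ?, ?, ?)",
        (doc_id, heading, version, body),
    )
    conn.commit()
    return version


def latest_section(conn, doc_id, heading):
    """Return the newest stored body for a section, or None if it is unknown."""
    row = conn.execute(
        "SELECT body FROM section_versions "
        "WHERE doc_id = ? AND heading = ? ORDER BY version DESC LIMIT 1",
        (doc_id, heading),
    ).fetchone()
    return row[0] if row else None
```

With this layout, re-importing a file only adds rows for sections whose text actually changed, and older versions stay available for comparison.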

NOTE: This is a beta version of PdfExtractor. Some features may not be well supported, and we may not be able to fix them quickly. Extraction of non-English text is also not yet supported.