Free Support Forum - aspose.com

Does Aspose support text extraction from PDF files?

Folks,

I have looked at the Aspose.PDF API and I do not see any way to extract text from a PDF file. Am I just looking in the wrong place? I see the Pdf object but no way to get anything out of it.

I need to be able to:
a. extract textual content from PDF
b. extract metadata from PDF (standard and any custom fields)
c. support for I18N is required, most importantly German, Chinese, Japanese, Korean, as well as Hebrew and Arabic.
d. support for encrypted PDFs (assuming passwords are provided)
e. support for very large PDFs (e.g. 2 GB)

Please let me know.
Thanks.
- Dmitry

Hi Dmitry,

Thanks for considering Aspose.

Aspose.Pdf.Kit is a component for manipulating existing PDF files.

a) For information on how to extract textual content, please visit http://www.aspose.com/documentation/file-format-components/aspose.pdf.kit-for-.net-and-java/extract-text-from-pdf-document.html

b) For information on how to extract Metadata, please visit http://www.aspose.com/documentation/file-format-components/aspose.pdf.kit-for-.net-and-java/show-pdf-information.html

c) Regarding support for I18N, I am sorry to inform you that extracting non-English text is not supported. Also, extracting text from PDF files is currently not supported on 64-bit operating systems. For details, please see Features not supported.

d) For information on support for encrypted PDFs, please visit Decrypt PDF Document.

e) Regarding support for large PDF files, Aspose.Pdf.Kit currently has no size limitation. Please try our product, and if you run into any issues, feel free to share them.