We have already used Aspose.Words to retrieve the images/tables along with the corresponding text.
We managed to do that because we can iterate section by section and paragraph by paragraph, and we can get the images under a paragraph using 'paragraph.GetChildNodes(Aspose.Words.NodeType.Shape, true)'.
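For reference, our Aspose.Words approach looks roughly like the sketch below. It is a simplified illustration rather than our production code: the file name "input.docx", the output file names, and the class name are placeholders, and it assumes the Aspose.Words for .NET package is referenced. It only walks paragraphs that are direct children of each section body, so paragraphs inside table cells would need a deeper traversal.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Drawing;

class ImageWithPrecedingText
{
    static void Main()
    {
        // "input.docx" is a placeholder path used only for this sketch.
        Document doc = new Document("input.docx");

        int imageIndex = 0;
        string precedingText = string.Empty;

        foreach (Section section in doc.Sections)
        {
            foreach (Paragraph paragraph in section.Body.Paragraphs)
            {
                // Collect shapes nested anywhere under this paragraph (isDeep = true).
                NodeCollection shapes = paragraph.GetChildNodes(NodeType.Shape, true);

                foreach (Shape shape in shapes)
                {
                    // Skip shapes that carry no image (text boxes, lines, etc.).
                    if (!shape.HasImage) continue;

                    imageIndex++;
                    string fileName = "image_" + imageIndex +
                        FileFormatUtil.ImageTypeToExtension(shape.ImageData.ImageType);
                    shape.ImageData.Save(fileName);

                    // Pair the image with the most recent paragraph that had text.
                    Console.WriteLine(fileName + " <- " + precedingText);
                }

                // Remember this paragraph's text for images in later paragraphs.
                string text = paragraph.ToString(SaveFormat.Text).Trim();
                if (text.Length > 0) precedingText = text;
            }
        }
    }
}
```

The key point is that the Word document model gives us paragraphs and shapes in reading order, so "the preceding paragraph" is simply the last non-empty paragraph visited before the image.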
Now our Word documents have to be replaced by PDF files, and we would like to retain the same functionality. I have gone through the Aspose.PDF documentation and the examples it provides, but I could not find a way to achieve this. Could you please let me know whether it is possible to retrieve images/tables (along with the corresponding text) from a PDF file?
Just an example:
We need to extract images and the corresponding text like this:
Document 1 : Paragraph 2 + Image 1
Document 2 : Paragraph 2 + Image 2
Document 3 : Paragraph 3 + Image 3
So basically we need to retrieve each image and its preceding paragraph. We have already done this using Aspose.Words, but we cannot find a way to do it in Aspose.PDF.
Thanks in advance.