Extract text for each page

Hi,

I need to get the text from each page so I can count words and characters per page. Is there a way to split the pages and get the text from each page correctly?
Please advise.

Regards,
Rapeepan

@rcomniscien

Yes, you can do this with Aspose.Words. Please use the PageSplitter utility to extract each page’s text; it splits the document’s pages into separate documents. You can get the latest code of this utility from the GitHub repository of Aspose.Words for .NET.

Once you have extracted a specific page of the document using the PageSplitter utility, please use the BuiltInDocumentProperties.Words property to get the word count of that page’s document.

Please use BuiltInDocumentProperties.Characters property to get an estimate of the number of characters in the document.
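
If the text of each page is already available as a plain string, the counts can also be computed directly, without relying on document properties. The following is a minimal sketch in plain Java, not part of Aspose.Words: the class name PageTextStats, its methods, and the Counts holder are hypothetical names chosen for illustration. It assumes words are separated by whitespace and that "characters" means non-whitespace characters, so its numbers may differ from the values reported by BuiltInDocumentProperties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Aspose.Words: counts words and
// characters in page text that has already been extracted as strings.
final class PageTextStats {

    // Simple holder for the counts of one page.
    static final class Counts {
        final int words;
        final int characters;

        Counts(int words, int characters) {
            this.words = words;
            this.characters = characters;
        }
    }

    // Assumption: a word is any run of non-whitespace characters.
    static int countWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    // Assumption: whitespace (spaces, tabs, line breaks) is not counted.
    static int countCharacters(String text) {
        return (int) text.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .count();
    }

    // Returns one Counts entry per page, in the same order as the input.
    static List<Counts> countPerPage(List<String> pageTexts) {
        List<Counts> result = new ArrayList<>();
        for (String page : pageTexts) {
            result.add(new Counts(countWords(page), countCharacters(page)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample page texts standing in for text extracted from a document.
        List<String> pages = Arrays.asList("First page text.", "Second page\nhas two lines.");
        List<Counts> counts = countPerPage(pages);
        for (int i = 0; i < counts.size(); i++) {
            Counts c = counts.get(i);
            System.out.println("Page " + (i + 1) + ": words=" + c.words
                    + ", characters=" + c.characters);
        }
    }
}
```

The list passed to countPerPage would hold the text of each page document produced by the splitting step, one string per page.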


@tahir.manzoor

My bad, I forgot to mention that I use Java.
I found BuiltInDocumentProperties in the API Reference for Java, but I cannot find PageSplitter. Is there something similar to this class?

Regards,
Rapeepan

@rcomniscien,

The Java equivalents of the ‘PageSplitter’, ‘DocumentPageSplitter’, ‘PageNumberFinder’, ‘SectionSplitter’ and related classes can be found at the following link:
