Highlighting text in PDF File

I need to highlight specific text in a PDF file, but I can’t load the file into memory anymore. Is there a way to do it using the facade classes, without instantiating com.aspose.pdf.Document?

For now, we are using something like this:

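// Search for the phrase; TextSearchOptions(true) treats "xxx" as a regular expression
TextFragmentAbsorber textAbs = new TextFragmentAbsorber("xxx", new TextSearchOptions(true));
Document doc = getDoc(); // our own helper that loads the PDF
doc.getPages().accept(textAbs);
// Highlight every match by setting a yellow background
for (TextFragment frag : textAbs.getTextFragments()) {
    frag.getTextState().setBackgroundColor(Color.getYellow());
}
textAbs.reset();
saveDoc(doc); // our own helper that saves the PDF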

@ipm.aspose

Do you mean you need to load the document in memory to highlight the text? You can simply initialize the Document object with a file stream. Could you explain the scenario in a bit more detail so that we can proceed accordingly?

Actually, we need to do the operations without loading the document in memory. I was looking at the Aspose documentation, and the facade classes seem to exist for this purpose.

The code I posted is how we do it now, but we can’t handle big PDF files because Aspose consumes a lot of memory to maintain a Document object. Is there another way to do it?

@ipm.aspose

To reduce memory consumption, you can use TextFragmentAbsorber at the page level. So, instead of doc.getPages().accept(textAbs);, you can do something like the following:

for (Page page : doc.getPages()) {
    // A fresh absorber per page, so matches are held for one page at a time
    TextFragmentAbsorber textAbs = new TextFragmentAbsorber("xxx", new TextSearchOptions(true));
    page.accept(textAbs);
    for (TextFragment frag : textAbs.getTextFragments()) {
        frag.getTextState().setBackgroundColor(Color.getYellow());
    }
}

But to get the pages, I would still need to have the Document instantiated. We want to avoid instantiating it by using a class like Stamp (see the Aspose.PDF for Java API Reference). Is that possible?

@ipm.aspose

It is not clear how you would highlight, or even find, text without initializing or loading the respective PDF document. Even the Facades classes need a document to be bound before they can perform any operations on it. Or perhaps we still do not understand what you are actually asking. Some more explanation from your side would be much appreciated.

It’s just that sometimes the Document is so big that we can’t handle it. I would need to load it page by page, or something like that, because we need to decrease the memory consumption. I thought it could be done using the Facade classes, so that I would be able to change just what I need, for example, put stamps on specific parts of the document. But when it comes to highlighting text, it really seems to be impossible.

@ipm.aspose

We can look into the memory consumption issue if you share a sample PDF document with us. We will log an investigation ticket in our issue tracking system and share the ID with you.

It’s not just one PDF. The problem is that sometimes we need to merge PDF files, and the result is a document of 100 MB or more. When it is loaded into a Document object, the memory consumption goes sky-high.

@ipm.aspose

Please share any one of the PDFs that cause the memory consumption issue at your end, along with the code snippet that you are using. We will log an investigation ticket and share the ID with you.
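
When sharing the code snippet, it may help to attach a rough number for the heap growth as well. The following is only a minimal sketch using the standard Java Runtime API, not anything provided by Aspose: the names HeapProbe, usedHeapBytes, and measureGrowthMb are hypothetical, and the byte array in main is a stand-in for the real load step (for example, the getDoc() call from the first post). The figures are approximate, because Runtime.gc() is only a hint to the JVM.

```java
import java.lang.ref.Reference;
import java.util.function.Supplier;

// Hypothetical helper for estimating how much heap one step retains.
class HeapProbe {

    // Approximate used heap in bytes, after asking the JVM to collect garbage.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // a hint only; the JVM may ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    // Runs the step and returns the approximate heap growth in MB
    // while the step's result is still reachable.
    static <T> long measureGrowthMb(Supplier<T> step) {
        long before = usedHeapBytes();
        T result = step.get();
        long after = usedHeapBytes();
        // Keep the result alive until after the second measurement (Java 9+).
        Reference.reachabilityFence(result);
        return (after - before) / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Stand-in workload: a 16 MB array. Replace the lambda with the
        // real step, e.g. the code that loads the merged PDF.
        long mb = measureGrowthMb(() -> new byte[16 * 1024 * 1024]);
        System.out.println("approx. heap growth: " + mb + " MB");
    }
}
```

Running the same measurement for the whole-document version and for the page-level version would give comparable numbers to include in the ticket.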