Hello,
We’re seeing a strange issue when performing a match and replace on text in one particular PDF. We run hundreds of PDFs a day through this code, but earlier this week we hit a PDF that exhausts all heap space, even with the heap set to 58GB, although the PDF itself is only around 80MB. We’ve manually reviewed the PDF in Adobe Acrobat Pro and nothing stands out as wrong with this file.
I’ve tried chasing this issue with VisualVM, and from what I can see the memory exhaustion is caused by a call to Page.accept() and TextFragmentAbsorber.visit(). This all happens in a loop, and the resources for each page should be freed once it has been scanned.
I’m including with this ticket a source file that contains just the methods being called. If more is needed, please let me know.
High-level code overview:
LiveTextHandler.handle() ← entry point
  Page page : pagesList ← for loop
    LiveTextHandler.searchAndReplace()
      LiveTextHandler.getSearchResults(pageNumber, searchString)
        TextFragmentAbsorber
        Page.accept(textFragmentAbsorber)
    HandlerHelpers.closeAndCleanupPage(page) ← helper method to clean up page to free resources
      page.close()
      page.freeMemory()
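
In case it helps narrow this down, here is a minimal sketch of how the change in used heap could be logged for each pass of the loop outlined above, to see which page triggers the growth. It uses only the standard java.lang.Runtime API. The class name PageHeapProbe, the probe() method, and the pageWork callback are hypothetical names for illustration and are not part of our code or of Aspose; in practice pageWork would wrap the per-page search/replace and cleanup calls. The numbers are approximate because garbage collection timing affects them.

```java
import java.util.function.IntConsumer;

public class PageHeapProbe {

    /** Approximate heap currently in use, in bytes (affected by GC timing). */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Runs pageWork for page numbers 1..pageCount and records the change in
     * used heap around each call. deltas[i] belongs to page number i + 1.
     */
    public static long[] probe(int pageCount, IntConsumer pageWork) {
        long[] deltas = new long[pageCount];
        for (int pageNumber = 1; pageNumber <= pageCount; pageNumber++) {
            long before = usedHeapBytes();
            pageWork.accept(pageNumber); // hypothetical: search/replace + cleanup for this page
            long after = usedHeapBytes();
            deltas[pageNumber - 1] = after - before;
            System.out.printf("page %d: heap delta %d KB%n",
                    pageNumber, deltas[pageNumber - 1] / 1024);
        }
        return deltas;
    }

    public static void main(String[] args) {
        // Stand-in workload only: allocates 1 MB per "page" instead of touching a real PDF.
        probe(3, pageNumber -> {
            byte[] scratch = new byte[1024 * 1024];
            scratch[0] = 1;
        });
    }
}
```

A page whose delta is far larger than the others, or deltas that never drop back after cleanup, would point at where the memory is being retained.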
Library versions
aspose.barcode: 24.6
aspose.pdf: 23.1
aspose-visualvm.png (277.4 KB)
sample.java.zip (2.4 KB)
Thank you for your time.