I want to iterate over the text fragments that I get by calling "TextFragmentAbsorber.visit". It works in many cases, but some files cause an OutOfMemoryError:
Caused by: java.lang.OutOfMemoryError: Java heap space
at com.aspose.pdf.internal.l2u.l0h.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lI(Unknown Source)
at com.aspose.pdf.OperatorCollection.lb(Unknown Source)
at com.aspose.pdf.OperatorCollection.ld(Unknown Source)
at com.aspose.pdf.OperatorCollection.size(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.le(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.TextFragmentAbsorber.visit(Unknown Source)
The source:
// Regex search (second argument true): match runs of non-whitespace characters and spaces
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("[\\S ]+",
        new TextSearchOptions(true));
textFragmentAbsorber.visit(pdf);
Could you please try version 21.6 of the API? If the issue still persists, please share some more details, such as the OS name and version, and we will assist you further.
We have tested the scenario using Java 14 on Windows and could not reproduce the error. However, we are preparing an environment to test the case under macOS and will get back to you in a while.
We cannot reproduce the issue on macOS Big Sur with OpenJDK 14.0.2. You may try increasing the heap size to avoid out-of-memory errors, because this does not seem to be an issue with the API.
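Before raising the limit, it may help to confirm how much heap the JVM actually has. The following is a minimal sketch using only the standard `Runtime` API; the class name `HeapCheck` and its helper methods are hypothetical names chosen for illustration and are not part of Aspose.PDF.

```java
// Hypothetical helper (not part of Aspose.PDF): reports JVM heap limits
// using only the standard java.lang.Runtime API.
public class HeapCheck {

    private static final long MIB = 1024L * 1024L;

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    // Heap currently in use (allocated minus free), in MiB.
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("Max heap:  " + maxHeapMiB() + " MiB");
        System.out.println("Used heap: " + usedHeapMiB() + " MiB");
    }
}
```

If the reported maximum is small, the limit can be raised with the standard `-Xmx` JVM option when starting the application, for example `java -Xmx2g ...`. The value `2g` is only an illustrative assumption; a suitable size depends on the documents being processed.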
I have been able to reproduce the issue on our end. A ticket with ID PDFJAVA-40645 has been logged in our issue tracking system for further investigation. This thread has been linked to the ticket so that you will be notified once the issue is fixed.
We are afraid that we cannot share a workaround yet because the ticket has not yet been fully investigated. Nevertheless, we will inform you as soon as we have an update in this regard. Please give us some time.