I want to iterate over the text fragments collected by calling “TextFragmentAbsorber.visit”. This works for most files, but some files cause an OutOfMemoryError:
Caused by: java.lang.OutOfMemoryError: Java heap space
at com.aspose.pdf.internal.l2u.l0h.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lI(Unknown Source)
at com.aspose.pdf.internal.l2u.lu.lI(Unknown Source)
at com.aspose.pdf.OperatorCollection.lb(Unknown Source)
at com.aspose.pdf.OperatorCollection.ld(Unknown Source)
at com.aspose.pdf.OperatorCollection.size(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.le(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.TextFragmentAbsorber.visit(Unknown Source)
The source:
// The second argument enables regular-expression search (TextSearchOptions(true)).
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("[\\S ]+",
        new TextSearchOptions(true));
textFragmentAbsorber.visit(pdf);
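For reference, here is a minimal sketch of what I expect the pattern "[\\S ]+" to select. It uses plain java.util.regex rather than Aspose, so it only approximates the absorber's behaviour (the Aspose regex engine may differ in details). The class name PatternDemo and the sample string are made up for illustration. The pattern matches maximal runs of non-whitespace characters and spaces, so in practice it should pick up essentially every run of text on a line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternDemo {
    // Same pattern string as passed to TextFragmentAbsorber:
    // one or more characters that are either non-whitespace or a plain space.
    static final Pattern PATTERN = Pattern.compile("[\\S ]+");

    // Collects every match of the pattern in the given text, in order.
    static List<String> matches(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = PATTERN.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample: two lines, the second one containing a tab.
        // Newlines and tabs are whitespace other than a space, so they split matches.
        System.out.println(matches("Hello world\nfoo\tbar"));
        // prints [Hello world, foo, bar]
    }
}
```

So with this pattern nearly all text in the document ends up in some fragment, which might be relevant to the memory usage, although the stack trace alone does not confirm that.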
Environment: Java 14 (OpenJDK), aspose-pdf 21.5.
test1.pdf (1.6 MB)