Hi,
I've read the tutorial on how to search for text on all pages of a PDF document.
Searching the forum, I also found how to use a regular expression to search while ignoring case.
Now, when I find a term, I would like to extract the entire paragraph that contains it, not only the single word. This is my current code:

    Document pdfDocument = new Document("document.pdf");

    TextFragmentAbsorber absorber = new TextFragmentAbsorber("(?i)stringtosearch", new TextSearchOptions(true));

    pdfDocument.getPages().accept(absorber);

    TextFragmentCollection collection = absorber.getTextFragments();

    for (TextFragment fragment : (Iterable<TextFragment>) collection)
    {
        for (TextSegment segment : (Iterable<TextSegment>) fragment.getSegments())
        {
            System.out.println("Text: " + segment.getText());
        }
    }

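
To make the result I'm after clearer, here is a rough sketch on plain strings only. It does not use the Aspose.PDF API: ParagraphSearch and paragraphsContaining are hypothetical names, and it assumes the page text is already available as a String with paragraphs separated by blank lines, which may not hold for every PDF.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.PDF: it works on plain text only.
class ParagraphSearch {

    // Returns every paragraph of `text` that contains `term`, ignoring case.
    // Assumption: paragraphs are separated by one or more blank lines.
    static List<String> paragraphsContaining(String text, String term) {
        // Pattern.quote treats the term literally; CASE_INSENSITIVE plays the
        // same role as the "(?i)" prefix (ASCII letters only by default).
        Pattern needle = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        // Split where a line break is followed by optional whitespace and another line break.
        for (String paragraph : text.split("\\R\\s*\\R")) {
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty() && needle.matcher(trimmed).find()) {
                hits.add(trimmed);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "First paragraph about invoices.\n\n"
                + "Second paragraph mentions StringToSearch\nacross two lines.\n\n"
                + "Third paragraph.";
        for (String p : paragraphsContaining(text, "stringtosearch")) {
            System.out.println("Paragraph: " + p);
        }
    }
}
```

What I can't work out is how to get from a matched TextFragment to the equivalent of one of these paragraphs inside the PDF.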