Thanks for the reply.
I am including a test Java class for your convenience, along with a PDF document containing representative text from our use case. Please note that a typical string contains text from multiple Unicode scripts (Chinese, Japanese, Korean, etc.).
input-cjk.pdf (250.9 KB)
ResetSegmentDemo.zip (854 Bytes)
In our tests, a typical string from our use case maps to one TextFragment, which in turn contains 40 to 200 TextSegments.
As mentioned before, we are forced to operate at the TextSegment level, because calling setText on the TextFragment does not keep its TextSegmentCollection intact.
The problem is that setting the text of a single TextSegment to an empty string takes 15 ms. With an average of 100 TextSegments per TextFragment, that is 100 * 15 ms = 1.5 seconds per string, which is prohibitively slow for our use case, where a single PDF can contain 1,000 to 16,000 such strings.
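To put the numbers above in perspective, here is a small back-of-the-envelope calculation. It only multiplies the figures quoted in this post (15 ms per segment, 100 segments per string on average, 1,000 to 16,000 strings per PDF). The class and method names are made up for illustration and are not part of any library.

```java
public class SetTextCostEstimate {
    // Time in milliseconds to clear every segment of one string (one TextFragment).
    public static long millisPerString(int segmentsPerString, long millisPerSegment) {
        return segmentsPerString * millisPerSegment;
    }

    // Time in milliseconds to clear every segment of every string in one PDF.
    public static long millisPerDocument(int strings, int segmentsPerString, long millisPerSegment) {
        return strings * millisPerString(segmentsPerString, millisPerSegment);
    }

    public static void main(String[] args) {
        long millisPerSegment = 15; // measured cost of one setText("") call, from our tests
        int avgSegments = 100;      // average segments per string, from our tests

        // Lower and upper bound on the number of strings in a single PDF.
        for (int strings : new int[] {1000, 16000}) {
            long totalMillis = millisPerDocument(strings, avgSegments, millisPerSegment);
            System.out.printf("%d strings: %d s (%.1f min)%n",
                    strings, totalMillis / 1000, totalMillis / 60000.0);
        }
    }
}
```

With these inputs, a single PDF would take somewhere between 25 minutes (1,000 strings) and 400 minutes (16,000 strings) just to clear the segment text.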
As mentioned in my previous comment, to address our performance concerns we are looking for:
- An improvement in the performance of the setText method on TextSegment
AND / OR
- A new method in the TextFragment API, resetTextSegmentsText(), which keeps the TextSegmentCollection of the TextFragment intact while emptying the text of the segments.
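To make the second request concrete, here is a minimal sketch of the behaviour we have in mind. The Segment and Fragment classes below are simplified stand-ins written for this post, not the real library classes, and resetTextSegmentsText() is the method we are proposing, not one that exists today. The point is only the contract: the number of segments stays the same, and every segment ends up with empty text.

```java
import java.util.ArrayList;
import java.util.List;

public class ResetSegmentsSketch {
    // Hypothetical stand-in for a text segment; NOT the library's TextSegment.
    public static class Segment {
        private String text;
        public Segment(String text) { this.text = text; }
        public String getText() { return text; }
        public void setText(String text) { this.text = text; }
    }

    // Hypothetical stand-in for a text fragment; NOT the library's TextFragment.
    public static class Fragment {
        private final List<Segment> segments = new ArrayList<>();

        public Fragment(String... parts) {
            for (String part : parts) {
                segments.add(new Segment(part));
            }
        }

        public List<Segment> getSegments() { return segments; }

        // Proposed behaviour: empty each segment's text, keep the collection itself intact.
        public void resetTextSegmentsText() {
            for (Segment segment : segments) {
                segment.setText("");
            }
        }
    }

    public static void main(String[] args) {
        Fragment fragment = new Fragment("中文", "日本語", "한국어");
        int countBefore = fragment.getSegments().size();

        fragment.resetTextSegmentsText();

        // The segment count is unchanged; only the text has been cleared.
        System.out.println("segments before: " + countBefore
                + ", after: " + fragment.getSegments().size());
    }
}
```

How the real implementation would achieve this efficiently is of course up to you; we only care that the segments survive the reset so that we can write new text into them afterwards.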