TextFragment's setText to an EMPTY STRING alters the TextSegmentCollection of the Fragment


#1

Calling setText with an empty string on a TextFragment deletes all the TextSegments that constitute the TextFragment and leaves a single TextSegment containing an empty string.

If we instead iterate over the TextSegments and call setText on each one, performance takes a hit: each call takes around 20-30 ms, so a TextFragment containing 50 TextSegments takes roughly 50 * 20-30 ms = 1,000-1,500 ms.

The performance of setText on the TextFragment itself is good, but our use case requires the original TextSegments to remain intact after the call.

**Our ask:** Could a method (perhaps called resetSegmentCollectionText()) be provided on TextFragment that keeps the TextSegmentCollection intact while emptying the text of its TextSegments?
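To make the requested semantics concrete, here is a minimal, self-contained sketch. FragmentModel and SegmentModel are hypothetical stand-ins written for this post, not the Aspose.PDF classes, and resetSegmentCollectionText() is only the proposed name from the ask above; it does not exist in the library. The model's setText simply mimics the collapsing behavior described in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a text segment; NOT the Aspose.PDF TextSegment class.
class SegmentModel {
    String text;

    SegmentModel(String text) {
        this.text = text;
    }
}

// Hypothetical stand-in for a text fragment; NOT the Aspose.PDF TextFragment class.
class FragmentModel {
    final List<SegmentModel> segments = new ArrayList<>();

    FragmentModel(String... parts) {
        for (String part : parts) {
            segments.add(new SegmentModel(part));
        }
    }

    // Mimics the behavior reported in this thread: the existing segments
    // are dropped and a single segment holding the new text remains.
    void setText(String text) {
        segments.clear();
        segments.add(new SegmentModel(text));
    }

    // The requested behavior (proposed name only): every segment is kept,
    // and only its text is emptied.
    void resetSegmentCollectionText() {
        for (SegmentModel segment : segments) {
            segment.text = "";
        }
    }
}

class ResetSegmentSketch {
    public static void main(String[] args) {
        FragmentModel collapsed = new FragmentModel("a", "b", "c");
        collapsed.setText("");
        System.out.println(collapsed.segments.size()); // prints 1

        FragmentModel preserved = new FragmentModel("a", "b", "c");
        preserved.resetSegmentCollectionText();
        System.out.println(preserved.segments.size()); // prints 3
    }
}
```

The point of the second method is that the segment count is unchanged after the call, which is the property our use case depends on.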


#2

@athota

Thanks for contacting support.

Yes, you are right about this case. The API clears the TextSegmentCollection once its parent TextFragment is set to an empty string.

This requirement needs investigation, and we have to analyze whether it is feasible. Would you please provide a sample PDF document along with a sample code snippet? We will then log a ticket in our system and share its ID with you.


#3

Hi Asad

Thanks for the reply.

I am including a test Java class for your convenience, along with a PDF document containing representative text of the kind we generally have in our use case. Please note that a typical string contains text from multiple Unicode scripts (Chinese, Japanese, Korean, etc.).

input-cjk.pdf (250.9 KB)
ResetSegmentDemo.zip (854 Bytes)

From our tests, a string in our use case, which translates to one TextFragment, consists of 40-200 TextSegments.

As mentioned before, we are forced to operate at the level of TextSegments since we are not able to keep the TextSegmentCollection of the TextFragment intact when we do a setText on the TextFragment.

The problem with this is that it takes about 15 ms per TextSegment to set the text to an empty string. With 100 TextSegments on average per TextFragment, that is 100 * 15 ms = 1.5 seconds per string, which is prohibitively slow for our use case, which can involve 1,000-16,000 such strings in a single PDF.
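To put those figures in perspective, here is a small back-of-the-envelope calculation. It only multiplies the numbers quoted above (15 ms per call, 100 segments per string, 1,000-16,000 strings); the class and method names are made up for illustration, and the result is an estimate, not a measurement.

```java
import java.time.Duration;

// Illustrative helper; the names here are hypothetical and not part of any library.
class SetTextCostEstimate {

    // Total time if every segment of every string needs its own setText call.
    static Duration estimate(int strings, int segmentsPerString, long msPerSetText) {
        return Duration.ofMillis((long) strings * segmentsPerString * msPerSetText);
    }

    public static void main(String[] args) {
        // Figures quoted in this post: 100 segments per string, 15 ms per call.
        Duration low = estimate(1_000, 100, 15);
        Duration high = estimate(16_000, 100, 15);

        System.out.println("1,000 strings: " + low.toMinutes() + " min");   // prints "1,000 strings: 25 min"
        System.out.println("16,000 strings: " + high.toMinutes() + " min"); // prints "16,000 strings: 400 min"
    }
}
```

So a single PDF would take somewhere between about 25 minutes and well over six hours if we clear the text segment by segment.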

As mentioned in my previous comment, we are looking for

  • An improvement in the performance of the setText method on the TextSegment

AND / OR

  • A new method in the TextFragment API - resetTextSegmentsText() which keeps the TextSegmentCollection of the TextFragment intact while emptying the text of the segments.

to address our concerns with the performance.

Thanks
Aditya


#4

@athota

Thanks for providing the details.

We have logged an enhancement request as PDFJAVA-38627 in our issue tracking system to cover your requirements. We will look into the details of the ticket and investigate its feasibility. As soon as we make progress towards resolving the ticket, we will let you know. Please be patient and allow us some time.

We are sorry for the inconvenience.


#5

Thanks for the reply, Asad. Could you please let me know the timeline for getting this addressed?


#6

@athota

The ticket has been logged under the free support model, where issues have low priority and are resolved on a first-come, first-served basis. The resolution time depends on how many priority issues are in the queue. We will keep you posted if we make significant progress towards implementing the requested enhancement. Please allow us some time.

We are sorry for the inconvenience.


#7

Hi Asad,

What are the options for getting this expedited?
If it goes through the paid support model, what would be the expected timeline for a resolution? We would need at least a ballpark estimate.

We are currently using V18.5.


#8

@athota

Please note that paid support tickets are resolved sooner than free support tickets. As for an ETA for PDFJAVA-38627, we have logged your concerns and will update you as soon as any information is available.


#9

Thanks Farhan.

We would be very interested in seeing this addressed. We would appreciate it if you could share, as soon as you can, whether the fix is feasible once your dev team has done a preliminary investigation.

If the fix is feasible, we will look into the next steps for getting it addressed faster.


#10

@athota

Sure, we will let you know our feedback once the investigation of your requirements is complete. We have recorded your concerns and will consider them during the analysis. As soon as we have definite news, we will share it with you in this forum thread. Please allow us some time.