Hello, I am using the latest version of Aspose.PDF (19.4), but this issue has persisted since v16.11. Certain PDF pages cause an unhandled exception when we call either textFragmentAbsorber.Visit(page) or page.Accept(textFragmentAbsorber):
Index and length must refer to a location within the string.
Parameter name: length
This exception is thrown by System.String.Substring. Here is the full stack trace from the textFragmentAbsorber.Visit(page) call:
at System.String.Substring(Int32 startIndex, Int32 length)
at #=zyTB3XIyaBhej7MWu47CesmD$bfxg8HcchasVZf6ESKJX.#=zGt6m6yc97uXkcK1_Qw==(String #=zqJN9UNs=, #=zqF0Ar1gRucV_ #=ztMoGLK9J0q3b)
at #=zyTB3XIyaBhej7MWu47CesmD$bfxg8HcchasVZf6ESKJX.#=zZ$33QmbInEBLvABe7g==(String #=zqJN9UNs=, #=zqF0Ar1gRucV_ #=ztMoGLK9J0q3b)
at #=z41PQwlAXVF3C4xYRZatHFeTEH05ktVgHDtu8FONmp2teaBq5N9x8Ikc=.#=zSWCKZaoaiOR_(#=zOj0VlyKypTqDkT5xWZloVk7qovZYAlVrSw== #=zyCLMEEHDQfl1, Boolean #=zeUufYJyB1UihD2ZUxA==)
at #=zxxOyLC020CcmMkhiy4tXZ1svD7kXoIcfVQ==.#=z6YWdw7I=(Boolean #=zjFq8Ob86k9k_ObVCOg==)
at #=zxxOyLC020CcmMkhiy4tXZ1svD7kXoIcfVQ==.#=zMfeFBRpvOo5w()
at #=zGuSEinf51GwaWgcLur1eBvogNseW53R1TDBzD1VUTV4dC$C$B0c_POE=.#=zuYcchDc=(#=zOj0VlyKypTqDkT5xWZloVk7qovZYAlVrSw== #=zoWfxCrV07EJ4)
at #=zoZbZ78XusXM_nzRHEWgXZYTHHedWRiBO5OMTxn6RjiBhxC8LOw==.#=zsZXBrC4tFWq9()
at #=zoZbZ78XusXM_nzRHEWgXZYTHHedWRiBO5OMTxn6RjiBhxC8LOw==..ctor(String #=zhTjiml0=, #=zfAqJJT0= #=z18f6JNs=, #=zfAqJJT0= #=z8$PsPVc0jD4V, #=zXu7BGZHPDNo6EkVq_lJdCo3wTL0hR1HTHIkLvRkDPLUl5Liu9tavhZI= #=zaRram4rt_ceE, #=zAJI72aOK1q$q6K3CAz0khNXmbz4GUqJdh0pZ574= #=zyt0hNPM=, #=zxO40XsL557cnUOLZuO0QYsrw1$jEogyRtroXecQjEgG8RtTxnPTGgak= #=z_nZRQU9BiHeC23L8Dg==, Double #=z3Z6nl1M=, Double #=z9onouws=, #=ztYWYwnALNhuNwST_2qf1gcU5JhqGT8NnHTOOirpzjinNof8Zfg== #=zE8gff7M=, #=zyv7qNJaFLOxmHUayweA$tUL5$dL5b0nKdH2vW0LTFwlxBEh0Icw_bPU= #=z2I0hUWU=)
at #=zGuSEinf51GwaWgcLur1eBvogNseW53R1TDBzD1VUTV4dC$C$B0c_POE=.#=zVBRCJ4Pq$wtL(String #=zhTjiml0=, #=zfAqJJT0= #=z18f6JNs=, #=zfAqJJT0= #=z8$PsPVc0jD4V, #=zXu7BGZHPDNo6EkVq_lJdCo3wTL0hR1HTHIkLvRkDPLUl5Liu9tavhZI= #=zaRram4rt_ceE, #=zAJI72aOK1q$q6K3CAz0khNXmbz4GUqJdh0pZ574= #=zyt0hNPM=, #=zxO40XsL557cnUOLZuO0QYsrw1$jEogyRtroXecQjEgG8RtTxnPTGgak= #=z_nZRQU9BiHeC23L8Dg==, Double #=z3Z6nl1M=, Double #=z9onouws=, #=ztYWYwnALNhuNwST_2qf1gcU5JhqGT8NnHTOOirpzjinNof8Zfg== #=zE8gff7M=, #=zyv7qNJaFLOxmHUayweA$tUL5$dL5b0nKdH2vW0LTFwlxBEh0Icw_bPU= #=z2I0hUWU=)
at #=zXu7BGZHPDNo6EkVq_lJdCo3wTL0hR1HTHIkLvRkDPLUl5Liu9tavhZI=.#=zWetla8P8NV4v(Int32 #=zCP5d0H4=, Int32 #=zLsgGattZzP6t, Operator #=zbqeNkAI=, #=ztYWYwnALNhuNwST_2qf1gcU5JhqGT8NnHTOOirpzjinNof8Zfg== #=zE8gff7M=)
at #=zXu7BGZHPDNo6EkVq_lJdCo3wTL0hR1HTHIkLvRkDPLUl5Liu9tavhZI=.#=zL1psiRd2GpC5(#=zfAqJJT0= #=z18f6JNs=)
at #=zXu7BGZHPDNo6EkVq_lJdCo3wTL0hR1HTHIkLvRkDPLUl5Liu9tavhZI=.#=z0DnGUR0=(Int32 #=zCP5d0H4=, Operator #=zbqeNkAI=)
at #=zXu7BGZHPDNo6EkVq_lJdCo3wTL0hR1HTHIkLvRkDPLUl5Liu9tavhZI=.#=z8QtsWa0=()
at #=zNmZ11JZG2s4gwNGgO6ST5Cc6wzPZL$7XQLF32TMcIIa$$C41HPn2Sab4IcZ1.#=zwFPLRFD42p5v(BaseOperatorCollection #=zBqFbvow=, Resources #=zyt0hNPM=, Page #=zraOqAJ0=)
at #=zNmZ11JZG2s4gwNGgO6ST5Cc6wzPZL$7XQLF32TMcIIa$$C41HPn2Sab4IcZ1.#=zwFPLRFD42p5v(BaseOperatorCollection #=zBqFbvow=, Resources #=zyt0hNPM=)
at #=zNmZ11JZG2s4gwNGgO6ST5Cc6wzPZL$7XQLF32TMcIIa$$C41HPn2Sab4IcZ1.#=zwaAhBC0=()
at #=zNmZ11JZG2s4gwNGgO6ST5Cc6wzPZL$7XQLF32TMcIIa$$C41HPn2Sab4IcZ1..ctor(Page #=zraOqAJ0=, TextSearchOptions #=zr40I$rv9oWYL, Boolean #=zfY4x0s3Snhxe)
at #=zNmZ11JZG2s4gwNGgO6ST5Cc6wzPZL$7XQLF32TMcIIa$$C41HPn2Sab4IcZ1..ctor(Page #=zraOqAJ0=, TextSearchOptions #=zr40I$rv9oWYL)
at Aspose.Pdf.Text.TextFragmentAbsorber.Visit(Page page)
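To narrow down which pages are affected, a rough sketch along these lines may help. It is only an illustration: PageProbe and FindFailingPages are hypothetical names, not part of Aspose.PDF, and the per-page extraction is passed in as a delegate so the helper itself does not depend on the library. It catches ArgumentOutOfRangeException, which is the exception type String.Substring throws for an invalid index or length.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Aspose.PDF): runs a per-page action and
// records which page numbers fail with the Substring-style exception.
public static class PageProbe
{
    // extractPage is called once for each page number from 1 to pageCount.
    // Pages whose extraction throws ArgumentOutOfRangeException are returned.
    public static List<int> FindFailingPages(int pageCount, Action<int> extractPage)
    {
        var failing = new List<int>();
        for (int pageNumber = 1; pageNumber <= pageCount; pageNumber++)
        {
            try
            {
                extractPage(pageNumber);
            }
            catch (ArgumentOutOfRangeException)
            {
                // Remember the page and keep going so the remaining pages are still checked.
                failing.Add(pageNumber);
            }
        }
        return failing;
    }
}
```

With Aspose.PDF, the delegate would presumably be something like n => document.Pages[n].Accept(new TextFragmentAbsorber()), with document.Pages.Count as the page count (page indexes in Aspose.PDF start at 1). Other exception types are deliberately not caught here, so unrelated failures still surface.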
I would be happy to share an example PDF that has this issue, but it is confidential, so I don’t want to post it publicly on the forum.
Thanks,
Jonny