Extracted field values from a PDF XFA form are not in the context of the original form

brissonp · February 19, 2024, 1:47pm

Hi, we are using Aspose.Total for PDF in Java.

When we extract the raw text from an XFA PDF, the extracted text is not in the context of the form: the field values are all grouped together as a list in the middle of the extracted text. This is unusable for us, as we need the extracted text in the context of the form. Example:
Form:
Text at the beginning of the form
This is the main person, this is text in the form
Last Name: Smith
First Name: John
this is the text at the end of the form

The extracted text will be:
Text at the beginning of the form
This is the main person, this is text in the form
← The values are not in context anymore, gone from here
this is the text at the end of the form
Form123[0].page[0].person[0].lastname[0]=Smith
Form123[0].page[0].person[0].firstname[0]=John
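To make the output shape above concrete, here is a minimal sketch in plain Java that picks the grouped `path=value` lines out of the extracted text and maps each field path to its value. It does not call any Aspose API and does not put the values back in their original positions. The class name `XfaFieldDump`, the helper names, and the assumption that every field line ends its path with an index such as `[0]` before the `=` are illustrative only and not taken from the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Aspose: parses the grouped field lines
// that appear in the extracted text.
final class XfaFieldDump {

    // Collects every line shaped like "a[0].b[0].c[0]=value" into an
    // insertion-ordered map of field path -> value. Other lines are ignored.
    static Map<String, String> parseFieldLines(String extractedText) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : extractedText.split("\\R")) {
            int eq = line.indexOf('=');
            if (eq <= 0) {
                continue; // no '=' or empty key: ordinary form text
            }
            String path = line.substring(0, eq);
            // Assumption: a field path uses only word characters, dots and
            // brackets, and ends with a numeric index such as "[0]".
            if (path.matches("[\\w.\\[\\]]+\\[\\d+\\]")) {
                fields.put(path, line.substring(eq + 1));
            }
        }
        return fields;
    }

    // Returns the last path segment without its index, e.g. "lastname".
    static String leafName(String path) {
        String leaf = path.substring(path.lastIndexOf('.') + 1);
        int bracket = leaf.indexOf('[');
        return bracket >= 0 ? leaf.substring(0, bracket) : leaf;
    }

    public static void main(String[] args) {
        // Sample text modelled on the example in this post.
        String text = String.join("\n",
                "Text at the beginning of the form",
                "this is the text at the end of the form",
                "Form123[0].page[0].person[0].lastname[0]=Smith",
                "Form123[0].page[0].person[0].firstname[0]=John");
        parseFieldLines(text).forEach((path, value) ->
                System.out.println(leafName(path) + " = " + value));
    }
}
```

This only makes the grouped values addressable by field name. Matching them back to labels such as "Last Name:" would still need a mapping from field names to labels, which the extracted text does not provide.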

Thanks

asad.ali · February 19, 2024, 10:24pm

@brissonp

Would you please share your sample PDF document along with the sample code snippet? We will test the scenario in our environment and address it accordingly.