How to read the labels of the editable fields in a PDF?

nw.pdf (470.6 KB)
Hi, please notice the editable fields in the attached PDF. I need your advice on how to “extract” the text in the labels of each of the editable fields.

For example, the labels I am referring to are: Name, Policy Number, Street Address, City, State, SSN, Phone, Email, Joint Owner’s Name, etc.

  • Can you please guide me with a code excerpt in getting this data from the PDF?

Thanks, Sugath

nw.pdf (470.6 KB)
output.JPG (45.1 KB)

Hi, output.jpg contains the output of the below piece of code.
import com.aspose.pdf.Document;
import com.aspose.pdf.Field;

// Load the PDF and print the naming properties of every form field.
String pdfPath = "c:/tmp/nw.pdf";
Document pdfDocument = new Document(pdfPath);
for (Field fd : pdfDocument.getForm().getFields())
{
    System.out.println("Value - " + fd.getValue());
    System.out.println("Name - " + fd.getName());
    System.out.println("AlternateName - " + fd.getAlternateName());
    System.out.println("FullName - " + fd.getFullName());
    System.out.println("PartialName - " + fd.getPartialName());
    System.out.println("*************");
}

Please note that nw.pdf is also attached to this ticket. As you can see, several data fields were skipped by this code for some reason: Name, Phone, SSN, and Email.

Can you please check and let me know how to read these missing data fields as well?

@meZocliqllc

Please check the below console output that we got while testing the scenario with Aspose.PDF for Java 21.1:

Value - policy # 1234
Name - null
AlternateName - Contract Number
FullName - OWNER.CONTRACT_NUMBER
PartialName - CONTRACT_NUMBER
*************
Value - 5 Peregrine Dr
Name - null
AlternateName - Address
FullName - OWNER.ADDRESS1
PartialName - ADDRESS1
*************
Value - Somerset
Name - null
AlternateName - City
FullName - OWNER.CITY
PartialName - CITY
*************
Value - NJ
Name - null
AlternateName - State
FullName - OWNER.STATE
PartialName - STATE
*************
Value - 08873
Name - null
AlternateName - Zip
FullName - OWNER.ZIP
PartialName - ZIP
*************
Value - (908) 211-2121
Name - null
AlternateName - Phone
FullName - OWNER.PHONE
PartialName - PHONE
*************
Value - abc@yahoo.com
Name - null
AlternateName - Email
FullName - OWNER.EMAIL
PartialName - EMAIL
*************
Value - Owners Name
Name - null
AlternateName - Owner Full Name
FullName - OWNER.FULL_NAME
PartialName - FULL_NAME
*************
Value - 121345678
Name - null
AlternateName - Owner SSN
FullName - OWNER.SSN
PartialName - SSN
*************
Value - Joint Owners Name
Name - null
AlternateName - Full Name
FullName - INSURED.FULL_NAME
PartialName - FULL_NAME
*************
Value - null
Name - null
AlternateName - Policy Loan Options
FullName - POLICY_LOAN_OPTIONS
PartialName - POLICY_LOAN_OPTIONS
*************
Value - null
Name - null
AlternateName - Policy Loan Options
FullName - MAX_LOAN
PartialName - MAX_LOAN
*************
Value - null
Name - null
AlternateName - Policy Loan Options
FullName - INDEXED_UNIVERSAL_LIFE_POLICIES
PartialName - INDEXED_UNIVERSAL_LIFE_POLICIES
*************
Value - null
Name - null
AlternateName - Partial Surrender Options
FullName - PARTIAL_SURRENDER_OPTIONS
PartialName - PARTIAL_SURRENDER_OPTIONS
*************
Value - null
Name - null
AlternateName - Patial Withdrawals
FullName - LOAN
PartialName - LOAN
*************
Value - null
Name - null
AlternateName - Partial Surrender
FullName - PARTIAL_SURRENDER
PartialName - PARTIAL_SURRENDER
*************
Value - null
Name - null
AlternateName - Policy Loan Options
FullName - PARTIAL_SURRENDER2
PartialName - PARTIAL_SURRENDER2
*************
Value - null
Name - null
AlternateName - Dividend Withdrawal
FullName - DIVIDEND_WITHDRAWAL1
PartialName - DIVIDEND_WITHDRAWAL1
*************
Value - null
Name - null
AlternateName - Dividend Withdrawal
FullName - DIVIDEND_WITHDRAWAL
PartialName - DIVIDEND_WITHDRAWAL
*************
Value - null
Name - null
AlternateName - Dividend Withdrawal
FullName - PAIDUP_ADDITIONS
PartialName - PAIDUP_ADDITIONS
*************
Value - null
Name - null
AlternateName - Partial Surrender
FullName - PARTIAL_SURRENDER1
PartialName - PARTIAL_SURRENDER1
*************
Value - null
Name - null
AlternateName - Partial Surrender
FullName - PAIDUP_ADDITIONS_DOLAR
PartialName - PAIDUP_ADDITIONS_DOLAR
*************
Value - null
Name - null
AlternateName - Surrender of Accumulated Dividends
FullName - ACCUMULATED_DIVIDENDS
PartialName - ACCUMULATED_DIVIDENDS
*************
Value - null
Name - null
AlternateName - Partial Surrender
FullName - ACCUMULATED_DIVIDENDS_DOLAR
PartialName - ACCUMULATED_DIVIDENDS_DOLAR
*************
Value - null
Name - null
AlternateName - Owner Signature Date
FullName - OWNER_SIGNATURE.DATE
PartialName - DATE
*************
Value - null
Name - null
AlternateName - Joint Owner Full Name
FullName - JOINT.FULL_NAME
PartialName - FULL_NAME
*************
Value - null
Name - null
AlternateName - Joint Signature Date
FullName - JOINT_SIGNATURE.DATE
PartialName - DATE
*************
Value - null
Name - null
AlternateName - Other
FullName - SIGNATURE_OTHER
PartialName - SIGNATURE_OTHER
*************
Value - null
Name - null
AlternateName - Dollar
FullName - FEDERAL_TAXES_PERCENT
PartialName - FEDERAL_TAXES_PERCENT
*************
Value - null
Name - null
AlternateName - Check to Owner
FullName - CHECK1
PartialName - CHECK1
*************
Value - null
Name - null
AlternateName - Direct Deposit
FullName - PAYMENT_DD
PartialName - PAYMENT_DD
*************
Value - null
Name - null
AlternateName - Name on Account
FullName - DD_ACCOUNT_NAME
PartialName - DD_ACCOUNT_NAME
*************
Value - null
Name - null
AlternateName - Financial Institution
FullName - DD_FINANCIAL_INSTITUTION
PartialName - DD_FINANCIAL_INSTITUTION
*************
Value - null
Name - null
AlternateName - Account Type
FullName - DD_ACCOUNT_TYPE
PartialName - DD_ACCOUNT_TYPE
*************
Value - null
Name - null
AlternateName - Transit/ABA routing Number
FullName - DD_ABA_NUMBER
PartialName - DD_ABA_NUMBER
*************
Value - null
Name - null
AlternateName - Account Number
FullName - DD_ACCOUNT_NUMBER
PartialName - DD_ACCOUNT_NUMBER
*************
Value - null
Name - null
AlternateName - Check to Owner
FullName - CHECK_OWNER
PartialName - CHECK_OWNER
*************
Value - null
Name - null
AlternateName - Check to Alternate Payee
FullName - CHECK_ALTERNATE_PAYEE
PartialName - CHECK_ALTERNATE_PAYEE
*************
Value - null
Name - null
AlternateName - Address
FullName - PAYABLE_ADDRESS
PartialName - PAYABLE_ADDRESS
*************
Value - null
Name - null
AlternateName - City
FullName - PAYABLE_CITY
PartialName - PAYABLE_CITY
*************
Value - null
Name - null
AlternateName - State
FullName - PAYABLE_STATE
PartialName - PAYABLE_STATE
*************
Value - null
Name - null
AlternateName - Zip
FullName - PAYABLE_ZIP
PartialName - PAYABLE_ZIP
*************
Value - null
Name - null
AlternateName - Other Title
FullName - SIG_OTHER.TITLE
PartialName - TITLE
*************
Value - null
Name - null
AlternateName - Other SSN
FullName - SIG_OTHER.SSN
PartialName - SSN
*************
Value - null
Name - null
AlternateName - Other Full Name
FullName - SIG_OTHER.FULL_NAME
PartialName - FULL_NAME
*************
Value - null
Name - null
AlternateName - Other Signature Date
FullName - SIG_OTHER_SIGNATURE.DATE
PartialName - DATE
*************

Also, would you please make sure that you are using a valid license before extracting the information from the PDF? Please let us know if you notice any missing field information in the shared output.

Can you please give me a code sample to read the labels as I mentioned in my ticket?

I was using the evaluation license while testing the above code excerpt. I will try with the actual license and check if it solves the issue. Thank you.

Also, the ticket "How to read the labels of the editable fields in a PDF?" is different from the other ticket ("Unable to read data in some fields in the pdf"), and I would still like an answer to it. Please give me a code excerpt that reads the labels as I explained in the ticket.

@meZocliqllc

It is nice to hear that your issue has been resolved.

By labels, do you mean the text which is present before the text boxes? For example:

First Name: <<Text Box Here>>

Do you need to extract “First Name:”?

Exactly. That is what I am looking for. I would like to read that label and also, if possible, associate it with the text box. Can you let me know how to do this for the given PDF?

@meZocliqllc

In the PDF format, such plain text is not associated with the form fields. In other words, the API cannot perform a custom search to extract the label (which is only plain text) that corresponds to a form field. We are afraid that this requirement may not be possible to achieve.

Do you want to associate that label with the text box in your code only (just for the purpose of further processing)? Or do you want to save that association within the PDF document structure as well?

I want to associate the label with the textbox in the code only for further processing.

@meZocliqllc

The only possible way would be to search the PDF document for text that matches the Alternate Name of each field. Once a result is returned, you can map the obtained text fragment to the respective form field in your own code for further processing. Please let us know in case you need further assistance.
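
The mapping step described above could look roughly like the following. This is only a minimal sketch in plain Java, not code from Aspose: `LabelMatcher` and `matchLabels` are hypothetical names. It assumes you have already collected each field's full name and alternate name (for example via `fd.getFullName()` and `fd.getAlternateName()` as in the loop earlier in this thread) and have extracted the page text as a list of lines by some separate text-extraction step that is not shown in this thread. It uses a case-insensitive "contains" match and keeps the first hit per field.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; it is not part of Aspose.PDF.
class LabelMatcher {

    /**
     * Maps each form field to the first line of page text that contains
     * the field's alternate name (case-insensitive).
     *
     * @param alternateNamesByField field full name -> alternate name (may be null)
     * @param pageTextLines         plain-text lines extracted from the PDF page
     * @return field full name -> matching text line; unmatched fields are omitted
     */
    static Map<String, String> matchLabels(Map<String, String> alternateNamesByField,
                                           List<String> pageTextLines) {
        Map<String, String> labelsByField = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : alternateNamesByField.entrySet()) {
            String alternateName = entry.getValue();
            // Fields without an alternate name cannot be searched for.
            if (alternateName == null || alternateName.trim().isEmpty()) {
                continue;
            }
            String needle = alternateName.trim().toLowerCase(Locale.ROOT);
            for (String line : pageTextLines) {
                if (line != null && line.toLowerCase(Locale.ROOT).contains(needle)) {
                    labelsByField.put(entry.getKey(), line.trim());
                    break; // keep only the first matching line per field
                }
            }
        }
        return labelsByField;
    }
}
```

Note that this only finds a label when the visible text actually contains the alternate name. In the console output above, for instance, the field OWNER.CONTRACT_NUMBER has the alternate name "Contract Number", while the visible label was described as "Policy Number", so such a field would stay unmatched and need separate handling. Several fields also share one alternate name (for example "Partial Surrender"), so they would all map to the same first matching line.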