Hi,
I wanted to update the “Loan amount” value in a scanned copy of a PDF file. The code could find the text, but the change did not show up in the saved file.
// Create a TextFragmentAbsorber to find all instances of the search phrase
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("3.875");
// Accept the absorber for all pages of the document
pdfDocument.getPages().accept(textFragmentAbsorber);
// Get the extracted text fragments into a collection
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();
// Loop through the fragments
for (TextFragment textFragment : (Iterable<TextFragment>) textFragmentCollection) {
    // Log the match, then update its text
    System.out.println("found: =" + textFragment.getText());
    textFragment.setText("4.875");
}
// Save the updated PDF file
try {
    Thread.sleep(1000);
} catch (InterruptedException e) {
    e.printStackTrace();
}
Assistant: connected
Srimanta: hi
Assistant: Hello, thank you for reaching out to us. May I know your name please?
Srimanta: Srimanta
Assistant: Please provide your email address, this will help our team to reach back.
Srimanta: bidcrm20@gmail.com
Assistant: Thank you for providing your details. You can always type “forget me” to remove these details.
Assistant: Welcome to Aspose, Srimanta!
Assistant: How may I help you today?
Srimanta: I am not able to update text for scanned pdf file
Assistant: Thanks for asking question. I’ll be forwarding your query to our technical support department.
Assistant: Anything else you want to add to your query before I forward it to our support department.
Srimanta: no
Assistant: Srimanta, one last thing before I move forward. Please provide an appropriate subject line for your support ticket.
Srimanta: TextFragment.setText("12345") not updating text in scanned pdf file
Assistant: Almost done, generating support ticket for you.
Please try to update the loan amount or interest rate. It is not able to update them.
@srimanta
We have reproduced the issue on our side and logged it in our issue tracking system as PDFJAVA-41063. You will be notified via this forum thread once it is resolved.
We apologize for the inconvenience.
Please let me know whether this issue has been resolved. If so, what is the resolution?
@srimanta
The issue you are facing is not a bug in Aspose.PDF, so we have closed PDFJAVA-41063 as ‘Not a Bug’.
Your PDF document is searchable: it contains the scans as images, covered with invisible text that uses a special OCR hidden font. Please check SpecialInvisibleFont.png (134.0 KB) and DocumentStructure.png (115.2 KB).
The text is replaced in the PDF and can be found, but visually you still see the old text because it is part of the image. We recommend using RedactionAnnotation to clear the old text from the image and then adding new text with the required values, as in the following code example.
// All types used below come from the com.aspose.pdf package
Document pdfDocument = new Document(MyDir + "Loan Application - Signed.pdf");
// Create a TextFragmentAbsorber to find all instances of the search phrase
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("3.875");
// Accept the absorber for all pages of the document
pdfDocument.getPages().accept(textFragmentAbsorber);
// Get the extracted text fragments into a collection
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();
// Loop through the fragments
for (TextFragment textFragment : (Iterable<TextFragment>) textFragmentCollection) {
    // Log the fragment's text, font, and location
    System.out.println("found: =" + textFragment.getText());
    Page page = textFragment.getPage();
    float fontSize = textFragment.getTextState().getFontSize();
    System.out.println("font name: =" + textFragment.getTextState().getFont().getFontName());
    Rectangle rectangle = textFragment.getRectangle();
    System.out.println(rectangle);
    System.out.println(textFragment.getPosition());
    // Redact an area 1 unit larger than the fragment on every side
    Rectangle redactRectangle = new Rectangle(
            rectangle.getLLX() - 1, rectangle.getLLY() - 1,
            rectangle.getURX() + 1, rectangle.getURY() + 1);
    RedactionAnnotation annotation = new RedactionAnnotation(page, redactRectangle);
    annotation.setFillColor(Color.getWhite());
    annotation.setColor(Color.getWhite());
    annotation.redactExact();
    page.getAnnotations().add(annotation);
    page.flatten(); // this deletes the text and part of the image
    // Add a new TextFragment that uses a visible font
    TextFragment tf = new TextFragment("4.875");
    tf.setRectangle(rectangle);
    // The offset of 2 reduces the baseline difference between the fonts
    tf.setPosition(new Position(rectangle.getLLX(), rectangle.getLLY() - 2));
    tf.getTextState().setFontSize(fontSize);
    tf.getTextState().setFontStyle(FontStyles.Bold);
    page.getParagraphs().add(tf);
}
// Save the updated PDF file
pdfDocument.save(MyDir + "21.11.pdf");
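The rectangle arithmetic in the sample above can be checked on its own, without Aspose.PDF. The following is a minimal sketch in plain Java that uses the coordinates logged further down in this thread for the "3.875" fragment. The `Box` class and the `pad` and `replacementY` helpers are hypothetical names introduced here for illustration; they are not part of the Aspose API and only mirror what the sample does with `Rectangle` and `Position`.

```java
public class RedactionBoxSketch {

    /** Hypothetical stand-in for com.aspose.pdf.Rectangle; not part of the Aspose API. */
    public static final class Box {
        public final double llx, lly, urx, ury;

        public Box(double llx, double lly, double urx, double ury) {
            this.llx = llx;
            this.lly = lly;
            this.urx = urx;
            this.ury = ury;
        }

        /** Returns a copy grown by `margin` units on every side. */
        public Box pad(double margin) {
            return new Box(llx - margin, lly - margin, urx + margin, ury + margin);
        }

        @Override
        public String toString() {
            return llx + "," + lly + "," + urx + "," + ury;
        }
    }

    /** Y coordinate for the replacement text: the fragment's lower edge minus a correction. */
    public static double replacementY(Box fragment, double baselineCorrection) {
        return fragment.lly - baselineCorrection;
    }

    public static void main(String[] args) {
        // Coordinates logged in this thread for the "3.875" fragment
        Box fragment = new Box(141.65012469314, 596.4976977065,
                               163.312743132051, 604.817876733824);
        // Same margin (1) and baseline correction (2) as the sample
        Box redactArea = fragment.pad(1);
        System.out.println("fragment:    " + fragment);
        System.out.println("redact area: " + redactArea);
        System.out.println("new text at: (" + fragment.llx + ", "
                + replacementY(fragment, 2) + ")");
    }
}
```

The margin of 1 and the correction of 2 are taken from the sample; other documents or fonts may need different values.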
@tahir.manzoor
Thank you for the solution, but it does not work for me.
When I run your code, I get the exception below. Can I get a temporary license to test whether this works for me?
Exception in thread "main" class com.aspose.pdf.exceptions.IndexOutOfRangeException: At most 4 elements (for any collection) can be viewed in evaluation mode.
com.aspose.pdf.ADocument.lf(Unknown Source)
com.aspose.pdf.ADocument.lI(Unknown Source)
com.aspose.pdf.XImageCollection.lI(Unknown Source)
com.aspose.pdf.XImageCollection.get_Item(Unknown Source)
com.aspose.pdf.ImagePlacementAbsorber.lf(Unknown Source)
com.aspose.pdf.ImagePlacementAbsorber.lI(Unknown Source)
com.aspose.pdf.ImagePlacementAbsorber.visit(Unknown Source)
com.aspose.pdf.Redaction.lI(Unknown Source)
com.aspose.pdf.RedactionAnnotation.redactExact(Unknown Source)
pdf.pdfdata.LoanApplication.updateLoanApplication(LoanApplication.java:57)
pdf.pdfdata.LoanApplication.main(LoanApplication.java:22)
@srimanta
Please use the latest version, Aspose.PDF for Java 21.11, to avoid this issue.
Tahir, I used the version below but still get the same error when I execute the code you gave.
The error comes from the line below:
annotation.redactExact();
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-pdf</artifactId>
    <version>21.11</version>
</dependency>
found: =3.875
font name: =OCR_Hidden
141.65012469314,596.4976977065,163.312743132051,604.817876733824
( 141.65012469314, 596.4976977065 )
Exception in thread "main" class com.aspose.pdf.exceptions.IndexOutOfRangeException: At most 4 elements (for any collection) can be viewed in evaluation mode.
com.aspose.pdf.ADocument.lf(Unknown Source)
com.aspose.pdf.ADocument.lI(Unknown Source)
com.aspose.pdf.XImageCollection.lI(Unknown Source)
com.aspose.pdf.XImageCollection.get_Item(Unknown Source)
com.aspose.pdf.ImagePlacementAbsorber.lf(Unknown Source)
com.aspose.pdf.ImagePlacementAbsorber.lI(Unknown Source)
com.aspose.pdf.ImagePlacementAbsorber.visit(Unknown Source)
com.aspose.pdf.Redaction.lI(Unknown Source)
com.aspose.pdf.RedactionAnnotation.redactExact(Unknown Source)
@srimanta
You are using Aspose.PDF in evaluation mode. Please get a 30-day temporary license and apply it before loading the PDF into the Aspose.PDF DOM. You can get the temporary license here:
Get a Temporary License
Hi,
I need a temporary license to evaluate Aspose.PDF. I created an order, ID #211129155940. Please send me the license.
Thanks
Srimanta
@srimanta
The license is emailed to you after you create an order for a temporary license. Please check your email. Messages sometimes land in the Junk/Spam folder, so please check those folders as well.