Replacing text in PDF

Rakesh99 · June 22, 2023, 6:06am

Hi,
I am using the code below to replace text in a PDF with aspose.pdf,
but it iterates through only the first four pages of the PDF.
Code:
import aspose.pdf as pdf

inputPDFFile = pdf.Document(inputpath)
new_file = inputpath.split('.pdf')
txtAbsorber = pdf.text.TextFragmentAbsorber(oldtext)
pgs = inputPDFFile.pages
for page in pgs:
    # Search the current page, then replace every match found on it
    page.accept(txtAbsorber)
    textFragmentCollection = txtAbsorber.text_fragments
    for tex in textFragmentCollection:
        tex.text = str(newtext)
inputPDFFile.save(new_file[0] + "updated.pdf")

Can anyone provide input on this?

sergey.mikhaylov · June 22, 2023, 7:10am

@Rakesh99
Thank you for contacting support.
Can you attach your file so that we can test the problem?

Rakesh99 · June 22, 2023, 7:11am

@sergey.mikhaylov Do you mean a sample PDF file?

Rakesh99 · June 22, 2023, 7:12am

For every file, it iterates through only the first four pages.

sergey.mikhaylov · June 22, 2023, 8:20am

@Rakesh99
What type of license do you use when working with Aspose.Pdf?

Rakesh99 · June 22, 2023, 10:50am

I don't have a license.
Can't we use it without a license?

sergey.mikhaylov · June 23, 2023, 8:08am

@Rakesh99
Then most likely this is the problem.
Without a license, Aspose.Pdf works in demo mode.
The number of pages and elements processed is limited.

Rakesh99 · July 19, 2023, 7:31am

Hi @sergey.mikhaylov,
I'm using the demo mode to test whether the library fits one of my needs. After replacing text in the file body, the new text shown below is added to the file. If the demo mode has any other restrictions like this, please let me know.

MicrosoftTeams-image.png (5.5 KB)

sergey.mikhaylov · July 20, 2023, 6:41am

@Rakesh99
Yes, the Aspose branding appears in the demo version.
There is also a limit (3-4 items) on most processed elements: pages, images, fields, etc.

Rakesh99 · July 21, 2023, 9:10am

@sergey.mikhaylov
Got it, thanks for your reply.

sergey.mikhaylov · July 21, 2023, 9:46pm

@Rakesh99
If you have any more questions, please contact us.

Rakesh99 · August 21, 2023, 6:22am

Hi @sergey.mikhaylov,
I am trying to replace multiple strings in a PDF document. Is there a way to do it without saving the output after each replacement?

asad.ali · August 21, 2023, 3:54pm

@Rakesh99

Before saving the output file, you can re-initialize the TextFragmentAbsorber with the next text to be replaced and perform further replacements.

Rakesh99 · August 21, 2023, 4:19pm

Hi @asad.ali,
But that does not keep the previously replaced text;
only the last string in the loop is saved.
I am using the following code:

import aspose.pdf as ap

def replaceText(inputPath, OutputPath, replacedic):
    document = ap.Document(inputPath)
    for strings in replacedic:
        Oldtext = strings
        NewText = replacedic[strings]
        txtAbsorber = ap.text.TextFragmentAbsorber(Oldtext)
        # Search text
        document.pages.accept(txtAbsorber)
        # Get reference to the found text fragments
        textFragmentCollection = txtAbsorber.text_fragments
        # Parse all the searched text fragments and replace text
        for txtFragment in textFragmentCollection:
            try:
                txtFragment.text = NewText
            except:
                print("")
    # Save the updated PDF
    document.save(OutputPath)

asad.ali · August 21, 2023, 8:38pm

@Rakesh99

Could you please try calling document.save() without any arguments after the first replacement and check whether it preserves the text replacement? If that does not resolve the issue, please share your sample file with us so that we can assist you further.

Rakesh99 · August 22, 2023, 7:14am

@asad.ali
The following code is working fine.

import aspose.pdf as ap

def replaceText(inputPath, OutputPath, replacedic):  # replacement function
    length = len(replacedic)  # number of strings to replace
    document = ap.Document(inputPath)  # read the input file
    for strings in range(length):  # loop over the strings to be replaced
        # Instantiate a TextFragmentAbsorber object for the current search string
        txtAbsorber = ap.text.TextFragmentAbsorber(str(list(replacedic.keys())[strings]))
        # Search text
        document.pages.accept(txtAbsorber)
        # Get reference to the found text fragments
        textFragmentCollection = txtAbsorber.text_fragments
        # Parse all the searched text fragments and replace text
        for txtFragment in textFragmentCollection:
            try:
                txtFragment.text = replacedic[txtFragment.text]  # replace old text
            except:
                print("")
    # Save the updated PDF
    document.save(OutputPath)

asad.ali · August 22, 2023, 12:04pm

@Rakesh99

It is nice to hear that things started working at your end.
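
For reference, the loop that ended up working can be written more directly by iterating over the dictionary items and saving once at the end. This is only a minimal sketch, not an official Aspose sample: `replace_many` and its `make_absorber` parameter are hypothetical names introduced here, and the only Aspose calls assumed are the ones already used in this thread (`ap.Document`, `ap.text.TextFragmentAbsorber`, `document.pages.accept`, `text_fragments`, `document.save`).

```python
def replace_many(document, replacements, make_absorber):
    """Replace every old -> new pair in an already opened document.

    `make_absorber` is any callable that builds a search object from a phrase;
    with Aspose.PDF it is assumed to be ap.text.TextFragmentAbsorber.
    Returns a dict mapping each search string to the number of fragments changed.
    """
    counts = {}
    for old_text, new_text in replacements.items():
        # A fresh absorber per search string, as suggested earlier in the thread
        absorber = make_absorber(old_text)
        document.pages.accept(absorber)
        changed = 0
        for fragment in absorber.text_fragments:
            fragment.text = str(new_text)
            changed += 1
        counts[old_text] = changed
    return counts


# Usage with Aspose.PDF (assumed, file names are placeholders):
#   import aspose.pdf as ap
#   document = ap.Document("input.pdf")
#   replace_many(document, {"old": "new"}, ap.text.TextFragmentAbsorber)
#   document.save("output.pdf")  # save once, after all replacements
```

Returning the per-string counts, rather than swallowing exceptions with an empty `print`, makes it easier to see which search strings were not found, for example because of the demo-mode limits mentioned above.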

venuc · August 24, 2023, 9:42am

@asad.ali, we have purchased a license for Aspose.PDF. We are trying to load the license with the code below:

import aspose

lic = aspose.License()
lic.set_license("Aspose.Words.Python.lic")

It throws the error: AttributeError: module 'aspose' has no attribute 'License'.

Please advise how to load the license.

asad.ali · August 24, 2023, 5:33pm

@venuc

Could you please make sure that you are using the correct module? It looks like you are trying to set a license for the Aspose.Words module. Please import the module that matches your license, Aspose.PDF or Aspose.Words, and let us know if the issue still persists.
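
As an illustration of that advice, here is a hedged sketch of loading the license through the product module rather than the bare `aspose` package. It assumes that `License` is exposed by the product module (for example `aspose.pdf`) and has a `set_license(path)` method, as in the snippet above; `apply_license` is a hypothetical helper name and the license file name is a placeholder.

```python
def apply_license(product_module, license_path):
    """Apply a license file using the License class of the given product module.

    Pass the product module itself (e.g. aspose.pdf), not the top-level
    `aspose` package, which is assumed not to expose License.
    """
    if not hasattr(product_module, "License"):
        raise AttributeError(
            f"{product_module.__name__} has no License class; "
            "import the product module, e.g. `import aspose.pdf as ap`"
        )
    lic = product_module.License()
    lic.set_license(license_path)
    return lic


# Usage with Aspose.PDF (assumed; "Aspose.PDF.lic" is a placeholder file name):
#   import aspose.pdf as ap
#   apply_license(ap, "Aspose.PDF.lic")
```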