Replacing text in PDF

Hi,
I am using the code below to replace text in a PDF using Aspose.PDF.
But it iterates through only the first four pages of the PDF.
Code:
import aspose.pdf as pdf  # assumed import; the original post did not show it

inputPDFFile = pdf.Document(inputpath)
new_file = inputpath.split('.pdf')
txtAbsorber = pdf.text.TextFragmentAbsorber(oldtext)
pgs = inputPDFFile.pages
# Search every page for the old text
for page in pgs:
    page.accept(txtAbsorber)
textFragmentCollection = txtAbsorber.text_fragments
# Replace each fragment that was found
for tex in textFragmentCollection:
    tex.text = str(newtext)
inputPDFFile.save(new_file[0] + "updated.pdf")

Can anyone provide input on this?

@Rakesh99
Thank you for contacting support.
Can you attach your file so that we can test the problem?

@sergey.mikhaylov Do you mean a sample PDF file?

For every file, it iterates through only the first four pages.

@Rakesh99
What type of license do you use when working with Aspose.Pdf?

I don’t have a license.
Can’t we use it without a license?

@Rakesh99
Then most likely this is the problem.
Without a license, Aspose.Pdf works in demo mode.
The number of pages and elements processed is limited.


Hi @sergey.mikhaylov,
I’m using the demo mode to test whether the library meets one of my needs. After replacing the text in the file body, the text below is added to the file. Please let me know whether this is a demo mode restriction and whether there are any others.

Evaluation Only. Created with Aspose.PDF. Copyright 2002-2023 Aspose Pty Ltd.

MicrosoftTeams-image.png (5.5 KB)

@Rakesh99
Yes, that evaluation notice appears in the demo version.
There is also a limit (3-4 items) on most processed elements: pages, images, fields, etc.

@sergey.mikhaylov
Got it, thanks for your reply.

@Rakesh99
If you have any more questions, please contact us.

Hi @sergey.mikhaylov,
I am trying to replace multiple strings in a PDF document. Is there a way to do it without saving the output after each replacement?

@Rakesh99

Before saving the output file, you can re-initialize the TextFragmentAbsorber with the next text to be replaced and perform further operations.
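
A minimal sketch of that approach, assuming the same `ap.text.TextFragmentAbsorber`, `document.pages.accept`, and `text_fragments` calls that appear elsewhere in this thread. The helper name `replace_many` and its `make_absorber` parameter are hypothetical, added only so the loop can be tried without the library installed. All replacements happen in memory and the caller saves once at the end.

```python
def replace_many(document, replacements, make_absorber=None):
    """Apply every old -> new pair to `document` in memory.

    `replace_many` and `make_absorber` are illustrative names, not part of
    Aspose.PDF. Returns a dict mapping each old string to its hit count.
    """
    if make_absorber is None:
        # Imported lazily so the helper can be exercised without the library.
        import aspose.pdf as ap
        make_absorber = ap.text.TextFragmentAbsorber

    counts = {}
    for old_text, new_text in replacements.items():
        absorber = make_absorber(old_text)   # fresh absorber per search string
        document.pages.accept(absorber)      # search all pages of the document
        hits = 0
        for fragment in absorber.text_fragments:
            fragment.text = new_text         # replace in place, nothing is saved yet
            hits += 1
        counts[old_text] = hits
    return counts


# Intended use with the real library (not executed here):
#   document = ap.Document("input.pdf")
#   replace_many(document, {"old one": "new one", "old two": "new two"})
#   document.save("output.pdf")
```

Pairs are applied in dictionary order, so a later pair can match text produced by an earlier one. The returned counts make it easy to spot a search string that matched nothing.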

Hi @asad.ali,
But it is not keeping the previously replaced text.
It saves only the last iterated string.
I am using the following code:

def replaceText(inputPath, OutputPath, replacedic):
    document = ap.Document(inputPath)
    for strings in replacedic:
        Oldtext = strings
        NewText = replacedic[strings]
        txtAbsorber = ap.text.TextFragmentAbsorber(Oldtext)
        # Search text
        document.pages.accept(txtAbsorber)
        # Get reference to the found text fragments
        textFragmentCollection = txtAbsorber.text_fragments
        # Parse all the searched text fragments and replace text
        for txtFragment in textFragmentCollection:
            try:
                txtFragment.text = NewText
            except:
                print("")
    # Save the updated PDF
    document.save(OutputPath)

@Rakesh99

Can you please try calling document.save() without any argument after the first replacement and see if it preserves the text replacement? If that does not resolve the issue, please share your sample file with us and we will assist you further.

@asad.ali
The following code is working fine.


def replaceText(inputPath, OutputPath, replacedic):  # replace each key of replacedic with its value
    length = len(replacedic)  # number of strings to replace
    document = ap.Document(inputPath)  # read the input file
    for strings in range(length):  # loop over the strings to be replaced
        # Instantiate a TextFragmentAbsorber object
        txtAbsorber = ap.text.TextFragmentAbsorber(str(list(replacedic.keys())[strings]))
        # Search text
        document.pages.accept(txtAbsorber)
        # Get reference to the found text fragments
        textFragmentCollection = txtAbsorber.text_fragments
        # Parse all the searched text fragments and replace text
        for txtFragment in textFragmentCollection:
            try:
                txtFragment.text = replacedic[txtFragment.text]  # replace old text
            except:
                print("")
    # Save the updated PDF
    document.save(OutputPath)

@Rakesh99

It is nice to hear that things started working at your end.

@asad.ali, we have purchased a license for Aspose.PDF. When we load the license with the code below:

import aspose

lic = aspose.License()
lic.set_license("Aspose.Words.Python.lic")

it throws the error: AttributeError: module 'aspose' has no attribute 'License'.

Kindly suggest how to load the license.

@venuc

Can you please make sure that you are using the correct module? It looks like you are trying to set a license for the Aspose.Words module. Please import the correct module, Aspose.PDF or Aspose.Words, and let us know if the issue still persists.
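
As a rough illustration of that advice, the sketch below creates the license object from the `aspose.pdf` package instead of the top-level `aspose` namespace, which is what raised the AttributeError. It assumes `aspose.pdf` exposes `License()` with a `set_license(path)` method, mirroring the call in the question. The function name `apply_pdf_license`, the `pdf_module` parameter, and the file name are hypothetical placeholders.

```python
def apply_pdf_license(license_path, pdf_module=None):
    """Load a license file through the PDF package.

    `apply_pdf_license` and `pdf_module` are illustrative names; the
    `License().set_license(path)` call is assumed to match Aspose.PDF.
    """
    if pdf_module is None:
        # Import the product package, not the bare `aspose` namespace.
        import aspose.pdf as pdf_module
    lic = pdf_module.License()
    lic.set_license(license_path)
    return lic


# Intended use (not executed here); the path is a placeholder:
#   apply_pdf_license("Aspose.PDF.Python.lic")
```

The license file must be the one issued for the product being imported. A file issued for Aspose.Words would not be expected to unlock Aspose.PDF.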