Replace text in PDF using Python

harshitha112000 · November 22, 2023, 2:44pm

Hi, I want to replace text in a PDF with empty space. My code runs, but the output PDF is corrupted. Can anyone please provide working code to replace text in a PDF using Python?

asad.ali · November 22, 2023, 8:04pm

@harshitha112000

Can you please share the sample code snippet and the sample PDF for our reference? We will test the scenario in our environment and address it accordingly.

harshitha112000 · November 23, 2023, 10:03am

import aspose.pdf as ap

# Load the PDF document
document = ap.Document("out1.pdf")
# Instantiate a TextFragmentAbsorber object
txtAbsorber = ap.text.TextFragmentAbsorber("Name of the medicinal product")

# Search text
document.pages.accept(txtAbsorber)

# Get reference to the found text fragments
textFragmentCollection = txtAbsorber.text_fragments

# Parse all the searched text fragments and replace text
for txtFragment in textFragmentCollection:
    txtFragment.text = ""

# Save the updated PDF
document.save("output.pdf")

out1.pdf (117.9 KB)

I want to replace everything before the second occurrence of “Name of the medicinal product” with empty space.
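To make the requirement concrete, here is a minimal sketch of the "everything before the second occurrence" logic on plain text. It assumes the page text has already been extracted into an ordinary Python string. `find_nth` and `blank_before_nth` are hypothetical helper names for illustration, not part of aspose.pdf, and writing the result back into the PDF is not covered here.

```python
def find_nth(text: str, phrase: str, n: int) -> int:
    """Return the start index of the n-th (1-based) occurrence of phrase, or -1."""
    if n < 1 or not phrase:
        return -1
    index = -1
    for _ in range(n):
        # Continue searching just past the start of the previous match.
        index = text.find(phrase, index + 1)
        if index == -1:
            return -1
    return index


def blank_before_nth(text: str, phrase: str, n: int = 2) -> str:
    """Replace every character before the n-th occurrence of phrase with a space.

    Newlines are kept so the line layout of the text is preserved.
    If phrase occurs fewer than n times, the text is returned unchanged.
    """
    cut = find_nth(text, phrase, n)
    if cut == -1:
        return text
    blanked = "".join(ch if ch == "\n" else " " for ch in text[:cut])
    return blanked + text[cut:]
```

Calling `blank_before_nth(page_text, "Name of the medicinal product")` on a hypothetical `page_text` string would keep the text from the second occurrence onward and turn everything before it into spaces. The same index could also be used to decide which found fragments come before the cut-off point.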

asad.ali · November 23, 2023, 7:11pm

@harshitha112000

We need to investigate this case in detail, so we have logged a ticket, PDFPYTHON-163, in our issue tracking system. We will look into it further and keep you posted on the status of its resolution. Please be patient and give us some time.

We are sorry for the inconvenience.