Kofax Power PDF lets users add text to PDF pages with its Typewriter tool. My task is to remove every piece of text created by the
Typewriter tool from a PDF while leaving all other text untouched.
What I figured out is that the editor adds this text inside a form XObject (/Subtype/Form); this form can be accessed via pdf.Pages[N].Resources.Forms[N].
So, in theory, I need a way to identify the forms that were created by the
Typewriter tool and remove them. I looked at the PDF file structure and found that such a form is represented by the following object:
<< /Subtype/Form /BBox[ 242.2637 546.4844 612.0137 558.7783] /Matrix[ 1 0 0 1 0 0] /Length 136 /IT/Typewriter /Rotate 0 /Contents(ccccccccccccccc\r) /RC(<?xml version="1.0" ?> <body xmlns="http://www.w3.org/1999/xhtml" xmlns:xfa="http://www.xfa.org/schema/xfa-data/1.0/" xfa:APIVersion="Acrobat:7.0.0" xfa:spec="2.0.2" style = "font-size:12.00pt;font-family:'Times New Roman'"><p><span style="text-decoration:;font-size:9.00pt;font-family:'Avenir Black'">ccccccccccccccc </span></p></body>) /DS(text-decoration:;font-size:12.00pt;font-family:'Times New Roman') /GUID(e0b013a0-4a19-40cf-af-7f16701f2ebb60) /Rect[ 242.2637 546.4844 612.0137 558.7783] /Resources << /ProcSet[/PDF/Text/ImageB/ImageC/ImageI] /Font<</F0 471 0 R>> >> >>stream.....
It seems that I can use the
/IT/Typewriter key-value pair to identify such forms, but I do not know how to access these properties using Aspose.Pdf.
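To make the identification rule concrete, here is a minimal sketch of the check I have in mind, applied to the raw dictionary text of an object like the one above. It does not use Aspose.Pdf at all: the class name TypewriterFormDetector and the method IsTypewriterForm are hypothetical names of my own, and the sketch assumes the dictionary text has already been extracted from the file by some other means. It is only a heuristic, because the same characters could also appear inside a string value such as /Contents.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Pdf: a text-level heuristic only.
public static class TypewriterFormDetector
{
    // "/IT" followed by the name "/Typewriter", with optional whitespace between
    // key and value. The lookahead rejects longer names such as /TypewriterX.
    private static readonly Regex IntentPattern =
        new Regex(@"/IT\s*/Typewriter(?![A-Za-z0-9])", RegexOptions.Compiled);

    // "/Subtype /Form", so that other object types carrying /IT are ignored.
    private static readonly Regex FormSubtypePattern =
        new Regex(@"/Subtype\s*/Form(?![A-Za-z0-9])", RegexOptions.Compiled);

    // Returns true when the dictionary text describes a form XObject
    // that is marked with the Typewriter intent.
    public static bool IsTypewriterForm(string dictionaryText)
    {
        if (string.IsNullOrEmpty(dictionaryText))
        {
            return false;
        }
        return FormSubtypePattern.IsMatch(dictionaryText)
            && IntentPattern.IsMatch(dictionaryText);
    }

    public static void Main()
    {
        // Shortened copy of the object shown above.
        string typewriter = "<< /Subtype/Form /BBox[ 242.2637 546.4844 612.0137 558.7783] " +
                            "/Length 136 /IT/Typewriter /Rotate 0 >>";
        // A made-up form without the /IT entry, for comparison.
        string ordinary = "<< /Subtype/Form /BBox[ 0 0 100 100] /Length 42 >>";

        Console.WriteLine(IsTypewriterForm(typewriter)); // True
        Console.WriteLine(IsTypewriterForm(ordinary));   // False
    }
}
```

What I am missing is the equivalent of this check on the form objects that Aspose.Pdf exposes, so that the matching forms can be removed through the library instead of by editing the file by hand.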
- Is it possible to access the low-level object properties mentioned above in order to identify the forms to remove?
- Is there an alternative way to identify and remove text inserted by the Typewriter tool?
sample for aspose.pdf (6.2 KB)
I tried to remove the forms from the page using the latest Aspose.PDF 20.6:
var pdf = new Document(@"..."); pdf.Pages[1].Resources.Forms.Clear();
but the forms are still present in the collection afterwards.
Added more details to the initial post.