Reading header / footer text

Hi,



I’m using Aspose.Pdf to extract some data from a PDF. Some of it is embedded in the header and footer sections.



When I traverse the Pages collection, the Header and Footer properties of every Page object are null. Granted, the header/footer text appears in the TextFragment collection, but nothing on the fragments indicates whether they belong to the header or the footer section.



Is there another way to extract header / footer text from the pdf?



Sample pdf attached.



Thanks,

Hi Matt,


Thanks for your inquiry. You may use PdfContentEditor to get the text of footer/header stamps. For example, the following code prints the text of all stamps on the first page of the document.


// Requires: using System; using Aspose.Pdf.Facades;
PdfContentEditor pce = new PdfContentEditor();
pce.BindPdf("input.pdf");
// Get all stamps on page 1
StampInfo[] infos = pce.GetStamps(1);
foreach (StampInfo si in infos)
{
    Console.WriteLine(si.Text);
}

Note: If you need the text of a header/footer added with Adobe Acrobat (rather than stamps added by Aspose software), use the Page.Artifacts property to read the header and footer artifacts on the page.


// Requires: using System; using Aspose.Pdf;
// doc is an Aspose.Pdf.Document opened on the input file
foreach (Artifact artifact in doc.Pages[1].Artifacts)
{
    if (artifact.Subtype == Artifact.ArtifactSubtype.Header || artifact.Subtype == Artifact.ArtifactSubtype.Footer)
        Console.WriteLine(artifact.Text);
}
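If you need the header and footer text from every page rather than only the first, the artifact loop can be extended along the following lines. This is only a rough sketch: it assumes the Aspose.Pdf assembly is referenced, "input.pdf" is a placeholder file name, and the class name and output format are made up for illustration. Please check it against the version of the library you are using.

```csharp
using System;
using Aspose.Pdf;

class HeaderFooterDump // hypothetical class name, for illustration only
{
    static void Main()
    {
        // "input.pdf" is a placeholder; point this at your own file.
        Document doc = new Document("input.pdf");

        foreach (Page page in doc.Pages)
        {
            foreach (Artifact artifact in page.Artifacts)
            {
                // Only header and footer artifacts are of interest here;
                // other subtypes (for example watermarks) are skipped.
                bool isHeader = artifact.Subtype == Artifact.ArtifactSubtype.Header;
                bool isFooter = artifact.Subtype == Artifact.ArtifactSubtype.Footer;
                if (!isHeader && !isFooter)
                    continue;

                // Label each line so the caller can tell header from footer.
                string kind = isHeader ? "Header" : "Footer";
                Console.WriteLine("Page " + page.Number + " " + kind + ": " + artifact.Text);
            }
        }
    }
}
```

Labelling each line with the page number and the artifact subtype gives the header/footer distinction that the plain TextFragment collection does not provide. Note that this only finds text that was marked as a header or footer artifact when the PDF was created.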

Please feel free to contact us for any further assistance.

Best Regards,