Free Support Forum - aspose.com

Extract Footers/Headers from pdf file

Hi,


I can't find an option in your API to extract headers and footers from a PDF file.
I have seen that this was implemented (PDFNEWNET-31086),
but I have not seen any example of how it can be done.
I am using Aspose.Pdf for .NET version 9.5.0.0.

Thanks.
Alex B

Hi Alex,


Thanks for your inquiry. Please note that PdfContentEditor was improved to return the text of footer/header stamps: a Text property was added to StampInfo. For example, the following code prints the text of all stamps on the first page of the document.


PdfContentEditor pce = new PdfContentEditor();
pce.BindPdf("input.pdf");
// Get the stamps on page 1
StampInfo[] infos = pce.GetStamps(1);
foreach (StampInfo si in infos)
{
    Console.WriteLine(si.Text);
}

Note: If you need to get the text of a header/footer added with Adobe Acrobat (not stamps added by Aspose software), you should use the Page.Artifacts property to read the header and footer artifacts on the page.

foreach (Artifact artifact in doc.Pages[1].Artifacts)
{
    if (artifact.Subtype == Artifact.ArtifactSubtype.Header || artifact.Subtype == Artifact.ArtifactSubtype.Footer)
        Console.WriteLine(artifact.Text);
}

Please feel free to contact us for any further assistance.


Best Regards,

Hi,


Thank you for your answer. I need to get the text from the headers and footers of different PDF files that were created in ways I don't know; they may also come from different applications.
I tried your code on a PDF file that was created by Word, and I get a NullReferenceException when I try to get:
pdfDocument.Pages[1].Artifacts
My code:
Document pdfDocument = new Document(@"doc2.pdf");
foreach (Artifact artifact in pdfDocument.Pages[1].Artifacts)
{
    if (artifact.Subtype == Artifact.ArtifactSubtype.Header || artifact.Subtype == Artifact.ArtifactSubtype.Footer)
    {
        Console.WriteLine(artifact.Text);
    }
}
I have attached a link to the tested file: https://nfil.es/rV4sfA/

Thanks,
Alex B

Hi Alex,


I have tested the scenario and am able to reproduce the problem. I have logged it in our issue tracking system as PDFNEWNET-37367. We will investigate this issue in detail and keep you updated on the status of a fix.

We apologize for the inconvenience.

Hi,

I have tried other documents. The same code did not fail, but I can't manage to get the text from the headers and footers.

Any advice?

Files for test: https://nfil.es/lHSa9d/

Regards,

Alex B

Hi Alex,


Thanks for sharing the details.


I have tested the scenario and managed to reproduce the issue: text is not being extracted from the header/footer sections of the PDF files you shared. I have logged it separately in our issue tracking system as PDFNEWNET-37371. We will investigate this issue in detail and keep you updated on the status of a fix.

We apologize for the inconvenience.

Hi Alex,


Thanks for your patience. We have investigated PDFNEWNET-37371 and found that your document does contain artifacts, but they are not of type "Header". The artifacts have the non-standard type "Pagination". You can check for this type with the CustomType property as follows.

if (artifact.CustomType == "Pagination")
{
    Console.WriteLine(artifact.Text);
}
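
For readers who want both checks from this thread in one place, here is a minimal sketch that combines the Subtype test for Header/Footer with the CustomType test for "Pagination" and runs it over every page. It is only an illustration under some assumptions: "input.pdf" is a placeholder file name, and enumerating doc.Pages and reading page.Number are not shown anywhere in this thread, so please verify them against the Aspose.Pdf for .NET version you have installed.

```csharp
using System;
using Aspose.Pdf;

class HeaderFooterText
{
    static void Main()
    {
        // "input.pdf" is a placeholder path, not a file from this thread.
        Document doc = new Document("input.pdf");

        // Assumption: doc.Pages can be enumerated and each Page exposes Number.
        foreach (Page page in doc.Pages)
        {
            foreach (Artifact artifact in page.Artifacts)
            {
                // Standard header/footer artifacts, as in the earlier reply.
                bool isHeaderOrFooter =
                    artifact.Subtype == Artifact.ArtifactSubtype.Header ||
                    artifact.Subtype == Artifact.ArtifactSubtype.Footer;

                // Some documents mark this content with the non-standard
                // type "Pagination" instead of Header/Footer.
                bool isPagination = artifact.CustomType == "Pagination";

                if (isHeaderOrFooter || isPagination)
                {
                    Console.WriteLine("Page {0}: {1}", page.Number, artifact.Text);
                }
            }
        }
    }
}
```

Checking both conditions means the same loop can be used on files from different producers without knowing in advance how each one tagged its headers and footers.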

Please feel free to contact us for any further assistance.


Best Regards,

The issues you found earlier (filed as PDFNEWNET-37367 and PDFNEWNET-37371) have been fixed in Aspose.Pdf for .NET 9.7.0.

