Editing an existing PDF

Hello,

I am trying to open existing PDF files and, based on the header, either leave each page alone or extract it to a separate file. I have been reviewing some of the code samples, and as far as I can tell I would have to extract every page from the original PDF and then extract the text from every page. I could then check the data in the new txt file to decide whether to keep the page in the original file or extract it into a separate file. Finally, I would link up the file names and concatenate the separated pages to re-form the whole PDF.

Is this the only way to do this, or is there an easier way to perform this task, such as somehow reading the text one line at a time?

Thanks!

Hi

Thank you for considering Aspose.

I think the following code will make your job easier.

// Extracts each page's text into its own txt file
// (Aspose.Pdf.Kit1.txt, Aspose.Pdf.Kit2.txt, ...).
PdfExtractor extractor = new PdfExtractor();
extractor.BindPdf(TestPath + @"Aspose.Pdf.Kit.Pdf");
extractor.ExtractText();
String prefix = TestPath + @"Aspose.Pdf.Kit";
String suffix = ".txt";
int pageCount = 1; // 1-based number of the page being written
while (extractor.HasNextPageText())
{
    // Writes the text of the next page to prefix + pageCount + suffix.
    extractor.GetNextPageText(prefix + pageCount + suffix);
    pageCount++;
}
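
Once the per-page txt files exist, deciding which pages to pull out is plain file handling. The following is only a rough sketch and not part of the Aspose API: the PageClassifier class, its method names and the "INVOICE" rule are made up for illustration, and it assumes the header comes out as the first non-blank line of each page's extracted text, which may not hold for every PDF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of Aspose.Pdf.Kit): inspects the per-page
// txt files written by the loop above and picks the pages to extract.
static class PageClassifier
{
    // Returns the first non-blank line of a page's text file, trimmed,
    // or "" if the file has no such line. Assumes that line is the header.
    public static string ReadHeader(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            if (!string.IsNullOrWhiteSpace(line))
                return line.Trim();
        }
        return "";
    }

    // Placeholder rule: replace with whatever identifies the pages
    // that should go into a separate file.
    public static bool ShouldExtract(string header)
    {
        return header.StartsWith("INVOICE", StringComparison.OrdinalIgnoreCase);
    }

    // Walks prefix1suffix, prefix2suffix, ... until a file is missing and
    // returns the 1-based numbers of the pages whose header matches the rule.
    public static List<int> PagesToExtract(string prefix, string suffix)
    {
        var pages = new List<int>();
        for (int page = 1; File.Exists(prefix + page + suffix); page++)
        {
            if (ShouldExtract(ReadHeader(prefix + page + suffix)))
                pages.Add(page);
        }
        return pages;
    }
}
```

Calling PageClassifier.PagesToExtract(prefix, suffix) with the same prefix and suffix as above gives the list of page numbers to separate, which you can then hand to whatever page-extraction routine you use, so only the matching pages need to be split out of the original PDF.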

Thanks