Hello,
I am trying to open existing PDF files and, based on the header, either leave each page alone or extract it to a separate file. I have been reviewing some of the code samples, and as far as I can tell I would have to extract every page from the original PDF and then extract the text from every page. I could then check the data in the new txt file to decide whether to keep the page in the original file or extract it into a separate file. Finally, I would match up the file names and concatenate the separated pages to rebuild the whole PDF.
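To make the idea concrete, here is a rough Python sketch of just the routing step. It assumes the text of each page is already available as a string (however the PDF library provides it) and that the "header" is the first non-blank line of the page. The names page_header and route_pages, the sample page strings, and the "INVOICE" rule are all made up for illustration and do not come from any particular library.

```python
def page_header(page_text):
    """Return the first non-blank line of a page's text, or '' if there is none."""
    for line in page_text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def route_pages(page_texts, should_extract):
    """Split page indexes into (keep, extract) based on each page's header.

    page_texts: list of strings, one per page (assumed to be extracted already).
    should_extract: hypothetical rule, a function taking the header and
    returning True when the page belongs in a separate file.
    """
    keep, extract = [], []
    for index, text in enumerate(page_texts):
        if should_extract(page_header(text)):
            extract.append(index)
        else:
            keep.append(index)
    return keep, extract


# Made-up page text standing in for what a PDF library would return.
pages = ["INVOICE 001\nline a", "\nREPORT\nline b", "INVOICE 002\nline c"]
keep, extract = route_pages(pages, lambda header: header.startswith("INVOICE"))
print("keep:", keep, "extract:", extract)
```

If the library can return text per page directly, the decision could be made from these page indexes alone, without writing each page out to its own file first.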
Is this the only way to do this, or is there an easier way to perform this task? For example, is it possible to read the text one line at a time?
Thanks!