I get the error "Exception of type 'System.OutOfMemoryException' was thrown." when I try to process a 9 MB PDF.
I also tested with a smaller PDF (147 KB), and it works for that file.
The code is below. (For simple testing I created a regular expression that matches the dd/dd/dd string pattern in the PDF text.)
//Instantiate PdfExtractor object
PdfExtractor extractor = new PdfExtractor();
//Set Password for input PDF file
extractor.Password = "";
//Bind the input PDF document to extractor
extractor.BindPdf("C:\\pdftest\\WebApplication1\\" + "Test.pdf");
//Extract text from the input PDF document
extractor.ExtractText();
string path = "C:\\pdftest\\WebApplication1\\" + "Test.txt";
FileStream f = new FileStream(path, FileMode.Create);
extractor.GetText(f);
//Rewind the stream so the extracted text can be read back
f.Seek(0, SeekOrigin.Begin);
StreamReader reader = new StreamReader(f);
string mainReportText = reader.ReadToEnd();
mainReportText = mainReportText.Trim();
f.Close();
string pattern = @"((\d{2})/(\d{2})/(\d{2}))";
//[-+]((0[0-9]|1[0-3]):([03]0|45)|14:00)
//@"^(\d{2}/)(\d{2}/)(\d{2}/)$";
//@"^((4\d{3})|(5[1-5]\d{2})|(6011))-?\d{4}-?\d{4}-?\d{4}|3[4,7][\d\s-]{15}$";
Regex match = new Regex(pattern);
//return match.IsMatch(num);
Match m = match.Match(mainReportText);
Response.Write(m.Value);
//Save the extracted text to a text file
//extractor.GetText("C:\\pdftest\\WebApplication1\\" + "prod_eob.txt");
PdfContentEditor editor = new PdfContentEditor();
editor.BindPdf("C:\\pdftest\\WebApplication1\\" + "Test.pdf");
editor.ReplaceText(m.Value, "xx/xx/xx");
editor.Save("C:\\pdftest\\WebApplication1\\" + "replace2.pdf");
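Note that `match.Match(...)` above returns only the first date in the text. The regex step can be checked on its own, without any PDF library calls. The following is a minimal sketch under that assumption; the `DateFinder` class and `FindDates` method are hypothetical names made up for illustration and are not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (illustration only): collects every dd/dd/dd string in a text.
public static class DateFinder
{
    // Same pattern as in the code above, without the extra capture groups.
    private static readonly Regex DatePattern = new Regex(@"\d{2}/\d{2}/\d{2}");

    public static List<string> FindDates(string text)
    {
        List<string> dates = new List<string>();
        // Matches() enumerates all occurrences, whereas Match() stops at the first one.
        foreach (Match m in DatePattern.Matches(text))
        {
            dates.Add(m.Value);
        }
        return dates;
    }
}
```

For example, `DateFinder.FindDates("Paid 01/02/13, due 04/05/13")` should return the two strings `01/02/13` and `04/05/13`.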
Also, the code below does not work. It fails with the error "file .pdf is being used by another process". This is because I am trying to replace multiple texts in a foreach loop.
PdfContentEditor editor = new PdfContentEditor();
editor.BindPdf("C:\\pdftest\\WebApplication1\\" + "new1.pdf");
foreach (Match mat in match3.Matches(mainReportText))
{
    editor.ReplaceText(mat.Value.Substring(mat.Value.IndexOf('-') + 1, 4), "XXXX");
}
But if I replace a single text, as below, it works:
PdfContentEditor editor = new PdfContentEditor();
editor.BindPdf("C:\\pdftest\\WebApplication1\\" + "new1.pdf");
Match mat = match3.Match(mainReportText);
editor.ReplaceText(mat.Value.Substring(mat.Value.IndexOf('-') + 1, 4), "XXXX");
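Separately from the file-lock error, the loop version calls `ReplaceText` once per match, including repeated values. A minimal sketch of collecting the distinct 4-character fragments first, using only `System.Text.RegularExpressions`, is shown here. The regex passed in stands in for `match3`, whose pattern is not shown above, and `FragmentCollector` and `DistinctFragments` are hypothetical names used only for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (illustration only): gathers each distinct fragment once.
public static class FragmentCollector
{
    public static List<string> DistinctFragments(Regex regex, string text)
    {
        HashSet<string> seen = new HashSet<string>();
        List<string> fragments = new List<string>();
        foreach (Match mat in regex.Matches(text))
        {
            // Same slicing as in the loop above: 4 characters after the first '-'.
            int start = mat.Value.IndexOf('-') + 1;
            // Skip matches that are too short; Substring would throw on them.
            if (start + 4 > mat.Value.Length) continue;
            string fragment = mat.Value.Substring(start, 4);
            // HashSet.Add returns false for a value that was already seen.
            if (seen.Add(fragment)) fragments.Add(fragment);
        }
        return fragments;
    }
}
```

With an assumed pattern such as `\d{4}-\d{4}` and the text `"1234-5678 and 9999-5678"`, this should return a single fragment, `5678`. Each fragment could then be passed to `editor.ReplaceText` once.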