Hi,
I am using the evaluation version of the Java PDF development kit 1.9 to extract text from my PDF documents. Most PDFs extract fine, but some cannot be read at all and the extraction never stops.
This is the sample code I use with the PDF development kit:
PdfExtractor extractor = new PdfExtractor();
ByteArrayOutputStream out = new ByteArrayOutputStream();
extractor.bindPdf(source);
// Extract pages 1 to 6 only
extractor.setStartPage(1);
extractor.setEndPage(6);
extractor.extractText();
extractor.getText(out);
String originalContent = out.toString().trim();
// Keep only the characters from index 179 up to countnum
content = originalContent.substring(179, countnum);
// Remove all line breaks (both \r\n and \n)
content = content.replaceAll("\r?\n", "");
System.out.println(content);
Most PDFs extract to text without problems, but for some PDFs the extraction runs for a very long time and the program never terminates. I set a breakpoint and found that the program stays stuck on the statement "extractor.extractText();" indefinitely. I have attached one of the PDF documents that triggers this problem.
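Until the cause of the hang is found, one possible stopgap is to run the extraction on a worker thread and give up after a timeout, so that one bad PDF does not block the whole program. The following is only a rough sketch using the standard java.util.concurrent classes. The class name ExtractWithTimeout and the helper runWithTimeout are hypothetical and not part of the PDF kit, and the lambda in main is a stand-in for the extraction code above. It also assumes nothing about whether the kit reacts to thread interruption, so the worker is made a daemon thread in case it stays stuck.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ExtractWithTimeout {

    /**
     * Hypothetical helper: runs the task on a daemon worker thread and waits
     * at most timeoutSeconds for its result. Returns null on timeout.
     */
    public static String runWithTimeout(Callable<String> task, long timeoutSeconds)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "pdf-extract");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; this only stops it if the running code
            // responds to interruption, which is not guaranteed.
            future.cancel(true);
            return null;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in task: replace the lambda body with the real extraction code.
        String text = runWithTimeout(() -> "extracted text", 30);
        System.out.println(text == null ? "Extraction timed out" : text);
    }
}
```

This does not fix the underlying problem. It only lets the calling program continue and report which PDF timed out.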
Thank you!