Hi Rajeev,
Thanks for your inquiry. You can use the following snippet to check whether a PDF document contains only images (i.e. a scanned PDF). Please note that this is the simplest way of identifying image-only PDFs: the snippet looks for the show text operator on each page and treats the document as image-only if no page contains one. Other detection rules are possible and can be implemented using the DOM (i.e. by analyzing the page contents).
import com.aspose.pdf.*;

boolean HasOnlyImages(String filename)
{
    Document doc = new Document(filename);
    OperatorSelector os;
    // Page numbering starts at 1.
    for (int pageCount = 1; pageCount <= doc.getPages().size(); pageCount++)
    {
        Page page = doc.getPages().get_Item(pageCount);
        // Collect every show text operator found in the page contents.
        os = new OperatorSelector(new Operator.ShowText());
        page.getContents().accept(os);
        // Any show text operator means the page is not image only.
        if (os.getSelected().size() != 0)
            return false;
    }
    return true;
}
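To illustrate what the rule above looks for at the PDF level, here is a minimal, library-independent sketch. It assumes you already have a page's content stream as decoded text, and it scans for the text-showing operators defined by the PDF specification (Tj, TJ, ' and "). The class and method names (ContentStreamTextCheck, containsShowText) are hypothetical and are not part of the Aspose API; the sketch also ignores inline image data, so treat it as an illustration rather than a replacement for the snippet above.

```java
import java.util.Set;

final class ContentStreamTextCheck {
    // Text-showing operators from the PDF specification.
    private static final Set<String> SHOW_TEXT_OPS = Set.of("Tj", "TJ", "'", "\"");

    // Returns true if the decoded content stream contains a text-showing operator.
    // Assumption: 'content' is already decompressed; inline images are not handled.
    static boolean containsShowText(String content) {
        int i = 0;
        int n = content.length();
        while (i < n) {
            char c = content.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '%') {
                // Comment: skip to the end of the line.
                while (i < n && content.charAt(i) != '\n' && content.charAt(i) != '\r') i++;
            } else if (c == '(') {
                // Literal string operand: its contents are never operators.
                i = skipLiteralString(content, i);
            } else if (c == '<') {
                if (i + 1 < n && content.charAt(i + 1) == '<') {
                    i += 2; // dictionary opener, not a hex string
                } else {
                    while (i < n && content.charAt(i) != '>') i++; // hex string
                    i++;
                }
            } else if (c == '/') {
                // Name operand such as /F1: skip it so that /TJ is not mistaken for TJ.
                i++;
                while (i < n && !isBoundary(content.charAt(i))) i++;
            } else if (isDelimiter(c)) {
                i++;
            } else {
                // Regular token: a number or an operator keyword.
                int start = i;
                while (i < n && !isBoundary(content.charAt(i))) i++;
                if (SHOW_TEXT_OPS.contains(content.substring(start, i))) return true;
            }
        }
        return false;
    }

    // Skips a literal string starting at 'open', honoring nesting and backslash escapes.
    private static int skipLiteralString(String s, int open) {
        int depth = 0;
        int i = open;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\') { i += 2; continue; }
            if (c == '(') depth++;
            else if (c == ')' && --depth == 0) return i + 1;
            i++;
        }
        return s.length();
    }

    private static boolean isDelimiter(char c) {
        return "()<>[]{}/%".indexOf(c) >= 0;
    }

    private static boolean isBoundary(char c) {
        return Character.isWhitespace(c) || isDelimiter(c);
    }
}
```

For example, a stream such as "BT /F1 12 Tf 72 712 Td (Hello) Tj ET" contains a show text operator, while a typical scanned page stream such as "q 612 0 0 792 0 0 cm /Im0 Do Q" only paints an image and contains none.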
Please feel free to contact us for any further assistance.
Best Regards,