Check if PDF page is readable

maria_burns_stralfor · December 19, 2017, 10:34am

I have been having problems with PDF files containing unreadable pages, which in turn cause problems later in my pipeline.

Is there any way to check whether a page in a PDF is corrupted/unreadable?

Something along the lines of

foreach(Page page in pdfDoc.pages){
    page.IsReadable();
}

asad.ali · December 19, 2017, 5:50pm

@maria_burns_stralfor

Thanks for contacting support.

It is quite possible that your unreadable PDF file contains images, which is why the text cannot be extracted from it using the API. You can check whether a PDF file contains images or text by using the code snippet in the following article of the API documentation.

Find whether PDF file contains images or text only

If your scenario is different and you still face an issue while reading the PDF file, please share your sample source PDF document with us so that we can test the scenario in our environment and address it accordingly.
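
For anyone wanting the per-page check asked about above, here is a minimal, library-agnostic sketch in C#. It assumes your PDF library offers some way to extract text from a single page, which you pass in as a callback. The names `PageCheck`, `FindUnreadablePages`, and `extractText` are hypothetical and not part of any API. A page is flagged when extraction throws or returns no text, so image-only pages like those described above would be flagged too.

```csharp
using System;
using System.Collections.Generic;

public static class PageCheck
{
    // Returns the 1-based numbers of pages that could not be read.
    // extractText is a hypothetical callback: wire it to whatever
    // per-page text extraction call your PDF library provides.
    public static List<int> FindUnreadablePages(int pageCount, Func<int, string> extractText)
    {
        var unreadable = new List<int>();
        for (int pageNumber = 1; pageNumber <= pageCount; pageNumber++)
        {
            try
            {
                string text = extractText(pageNumber);
                if (string.IsNullOrWhiteSpace(text))
                {
                    // No text came back: blank or image-only page.
                    unreadable.Add(pageNumber);
                }
            }
            catch (Exception)
            {
                // Extraction failed outright: possibly a corrupted page.
                unreadable.Add(pageNumber);
            }
        }
        return unreadable;
    }
}
```

An empty result does not prove the file is healthy; it only means every page yielded some text. Whether an image-only page should count as "unreadable" depends on what the later pipeline stages expect.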