I have been having problems with PDF files that contain unreadable pages, which in turn cause problems further down my pipeline.
Is there any way to check whether a page in a PDF is corrupted or unreadable?
Something along the lines of:
foreach (Page page in pdfDoc.pages)
{
    page.IsReadable();
}
@maria_burns_stralfor
Thanks for contacting support.
It is quite possible that the unreadable pages in your PDF file contain images rather than text, which would explain why the text cannot be extracted from the PDF using the API. You can check whether a PDF file contains images or text by using the code snippet in the following article of the API documentation.
If your scenario is different and you still face an issue while reading the PDF file, please share your sample source PDF document with us so that we can test the scenario in our environment and address it accordingly.