Unable to read text from all the images in a PDF

abdulkadirsabirbohari · August 21, 2023, 2:01pm

AsposeOcr extractTextFromImage = new AsposeOcr();
string textExtractedFromImage = extractTextFromImage.RecognizeImageFast(imagePath);

I am using the code above to extract text from images.

Error in Console:

[ErrorCode:RuntimeException] Non-zero status code returned while running ConvInteger node. Name:'Conv_0_quant' Status Message: bad allocation

asad.ali · August 21, 2023, 8:45pm

@abdulkadirsabirbohari

Does this happen with every image you process? Could you please share a sample image for our reference? We will test the scenario in our environment and address it accordingly.

abdulkadirsabirbohari · August 22, 2023, 4:57am

testImageExtration1x.pdf (301.9 KB)

I have shared the PDF. I want to extract both the text in the PDF and the text in its images.

asad.ali · August 22, 2023, 11:40am

@abdulkadirsabirbohari

We are afraid that Aspose.OCR does not currently support extracting text from a PDF that has mixed content, i.e. images and text. However, an analysis ticket, OCRNET-701, has already been logged in our issue tracking system to investigate its feasibility. The ticket has been linked to this forum thread so that you will receive a notification once it is resolved. Please be patient and give us some time.

We are sorry for the inconvenience.

abdulkadirsabirbohari · August 24, 2023, 12:50pm

I used PdfExtractor to extract the images from the PDF:
pdfExtractor = new PdfExtractor(pdfDocument);

and then extract the text image by image, appending each result to a StringBuilder:

pdfExtractor.GetNextImage(imagePath);

AsposeOcr extractTextFromImage = new AsposeOcr();
string textExtractedFromImage = extractTextFromImage.RecognizeImageFast(imagePath);

Do you have a solution for this particular error? Please look into this as soon as possible.

asad.ali · August 25, 2023, 12:23am

@abdulkadirsabirbohari

Could you please share which particular error you are facing? Is it throwing an exception?