Why is my OCR so bad?

jon_elster_i3intel_com · August 15, 2020, 4:05pm

I'm trying to OCR a PDF that was converted to PNG at 600.
The OCR is picking up garbage, even though I can read the PNG fine!

It returns this text:

signed To: 764 Tme Assigned: 17.16 ETA: Aved: 17: 42 Edr 3 EAT! s
sigd tot is sini E E
neuuuneu uuJse(vuuupueuv 1vfei/lnuunuyjuue 1 Mfue$
ein
Motificatinns
Superdisor Safety Mardia
Structure Fire:
CO:
Other:
(Note: Provlde name and time notlfled)

asad.ali · August 16, 2020, 7:29pm

@jon_elster_i3intel_com

Would you kindly share your sample image with us? You can attach it to your post using the upload button. We will test the scenario in our environment and address it accordingly.

jon_elster_i3intel_com · August 16, 2020, 9:37pm

I tried the free tesseractdotnet (on the Google Code Archive).

And it worked great!

Aspose could not read it.
Any ideas?

Debug Capture.PNG (594.1 KB)

Here's the original image: image9_out.jpg (1.1 MB)

I tried the same with

jon_elster_i3intel_com · August 17, 2020, 5:23pm

Hi
Any updates?
Tesseract was able to read all the text.
Thanks again.

asad.ali · August 17, 2020, 6:30pm

@jon_elster_i3intel_com

We have logged an issue as OCRNET-251 in our issue tracking system for incorrect text extraction from the image. We will investigate it in detail and keep you posted on the status of the fix. Please be patient and allow us some time.

We are sorry for the inconvenience.
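
For anyone comparing two OCR engines on the same image while the issue is open, a character error rate can put a rough number on "garbage" versus "readable". The following is only a minimal sketch in plain Python and is not part of Aspose.OCR or Tesseract. The function names are hypothetical, and the "expected" strings are assumptions: they are guesses at what the form actually says, inferred from the garbled output in the first post.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # distance is symmetric; keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def char_error_rate(ocr_text: str, expected: str) -> float:
    """Edit distance divided by the length of the expected text."""
    if not expected:
        raise ValueError("expected text must not be empty")
    return levenshtein(ocr_text, expected) / len(expected)


# Left: OCR output copied from the first post.
# Right: ASSUMED ground truth (a guess, not confirmed by the poster).
samples = [
    ("Motificatinns", "Notifications"),
    ("Superdisor", "Supervisor"),
    ("(Note: Provlde name and time notlfled)",
     "(Note: Provide name and time notified)"),
]

for ocr_text, expected in samples:
    rate = char_error_rate(ocr_text, expected)
    print(f"{ocr_text!r}: {rate:.1%} character error rate")
```

Running the same comparison on each engine's output for the same lines gives a like-for-like figure that can be attached to a report such as OCRNET-251.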