Please suggest a way to improve the accuracy of extracted text from the attached document.
The pre-processing filters did not help.
a.jpg (696.9 KB) Scan1.jpg (173.6 KB)
We could not find any file attached to your post. Would you please share it so that we can test the scenario in our environment and address it accordingly?
Please find the attachment now. I did not have permission to attach files previously and did not realize that. Thank you.
We noticed that the API returned garbage values when performing OCR on the images you shared. We have logged an issue as OCR-774 in our issue tracking system. We will look further into the details and keep you posted on the status of its correction. Please allow us some time.
We are sorry for the inconvenience.
With Aspose.OCR 23.7.0, we got better results.
The code we used:
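// Create the OCR engine instance (it was not declared in the original snippet)
AsposeOcr api = new AsposeOcr();
// Add the image to recognize
OcrInput input = new OcrInput(InputType.SingleImage);
input.Add(@"a.jpg");
// Recognize using the PHOTO area-detection mode
var result = api.Recognize(input, new RecognitionSettings
{
    DetectAreasMode = DetectAreasMode.PHOTO
});
// Print the recognized text and save it to a text file
Console.WriteLine(result[0].RecognitionText);
AsposeOcr.SaveMultipageDocument("D://a.txt", SaveFormat.Text, result);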
results.zip (1.1 KB)
@asad.ali Hello, may I ask how to accurately extract the information in an image using Java? If you need more information, please let me know. Thanks.
Input File 2.jpg (142.7 KB)
This is the code I am using:
AsposeOCR api = new AsposeOCR();
RecognitionSettings recognitionSettings = new RecognitionSettings();
recognitionSettings.setLinesFiltration(true);
recognitionSettings.setAllowedCharacters(CharactersAllowedType.ALL);
recognitionSettings.setLanguage(Language.Chi);
RecognitionResult result = api.RecognizePage("D:\\java\\work\\test\\src\\main\\resources\\2.png", recognitionSettings);
System.out.println("Recognition result:\n" + result.recognitionText + "\n\n");
Can you please explain a bit more about your question? Are you unable to extract all text from this image? What type of image information do you need to get using the API?
I am unable to obtain all of the information in the image.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): OCRJAVA-333
You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.
Unfortunately, we can’t recognize the numbers in this image, only the Chinese characters.
RecognitionSettings set = new RecognitionSettings();
set.setLanguage(Language.Chi);
set.setDetectAreasMode(DetectAreasMode.PHOTO);
It’s our text detector. It catches the numbers, but even if we rotate the image, we can’t recognize them.
18T
(
副
型
警
/
。
l
322
.32
厂
"
9(
Q
90
:一一
忍
/
一
一
1
)
旦
i
出
含
"T
目
影
士
乡
密
樊
渺
)
H
l
单件
由由
岩
生
世P
a
排
—。
Y
目鹏
;
亡
夕
鸟
憾耸
肯
"
■
门S
叶
但又台
3
乡x
早
沙
32
]乡
32
80
山微乡
8棉
/
做
仍
4
C
1
2km
y
地名
国道
公路
河
流
6
%入
心)
~公
口
一
平
爸跳
完钻井
完钻井
设计井
设计井
期)
(上部)
(加密)
(加密)
(上部)
187[50
We plan to add the ability to recognize mixed vertical and horizontal text in an image. However, this particular image seems very hard to recognize, and we do not think we will get a good result in any case.
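While mixed-orientation support is pending, one thing to experiment with is rotating the image (or a cropped region containing the vertical text) by 90 degrees before passing it to the recognizer. The reply above notes that rotation did not help for this particular image, so treat this only as a minimal sketch for other inputs. RotateForOcr and rotate90Clockwise are hypothetical names and are not part of Aspose.OCR; the code uses only the Java standard library.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of the Aspose.OCR API.
class RotateForOcr {

    // Rotates an image 90 degrees clockwise.
    // Source pixel (x, y) moves to (height - 1 - y, x) in the result.
    static BufferedImage rotate90Clockwise(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        // The rotated image swaps width and height.
        BufferedImage dst = new BufferedImage(h, w, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                dst.setRGB(h - 1 - y, x, src.getRGB(x, y));
            }
        }
        return dst;
    }

    // Usage: java RotateForOcr <input image> <output.png>
    public static void main(String[] args) throws Exception {
        BufferedImage src = ImageIO.read(new File(args[0]));
        if (src == null) {
            throw new IllegalArgumentException("Unsupported or unreadable image: " + args[0]);
        }
        ImageIO.write(rotate90Clockwise(src), "png", new File(args[1]));
    }
}
```

The rotated file can then be passed to the same RecognizePage call shown earlier in the thread, and the two recognition results compared by hand.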