Aspose.OCR: extracting text from non-searchable PDFs not working properly

Hello support team

We want to convert “non-searchable PDFs” into “searchable PDFs” with the help of Aspose.OCR (see Sample_Doc.pdf in the attachment).

In our solution, the PDF comes out with black content after the conversion with Aspose.OCR. The content of the PDF is completely destroyed and no longer usable for the customer (see not_wanted_result.pdf in the attachment).

I have checked the same PDF with the Aspose Cloud solution at “OCR Online. Convert PDF to Searchable PDF”, and only part of the content is recognized (see Aspose_Cloud_result.jpg in the attachment).

I have also tested the same PDF with the sample application from “GitHub - aspose-ocr/Aspose.OCR-for-.NET: Aspose.OCR for .NET examples, plugins and showcase projects” and get no text at all, although the conversion completes successfully and shows no error (see Aspose_App_Result.jpg in the attachment).

Are you aware of such a case?
What do we have to do so that, if the content cannot be read and converted, we can catch the error or the empty result, avoid passing on a corrupt document, and simply leave the original as it is?
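
The fallback described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Aspose's recommended approach: `OcrFallback`, `HasUsableText` and `ConvertOrKeepOriginal` are hypothetical names, and the OCR and save steps are passed in as delegates so the sketch does not depend on any particular Aspose.OCR call. In a real integration the page texts would presumably come from each `RecognitionResult` (for example its `RecognitionText` property).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of Aspose.OCR.
public static class OcrFallback
{
    // True when at least one page yielded non-whitespace text.
    public static bool HasUsableText(IEnumerable<string> pageTexts) =>
        pageTexts != null && pageTexts.Any(t => !string.IsNullOrWhiteSpace(t));

    // Returns the path of the file that should be handed on:
    // the searchable PDF if OCR produced text and the save succeeded,
    // otherwise the untouched original.
    public static string ConvertOrKeepOriginal(
        string originalPdf,
        string searchablePdf,
        Func<string, IEnumerable<string>> recognizePages, // assumed: wraps the OCR call
        Action<string> saveSearchablePdf)                 // assumed: wraps the save call
    {
        try
        {
            List<string> texts = recognizePages(originalPdf)?.ToList();
            if (!HasUsableText(texts))
            {
                return originalPdf; // empty OCR result: keep the original
            }

            saveSearchablePdf(searchablePdf);
            return File.Exists(searchablePdf) ? searchablePdf : originalPdf;
        }
        catch (Exception)
        {
            // OCR or saving failed: discard any partial output, keep the original.
            if (File.Exists(searchablePdf))
            {
                File.Delete(searchablePdf);
            }
            return originalPdf;
        }
    }
}
```

Note that this only guards against exceptions and empty recognition results. It would not detect a rendering problem such as the black pages in not_wanted_result.pdf, where the conversion reports success.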

Customer_Attachments.zip (1.0 MB)

Thanks in advance for a possible solution or answer
Best regards

@hasanirmak

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): OCRNET-827

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

Hello Support Team
Thank you for opening the ticket.
Has anything changed in the meantime or have you been able to reproduce the problem?

Best Regards

@hasanirmak

The issue is resolved when using the code below with version 24.4. The results are also attached:

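// Assumption: api and filter were not defined in the original snippet;
// an empty PreprocessingFilter is used here as a placeholder.
AsposeOcr api = new AsposeOcr();
PreprocessingFilter filter = new PreprocessingFilter();

OcrInput input = new OcrInput(InputType.PDF, filter);
string imgPath = @"D:\imgs\ISSUES\NET827\Sample_Doc.pdf";
input.Add(imgPath);

// Recognize the PDF pages using table-oriented area detection.
List<RecognitionResult> result = api.Recognize(input, new RecognitionSettings
{
    DetectAreasMode = DetectAreasMode.TABLE
});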

result.zip (610.4 KB)