I’m using Aspose.OCR 23.9.0.0 in a .NET Framework 4.8 project. After I run OCR on a PDF that contains an image, searching (Ctrl+F) in Adobe Acrobat finds the text I’m looking for, but the highlight lands on the wrong text; in some cases it is off by a few words. Also, when I copy and paste text from the PDF, the pasted text does not match what I selected. Please see the attached files for an example. What could cause this behavior?
Code snippet:
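// Create the OCR engine.
var asposeOcr = new Aspose.OCR.AsposeOcr();
// Load the source PDF as the OCR input.
OcrInput input = new Aspose.OCR.OcrInput(Aspose.OCR.InputType.PDF);
input.Add(serverFilePath);
// Recognize English text using document-style area detection.
var settings = new Aspose.OCR.RecognitionSettings();
settings.Language = Aspose.OCR.Language.Eng;
settings.DetectAreasMode = DetectAreasMode.DOCUMENT;
// Run recognition; one result is returned per page.
List<Aspose.OCR.RecognitionResult> results = asposeOcr.Recognize(input, settings);
// Save all pages as a single PDF (written to the same path as the input file).
AsposeOcr.SaveMultipageDocument(serverFilePath, Aspose.OCR.SaveFormat.Pdf, results);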
HighlightIssue.jpg (61.3 KB)
OCR.pdf (588.6 KB)
source.pdf (26.0 KB)