Is it possible to extract text by coordinates from a PDF using OCR, similar to "Extracting text inside a rectangle | Documentation", but with a PDF instead of an image as the source?
We need to investigate these requirements. Can you please share a sample PDF and the output you expect from it? We will log an investigation ticket and share the ID with you.
Meanwhile, below is the code snippet that can be used to achieve your requirements:
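// Requires: using Aspose.OCR; using System.Collections.Generic;
AsposeOcr api = new AsposeOcr();

// imgPath holds the path to the source PDF
OcrInput input = new OcrInput(InputType.PDF);
input.Add(imgPath);

var result = api.Recognize(input, new RecognitionSettings
{
    // Each rectangle is given as (x, y, width, height)
    RecognitionAreas = new List<Aspose.Drawing.Rectangle>
    {
        new Aspose.Drawing.Rectangle(10, 10, 200, 500)
    }
});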
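To read the extracted text back out of the result, a minimal sketch along the following lines may help. It assumes that Recognize returns one RecognitionResult per PDF page and that each result exposes a RecognitionAreasText list with one string per requested rectangle, as described in the Aspose.OCR for .NET documentation; please verify this against the version you have installed. The file name sample.pdf and the class name PdfAreaOcrSketch are placeholders for illustration only.

```csharp
using System;
using System.Collections.Generic;
using Aspose.OCR;

class PdfAreaOcrSketch
{
    static void Main(string[] args)
    {
        // Placeholder path: pass your own PDF as the first argument.
        string pdfPath = args.Length > 0 ? args[0] : "sample.pdf";

        AsposeOcr api = new AsposeOcr();

        // Tell the engine that the input is a PDF rather than a single image.
        OcrInput input = new OcrInput(InputType.PDF);
        input.Add(pdfPath);

        var settings = new RecognitionSettings
        {
            // Same rectangle as in the snippet above: (x, y, width, height).
            RecognitionAreas = new List<Aspose.Drawing.Rectangle>
            {
                new Aspose.Drawing.Rectangle(10, 10, 200, 500)
            }
        };

        // Assumption: the returned collection holds one RecognitionResult per page.
        int pageNumber = 1;
        foreach (RecognitionResult page in api.Recognize(input, settings))
        {
            // Assumption: RecognitionAreasText has one entry per requested area.
            foreach (string areaText in page.RecognitionAreasText)
            {
                Console.WriteLine($"Page {pageNumber}: {areaText}");
            }
            pageNumber++;
        }
    }
}
```

The same rectangle is applied to every page in the input, so if the region differs from page to page you would likely need to process those pages separately.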
Hi,
It is not about a particular PDF. We just wanted to know, in general, whether it is possible to extract text from a PDF for a certain rectangular region. Your answer above is enough; we were able to extract text from the document.
Thanks.
It is nice to know that you were able to extract the text. In case you need further assistance, please feel free to create a new topic.