Reading text and highlighting text by coordinates

Hi,

I am trying to highlight text and retrieve it using x, y coordinates. The highlight works, but I cannot retrieve the text.

var document = new Aspose.Pdf.Document(pdfStream);
var contentEditor = new Aspose.Pdf.Facades.PdfContentEditor();
contentEditor.BindPdf(document);
contentEditor.CreateMarkup(new System.Drawing.Rectangle(520, 714, 40, 10), "", 0, 1, System.Drawing.Color.Yellow);
var absorber = new Aspose.Pdf.Text.TextAbsorber();
absorber.TextSearchOptions.LimitToPageBounds = true;
absorber.TextSearchOptions.Rectangle = new Aspose.Pdf.Rectangle(520, 714, 560, 715);
document.Pages[1].Accept(absorber);
string invoiceNumber = absorber.Text;

What am I doing wrong?

Thanks

@nalin.wickramasinghe

Thanks for contacting support.

Would you please share your sample PDF document with us so that we can test the scenario in our environment and address it accordingly?

invoice_sample.pdf (96.2 KB)

@nalin.wickramasinghe

Thanks for sharing the source PDF document.

We were able to replicate the error in our environment while testing the scenario with Aspose.PDF for .NET 18.3. We have therefore logged it as PDFNET-44460 in our issue tracking system. We will investigate the cause of this issue further and keep you informed of the status of its correction. Please be patient and spare us a little time.

We are sorry for the inconvenience.

@nalin.wickramasinghe

We have investigated the issue and found an error in the source code:

new System.Drawing.Rectangle(520, 714, 40, 10) - correct rectangle, 40x10 (arguments are x, y, width, height)
new Aspose.Pdf.Rectangle(520, 714, 560, 715) - incorrect rectangle, 40x1 (arguments are the lower-left and upper-right corners, so the height is only 1 pt)
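
To make the difference between the two argument conventions concrete, here is a minimal sketch of the conversion from (x, y, width, height) to corner coordinates. The RectConversion class and its ToCorners helper are hypothetical names used only for illustration; they are not part of Aspose.PDF. The sketch assumes, as in the values above, that the y value is the lower edge of the area.

```csharp
using System;

public static class RectConversion
{
    // Hypothetical helper (not an Aspose API): converts an
    // (x, y, width, height) rectangle into the corner form
    // (lower-left x, lower-left y, upper-right x, upper-right y).
    public static (double Llx, double Lly, double Urx, double Ury) ToCorners(
        double x, double y, double width, double height)
    {
        return (x, y, x + width, y + height);
    }

    public static void Main()
    {
        // Same values as the highlight rectangle in the question.
        var c = ToCorners(520, 714, 40, 10);
        Console.WriteLine($"{c.Llx}, {c.Lly}, {c.Urx}, {c.Ury}");

        // The four values could then be passed to the search area, e.g.:
        // absorber.TextSearchOptions.Rectangle =
        //     new Aspose.Pdf.Rectangle(c.Llx, c.Lly, c.Urx, c.Ury);
    }
}
```

For the rectangle in the question this gives an upper-right corner of (560, 724), not (560, 715).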

We used the following code for testing:

var document = new Aspose.Pdf.Document(dataDir + "invoice_sample.pdf");
var contentEditor = new Aspose.Pdf.Facades.PdfContentEditor();
contentEditor.BindPdf(document);
contentEditor.CreateMarkup(new System.Drawing.Rectangle(520, 714, 40, 10), "", 0, 1, System.Drawing.Color.Yellow);
var absorber = new Aspose.Pdf.Text.TextAbsorber();
absorber.TextSearchOptions.LimitToPageBounds = true;
absorber.TextSearchOptions.Rectangle = new Aspose.Pdf.Rectangle(520, 714, 560, 724);
document.Pages[1].Accept(absorber);
string invoiceNumber = absorber.Text;
Console.WriteLine(absorber.Text);
document.Save(dataDir + "invoice_sample_out.pdf");

It finds the expected ‘13473’ text.