TextState.Invisible is set to true for text in the document

Hi,

I am trying to create a GoToR link for a document. Even though the selected area contains text, TextState.Invisible is still returned as true.

Sample Code:
Aspose.Pdf.License license = new Aspose.Pdf.License();
license.SetLicense("Aspose.Pdf.lic");
Aspose.Pdf.Document targetDoc = null;
targetDoc = new Aspose.Pdf.Document(PartialPath + "sublocade-rems-document-clean (2).pdf");
page = targetDoc.Pages[1];
double llx = 42.8738;
double lly = page.MediaBox.Height - 178.456 - 48 - 2;
double urx = llx + 77.3759 + 2;
double ury = lly + 23.161 + 2;
AnnotRect = new Aspose.Pdf.Rectangle(llx, lly, urx, ury);
TextFragmentAbsorber absorber = new TextFragmentAbsorber();
absorber.TextSearchOptions.Rectangle = seeRect;
page.Accept(absorber);
bool VisibleText = false;
for (int j = 1; j <= absorber.TextFragments.Count; j++)
{
TextFragment textFragment = absorber.TextFragments[j];
if (textFragment.TextState.Invisible == false && textFragment.Text.Trim().Length > 0)
{
VisibleText = true;
break;
}
}
LinkAnnotation link = new LinkAnnotation(page, AnnotRect);
if (VisibleText)
{
Border border = new Border(link);
border.Width = 0;
link.Border = border;
}
else
{
Border border = new Border(link);
System.Drawing.Color color1 = System.Drawing.ColorTranslator.FromHtml(obj.rectcolor);
link.Color = Aspose.Pdf.Color.FromRgb(color1);
border.Style = BorderStyle.Solid;
link.Border = border;
}

XYZExplicitDestination mag = new XYZExplicitDestination(page, -32768, -32768, 1);
link.Action = new GoToRemoteAction($"{PartialPath}_link.pdf", mag);

(link.Action as GoToRemoteAction).NewWindow = Aspose.Pdf.ExtendedBoolean.True;

page.Annotations.Add(link);
targetDoc.Save(PartialPath + "sublocade-rems-document-clean (2).pdf");

In the document sublocade-rems-document-clean (2).pdf (156.6 KB), textFragment.TextState.Invisible is returned as true for some of the text, and because of that our logic does not work. Please see the screenshot image-2023-06-20-16-40-30-267.png (275.2 KB).

Please help us to resolve this.

@prnksheela

Can you please share the value of the seeRect variable in your code snippet above?

You can replace seeRect with AnnotRect

@prnksheela

Can you please share which API version you are using? We tested your code snippet with version 23.8 of the API and noticed that the API did not extract or find any text at the given coordinates:

In the above code, the text fragment count is 0.

We are using version 22.8. But the document does contain text there (please refer to the attached screenshot), so why is the Aspose TextAbsorber treating it as non-text?

@prnksheela

We are afraid that we cannot provide support for such an old version of the API. The older version may have issues that cause it to find text at the wrong coordinates. The latest version does not find any text at the same coordinates. Please try version 23.8 and share your feedback if you face any issues.

I have tried Aspose.PDF for .NET version 23.9, and I am facing the same issue.

I do not understand why the selected coordinates are treated as non-text even though they contain text. Could you please tell us why Aspose considers the places highlighted in the screenshot I attached previously to be non-text?

@prnksheela

Did you use the same coordinates with the same PDF that you shared with us in your first post?

Yes, I tried the same coordinates, with a slight change in the formula as below:

double llx = 42.8738;
double lly = page.MediaBox.Height - 178.456 - 23.161 - 2;
double urx = 42.8738 + 77.3759 + 2;
double ury = 178.456 + 23.161 + 2;

In my first post I had given 48 where 23.161 should have been used; sorry for the inconvenience.

The coordinates below also give the same issue:

llx = 42.8738
lly = 182.019
height = 19.5985
width = 82.7197
borderWidth = 2

llx = 212.305
lly = 33.111
height = 13.8011
width = 274.401
borderWidth = 2

Please apply these values in the formula below,

double llx = llx;
double lly = page.MediaBox.Height - (lly) - (height) - borderWidth;
double urx = llx + (width) + borderWidth;
double ury = lly + (height) + borderWidth;

and pass the result as the rectangle value.
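The formula above can be sketched as a small helper, which makes it easier to apply to both sets of values. This is only an illustration: the RectMath class, the ToPdfRect method and its parameter names are hypothetical and are not part of the Aspose API, and the page height has to be supplied by the caller (for example from page.MediaBox.Height).

```csharp
using System;

public static class RectMath
{
    // Hypothetical helper, not an Aspose API.
    // Takes a box measured from the top-left of the page (x, yFromTop, width, height)
    // and returns PDF-style corners (llx, lly, urx, ury), which are measured from the
    // bottom-left, following the formula given above.
    public static (double llx, double lly, double urx, double ury) ToPdfRect(
        double x, double yFromTop, double width, double height,
        double borderWidth, double pageHeight)
    {
        double llx = x;
        // Flip the vertical axis: subtract the top offset, the box height and the border.
        double lly = pageHeight - yFromTop - height - borderWidth;
        double urx = llx + width + borderWidth;
        double ury = lly + height + borderWidth;
        return (llx, lly, urx, ury);
    }
}
```

For the first set of values this would be called as RectMath.ToPdfRect(42.8738, 182.019, 82.7197, 19.5985, 2, page.MediaBox.Height), and the four results passed to new Aspose.Pdf.Rectangle(llx, lly, urx, ury).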

@prnksheela

Using the above values, we were able to find the text in the PDF, and the link was also added using version 23.9 of the API. Please check the attached sample PDFs generated in our environment.

sublocade-rems-document-clean (2).pdf (155.2 KB)
addedhiglightusingsamecoords.pdf (155.8 KB)

Could you please try adding the link on the 3rd page of the PDF, as shown in the attached screenshot?

page = targetDoc.Pages[3];

image-2023-06-20-16-40-30-267.png (275.2 KB)

@prnksheela

Attached is the output that we obtained when adding the link on the 3rd page of this PDF: sublocade-rems-document-clean (2)_out.pdf (154.9 KB)

Hi, I am not able to find any links in the output file. Could you please check whether the link was created on the 3rd page, where I am facing the issue?

@prnksheela

Please check the attached screenshot, which shows where the link was created in the output PDF shared earlier: image.png (129.8 KB)

Hi,

I can see that the selected region is different from the one in the image that I shared.

Could you please try the rectangle {36.7904, 589.6775, 124.0198, 613.698}?

The logic below is not needed for this rectangle value:

double llx = 42.8738;
double lly = page.MediaBox.Height - 178.456 - 23.161 - 2;
double urx = 42.8738 + 77.3759 + 2;
double ury = 178.456 + 23.161 + 2;

@prnksheela

With these values, can you please share a screenshot of the PDF showing where you want to render the link?

image.png (118.1 KB)

Please find the attached image, with the location marked for the dimensions mentioned earlier in this thread.

@prnksheela

Please check the attached output PDF file that we obtained using the new values that you shared. A screenshot of the added link, which is similar to your expected output, is also attached:

image.png (24.0 KB)

sublocade-rems-document-clean (2)_out.pdf (154.9 KB)

What is the value of textFragment.TextState.Invisible for this rectangle?

Is it true or false?

I am getting true.

But there is text there, so the value of textFragment.TextState.Invisible should be false.

Why is it giving true?

@prnksheela

Can you please share which API version you are using?