Highlight text in PDF files: image-over-text format

Good afternoon. I'm using your API to highlight text. I'm having problems when the PDF has an image layered over the text. The text is in a hidden layer (or so I think). I can find the text, but when I change the background color I can't see the text. I'm sending two code snippets that I have tried and a PDF file that lets you view the result:

----------------------------------------------------------------------------------------------------------

Protected Sub HighLightPDF(ByVal strBusqueda As String, _
                           ByVal pathFicheroOriginal As String, _
                           ByRef pathFicheroSalida As String)
    'open document
    Dim pdfDocument As New Document(pathFicheroOriginal)
    'create TextAbsorber object to find all instances of the input search phrase
    Dim textFragmentAbsorber As New Aspose.Pdf.Text.TextFragmentAbsorber(strBusqueda)
    'accept the absorber for all the pages
    pdfDocument.Pages.Accept(textFragmentAbsorber)
    'get the extracted text fragments
    Dim textFragmentCollection As Aspose.Pdf.Text.TextFragmentCollection = textFragmentAbsorber.TextFragments
    'loop through the fragments
    For Each textFragment As Aspose.Pdf.Text.TextFragment In textFragmentCollection
        Dim textStamp As New TextStamp(strBusqueda)
        'set whether stamp is background
        textStamp.Background = False
        textStamp.Opacity = 100
        'set origin
        textStamp.XIndent = textFragment.Position.XIndent
        textStamp.YIndent = textFragment.Position.YIndent
        pdfDocument.Pages(1).Background = Drawing.Color.Transparent
        textStamp.TextState.Font = textFragment.TextState.Font
        textStamp.TextState.FontSize = textFragment.TextState.FontSize
        textStamp.TextState.FontStyle = FontStyles.Bold
        textStamp.TextState.ForegroundColor = System.Drawing.Color.Black
        textStamp.TextState.BackgroundColor = Drawing.Color.Yellow
        pdfDocument.Pages(1).AddStamp(textStamp)
    Next textFragment
    pdfDocument.Save(pathFicheroSalida)
End Sub

Second try: I can add a stamp over the image, but the size doesn't fit.

----------------------------------------------------------------------------------------------

Protected Sub HighLightPDF(ByVal strBusqueda As String, _
                           ByVal pathFicheroOriginal As String, _
                           ByVal pathFicheroSalida As String)
    'open document
    Dim pdfDocument As New Document(pathFicheroOriginal)
    'create TextAbsorber object to find all instances of the input search phrase
    Dim textFragmentAbsorber As New Aspose.Pdf.Text.TextFragmentAbsorber(strBusqueda)
    'accept the absorber for all the pages
    pdfDocument.Pages.Accept(textFragmentAbsorber)
    'get the extracted text fragments
    Dim textFragmentCollection As Aspose.Pdf.Text.TextFragmentCollection = textFragmentAbsorber.TextFragments
    'loop through the fragments
    For Each textFragment As Aspose.Pdf.Text.TextFragment In textFragmentCollection
        Dim textStamp As New TextStamp(strBusqueda)
        'set whether stamp is background
        textStamp.Background = False
        textStamp.Opacity = 100
        'set origin
        textStamp.XIndent = textFragment.Position.XIndent
        textStamp.YIndent = textFragment.Position.YIndent
        pdfDocument.Pages(1).Background = Drawing.Color.Transparent
        textStamp.TextState.Font = textFragment.TextState.Font
        textStamp.TextState.FontSize = textFragment.TextState.FontSize
        textStamp.TextState.FontStyle = FontStyles.Bold
        textStamp.TextState.ForegroundColor = System.Drawing.Color.Black
        textStamp.TextState.BackgroundColor = Drawing.Color.Yellow
        pdfDocument.Pages(1).AddStamp(textStamp)
    Next textFragment
    pdfDocument.Save(pathFicheroSalida)
End Sub

Hi Alex,

Thank you for considering Aspose.Pdf.

I tried to test your issue but could not check it because the post does not include enough information. Please share which text phrase you are searching for and which field is the hidden one causing the problem (I checked your PDF file and could not determine which field you mean).

Sorry for the inconvenience.

I'm trying to search for the word ALBARAN.

With my first code I get a yellow square without text.

With the second code I put a stamp over the image, but the font doesn't fit.

Thanks a lot

Hi Alex,

Sorry for the delayed response.

I have tested your scenario with the sample code and template file you shared, and I can reproduce the highlighting problem you mentioned. Your issue has been registered in our issue tracking system with issue ID PDFNEWNET-33686. You will be notified via this forum thread of any updates on the reported issue.

Sorry for the inconvenience.

P.S. You shared only one sample code / scenario; the same code is copied twice in your post.

The issues you have found earlier (filed as PDFNEWNET-33686) have been fixed in Aspose.Pdf for .NET 7.4.0.


Sorry, I made a mistake. That code was my attempt to get this working. The code that I can't make work is this (I get a blue square but without text. I think the text is there but I can't see it. Thanks a lot):

Protected Sub GenerarNuevoPDF(ByVal strBusqueda As String, _
                              ByVal pathFicheroOriginal As String, _
                              ByRef pathFicheroSalida As String)
    Dim contentEditor As New Facades.PdfContentEditor()
    contentEditor.BindPdf(pathFicheroOriginal)
    'contentEditor.TextEditOptios = New PdfContentEditor.TextProperties("Courier", True, True)
    Dim text As New Aspose.Pdf.Text.TextState(System.Drawing.Color.White)
    text.BackgroundColor = System.Drawing.Color.DarkBlue ' System.Drawing.Color.FromArgb(120, 255, 255, 0)
    contentEditor.ReplaceTextStrategy.ReplaceScope = Facades.ReplaceTextStrategy.Scope.REPLACE_ALL
    contentEditor.ReplaceText(strBusqueda, strBusqueda, text)
    contentEditor.ReplaceText(strBusqueda.ToUpper(), strBusqueda.ToUpper(), text)
    contentEditor.Save(pathFicheroSalida)
    contentEditor.Close()
End Sub

Hi Alex,

Thank you for the sample code and template file.
I was able to reproduce the issue you mentioned in an initial test. Your issue has been registered in our issue tracking system with issue ID PDFNEWNET-34368 for further investigation by our development team. We will inform you via this forum thread of any updates.

Sorry for the inconvenience.

@ahidalgo

Thanks for your patience.

We have investigated the issue PDFNET-34368 (registered earlier in this thread as PDFNEWNET-34368) and found that the source document was produced by scanning and optical character recognition (OCR). It contains the original image and invisible text.
Even though the text is invisible, you can use TextFragmentAbsorber to find it. However, changes to the appearance of invisible text (such as the foreground color) are not rendered until the text is made visible. Previously, making text visible required complicated work with operators, but the TextState.Invisible property is now available in the DOM approach.

Please consider the following code snippet:

// open document
Document pdfDocument = new Document(myDir + "zz.pdf");

TextFragmentAbsorber absorber = new TextFragmentAbsorber("ALBARAN");
pdfDocument.Pages.Accept(absorber);
foreach (TextFragment fragment in absorber.TextFragments)
{
    fragment.TextState.Invisible = false;
    fragment.TextState.BackgroundColor = Aspose.Pdf.Color.DarkBlue;
    fragment.TextState.ForegroundColor = Aspose.Pdf.Color.White;
}

pdfDocument.Save(myDir + "34368_fragment_invisibility&color_changed.pdf");
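
Since the samples posted earlier in this thread are VB.NET, the same approach might be sketched in VB.NET as follows. This is a direct translation of the C# snippet above and has not been run against the sample file. The module name, the Sub name HighlightPhrase, and the parameters inputPath and outputPath are hypothetical, and it assumes a version of Aspose.Pdf for .NET that exposes TextState.Invisible.

```vbnet
Imports Aspose.Pdf
Imports Aspose.Pdf.Text

Module HighlightInvisibleText
    ' Hypothetical helper: names and parameters are illustrative only.
    Sub HighlightPhrase(ByVal phrase As String, _
                        ByVal inputPath As String, _
                        ByVal outputPath As String)
        ' open document
        Dim pdfDocument As New Document(inputPath)

        ' find every occurrence of the phrase, including invisible OCR text
        Dim absorber As New TextFragmentAbsorber(phrase)
        pdfDocument.Pages.Accept(absorber)

        For Each fragment As TextFragment In absorber.TextFragments
            ' make the text visible first; otherwise color changes are not rendered
            fragment.TextState.Invisible = False
            fragment.TextState.BackgroundColor = Aspose.Pdf.Color.DarkBlue
            fragment.TextState.ForegroundColor = Aspose.Pdf.Color.White
        Next

        pdfDocument.Save(outputPath)
    End Sub
End Module
```

A call such as HighlightPhrase("ALBARAN", pathFicheroOriginal, pathFicheroSalida) would correspond to the search described earlier in the thread.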

Please try the latest release, Aspose.Pdf for .NET 17.12, and if you face any issue, please feel free to contact us.