Free Support Forum - aspose.com

Search for occurrences of a text in pages, highlight them, and convert the pages into single HTML pages

When I search for a particular text in a PDF, only the first four occurrences on each page are highlighted; the remaining occurrences of the same text are not. I used the code below. Please help me figure out how to highlight every occurrence of the text in the PDF.
Also, when I use a text segment to get the text and then convert it to HTML, the text is not displayed.

        // Requires: using System; using System.Text.RegularExpressions;
        //           using Aspose.Pdf; using Aspose.Pdf.Text;
        Document doc = new Document("s1.pdf");

        // Input string
        string c = "a";
        // Add \s* so the match tolerates spaces or line breaks
        string formattedLine = Regex.Replace(c, @"\s*", " ").Replace(" ", @"\s*");

        HtmlSaveOptions htmlOptions = new HtmlSaveOptions();
        htmlOptions.PartsEmbeddingMode = Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
        htmlOptions.LettersPositioningMethod = Aspose.Pdf.HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
        htmlOptions.SplitCssIntoPages = false;
        htmlOptions.RasterImagesSavingMode = Aspose.Pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
        htmlOptions.RemoveEmptyAreasOnTopAndBottom = true;
        htmlOptions.FontSavingMode = Aspose.Pdf.HtmlSaveOptions.FontSavingModes.SaveInAllFormats;

        foreach (Page page in doc.Pages)
        {
            // "(?i)" makes the search case-insensitive; passing true enables regex matching
            TextFragmentAbsorber tfa = new TextFragmentAbsorber("(?i)" + formattedLine, new TextSearchOptions(true));
            // Added: run the search on this page; TextFragments is only populated after Accept
            page.Accept(tfa);
            TextFragmentCollection tfc = tfa.TextFragments;
            if (tfc.Count > 0)
            {
                foreach (TextFragment frag in tfc)
                {
                    frag.TextState.ForegroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Black);
                    frag.TextState.BackgroundColor = Aspose.Pdf.Color.FromRgb(System.Drawing.Color.Yellow);
                }

                // Added: copy the highlighted page into the new document before saving it as HTML
                Document newDocument = new Document();
                newDocument.Pages.Add(page);
                newDocument.Save(page.Number + ".html", htmlOptions);
            }
            else
            {
                Console.WriteLine("Not Found");
            }
        }
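
A note on the search pattern used above: one alternative way to build a whitespace-tolerant pattern is to escape each word and join the words with `\s+`. This is only a sketch and is not taken from the thread; the `SearchPattern` class and its `Build` method are hypothetical names, and only standard .NET calls are used. Unlike the original expression, it allows spaces or line breaks between words only, not between every character.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of Aspose.PDF): builds a regex pattern
// that matches the phrase even when its words are separated by
// several spaces or by line breaks.
public static class SearchPattern
{
    public static string Build(string phrase)
    {
        // Split on any whitespace and drop empty entries.
        string[] words = phrase.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        // Escape regex metacharacters in each word, then allow one or more
        // whitespace characters between consecutive words.
        return string.Join(@"\s+", words.Select(Regex.Escape));
    }
}
```

With this helper, `"(?i)" + SearchPattern.Build("Hello   world")` yields `(?i)Hello\s+world`, which could be passed to `TextFragmentAbsorber` together with `new TextSearchOptions(true)` as in the code above.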




Would you please make sure that you are using a valid license or a free 30-day temporary license while highlighting the text? If you still face the issue, please share your sample source PDF file with us. We will test the scenario in our environment and address it accordingly.
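
For reference, applying a license before any other Aspose.PDF call might look like the sketch below. The file name `Aspose.PDF.lic` and the `LicenseSetup` class are placeholders, not values from this thread. Aspose's documented evaluation limitations restrict collections to four elements, which would be consistent with only four matches per page being highlighted, although the thread does not confirm that this was the cause here.

```csharp
using System;

class LicenseSetup
{
    static void Main()
    {
        // Apply the license once, before creating any Document.
        // "Aspose.PDF.lic" is a placeholder path to your own license file.
        var license = new Aspose.Pdf.License();
        license.SetLicense("Aspose.PDF.lic");

        Console.WriteLine("License applied; continue with search and highlighting.");
    }
}
```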

The text is highlighted, but the problem is that when I use a text segment to highlight the text and then convert to HTML, the highlighted text disappears and only the color is visible. What is the cause of this problem?


We need to investigate the issue at our end in order to determine the cause behind this behavior of the API. Could you please share a sample PDF document for our reference so that we can test the scenario in our environment and address it accordingly?

s1.pdf (356.9 KB)
What I want is to search for text in a PDF, highlight it, and then convert each page that contains the search text into its own HTML page. When I use a text segment to add the color annotation, the text is not visible in the HTML; only the highlight color is present.


We tested with Aspose.PDF for .NET 21.10 and did not notice any issue. Please check the attached output HTML files generated by your code snippet in our environment:

htmlfiles.zip (3.5 MB)

Please try to use the latest version of the API and let us know in case you still face any issues.