Highlight and List All the Highlighted Words in PowerPoint Presentations in C#

Hi Team,

How can we highlight text in a PPT file using a regex and also add comments? We also need to prepare a summary showing which page (slide) each word was found on and how many times.
The raw inputs are as follows:

There is no option to attach a PPT file, so I am sharing the content in a .docx file.

Ppt-Slides-Sample-Text.docx (13.9 KB)

Code Sample:

public List<HighlightedWordsClass> HighlightWordsInPpt(string licensePath, string pptPath)
{
    List<HighlightedWordsClass> objReturn = new List<HighlightedWordsClass>();
    License license = new License();
    license.SetLicense(licensePath);
    Presentation presentation = new Presentation(pptPath);

    List<WordListClass> listOfWords = new List<WordListClass>();
    listOfWords.Add(new WordListClass("Lorem", "Lorem is the word and this is comment for it.", true, true));
    listOfWords.Add(new WordListClass("porttitor", "porttitor is the word and this is comment for it.", false, true));
    listOfWords.Add(new WordListClass("volutpat", "volutpat is the word and this is comment for it.", false, true));
    listOfWords.Add(new WordListClass("felis", "felis is the word and this is comment for it.", false, true));

    return objReturn;
}


public class WordListClass
{
    public WordListClass(string word, string comment, bool isCase, bool isMatch)
    {
        Word = word;
        Comments = comment;
        IsCaseSensitive = isCase;
        IsMatchWholeWorld = isMatch;
    }
    public string Word { get; set; }
    public string Comments { get; set; }
    public bool IsCaseSensitive { get; set; }
    public bool IsMatchWholeWorld { get; set; }
}
public class HighlightedWordsClass
{
    public string Word { get; set; }
    public List<FindingClass> findings { get; set; }
}
public class FindingClass
{
    public int PageNumber { get; set; }
    public int Occurences { get; set; }
}

Thank you in Advance!

@Jaibir,
Thank you for posting your requirements.

With Aspose.Slides for .NET, you can highlight text using regular expressions, but unfortunately, I have not found a way to list the highlighted words.

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): SLIDESNET-44473

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.


Hi @andrey.potapov

Instead of highlighting text in a loop over each slide, is there an option to highlight across all slides in one go?
For example, we have the following code for highlighting in Word:

Doc.Range.Replace(regexString, "$0", option1);

And the following code for highlighting in PDF:

TextFragmentAbsorber tfa = new TextFragmentAbsorber(regexSt, textSearchOptions);
doc.Pages.Accept(tfa);

Is there a similar method for .ppt files?

Thanks in Advance!

@Jaibir,
Thank you for the details. I’ve forwarded them to our developers.


Hi @andrey.potapov
Is there any update on ticket SLIDESNET-44473?

@Jaibir,
As far as I can see, our developers are working on the issue. Unfortunately, I don’t have any additional information yet.

@Jaibir,
New interfaces and methods will be added to Aspose.Slides for .NET 24.6:

/// <summary>
/// Represents options which can be used to find text in Presentation, Slide or TextFrame.
/// </summary>
public interface ITextSearchOptions
{
        /// <summary>
        /// Set true to use case-sensitive search, false - otherwise.
        /// </summary>
        bool CaseSensitive { get; set; }

        /// <summary>
        /// Set true to match only whole words, false - otherwise.
        /// </summary>
        bool WholeWordsOnly { get; set; }
}

and

/// <summary>
///  Callback interface used to get text search results.
/// </summary>
public interface IFindResultCallback
{
        /// <summary>
        /// Callback method which receives data about found text.
        /// </summary>
        void FoundResult(ITextFrame textFrame, string sourceText, string foundText, int textPosition);
}

Below is example code that shows how to get information about every highlighted occurrence of the text:

using (Presentation pres = new Presentation(pptxFileName))
{
    // Create callback.
    FindResultCallback callback = new FindResultCallback();

    // Create search options.
    TextSearchOptions options = new TextSearchOptions();

    // Highlight all words "secteftuer" and report each match to the callback.
    pres.HighlightText("secteftuer", Color.Yellow, options, callback);

    // Output the number of found fragments of the given text. 
    Console.WriteLine(callback.Count);

    // Output data for each word "secteftuer" found. 
    foreach (WordInfo info in callback.Words)
    {
        Console.WriteLine("{0} at position {1}: {2}", info.FoundText, info.TextPosition, info.Context);
    }

    // Get all the data about the matches on the first slide that contains the text.
    WordInfo[] elements = callback.GetElemensForSlide(callback.SlideNumbers[0]);

    // Output the number of found fragments of the given text on that slide.
    Console.WriteLine(elements.Length);
}

/// <summary>
/// Class that provides information about all found occurrences of a given text.
/// </summary>
private class FindResultCallback : IFindResultCallback
{
    // List of retrieved text information.
    internal List<WordInfo> Words = new List<WordInfo>();

    /// <summary>
    /// The number of matches found to a given text.
    /// </summary>
    public int Count
    {
        get { return Words.Count; }
    }

    /// <summary>
    /// Gets all slides in which the given text was found.
    /// </summary>
    public int[] SlideNumbers
    {
        get
        {
            List<int> slides = new List<int>();
            foreach (var word in Words)
            {
                if (!slides.Contains(((Slide)word.TextFrame.Slide).SlideNumber))
                    slides.Add(((Slide)word.TextFrame.Slide).SlideNumber);
            }

            return slides.ToArray();
        }
    }

    /// <summary>
    /// Gets all occurrences of the found text on the slide.
    /// </summary>
    /// <param name="slideNumber">Slide number</param>
    public WordInfo[] GetElemensForSlide(int slideNumber)
    {
        List<WordInfo> foundElements = new List<WordInfo>();
        foreach (var element in Words)
            if (((Slide)element.TextFrame.Slide).SlideNumber == slideNumber)
                foundElements.Add(element);

        return foundElements.ToArray();
    }

    /// <summary>
    /// Callback method which receives data about found text.
    /// </summary>
    /// <param name="textFrame"><see cref="ITextFrame"/> where the searched text was found.</param>
    /// <param name="sourceText">Source text of TextFrame where text was found.</param>
    /// <param name="foundText">Found text.</param>
    /// <param name="textPosition">Position of found text in source text.</param>
    public void FoundResult(ITextFrame textFrame, string sourceText, string foundText, int textPosition)
    {
        Words.Add(new WordInfo(textFrame, sourceText, foundText, textPosition));
    }
}

/// <summary>
/// A class that provides information about each given text found in the presentation.
/// </summary>
private class WordInfo
{
    /// <summary>
    /// Constructor.
    /// </summary>
    internal WordInfo(ITextFrame textFrame, string sourceText, string foundText, int textPosition)
    {
        TextFrame = textFrame;
        SourceText = sourceText;
        FoundText = foundText;
        TextPosition = textPosition;
    }

    /// <summary>
    /// Gets found text.
    /// </summary>
    public string FoundText { get; set; }

    /// <summary>
    /// Gets the source text for the TextFrame in which the text was found.
    /// </summary>
    public string SourceText { get; set; }

    /// <summary>
    /// Position of the found text in the text frame.
    /// </summary>
    public int TextPosition { get; set; }

    /// <summary>
    /// The text frame in which the text was found.
    /// </summary>
    public ITextFrame TextFrame { get; set; }

    /// <summary>
    /// The context in which the text was found.
    /// </summary>
    public string Context
    {
        get
        {
            // Show up to 3 characters before the match and up to 20 after it, clamped to the source text.
            int startIndex = Math.Max(0, TextPosition - 3);
            int endIndex = Math.Min(SourceText.Length, TextPosition + FoundText.Length + 20);
            return (startIndex == 0 ? "" : "...") + SourceText.Substring(startIndex, endIndex - startIndex) + (endIndex < SourceText.Length ? "..." : "");
        }
    }
}
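
To get from the callback results to the per-slide summary requested at the start of this thread, one option is to group the matches by slide number and count them. The sketch below is self-contained and does not call Aspose.Slides: it assumes you have already collected the slide number of every match for one word (for example from `WordInfo.TextFrame.Slide`, the way `SlideNumbers` does above). `HighlightSummary` and `CountPerSlide` are hypothetical names used only for illustration; `FindingClass` mirrors the class from the original question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Mirrors FindingClass from the question: one entry per slide.
public class FindingClass
{
    public int PageNumber { get; set; }
    public int Occurences { get; set; }
}

// Hypothetical helper, not part of Aspose.Slides.
public static class HighlightSummary
{
    // Input: the slide number of every match found for a single word.
    // Output: one FindingClass per slide, ordered by slide number.
    public static List<FindingClass> CountPerSlide(IEnumerable<int> slideNumbers)
    {
        return slideNumbers
            .GroupBy(n => n)
            .OrderBy(g => g.Key)
            .Select(g => new FindingClass { PageNumber = g.Key, Occurences = g.Count() })
            .ToList();
    }
}
```

For example, `HighlightSummary.CountPerSlide(new[] { 1, 3, 1 })` yields two entries: slide 1 with 2 occurrences and slide 3 with 1. Running one search per word and storing the result in `HighlightedWordsClass.findings` would give the summary structure from the question.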

Also, new methods HighlightText and HighlightRegex will be added to the IPresentation interface that allow you to perform operations for the whole presentation:

/// <summary>
/// Highlight all matches of sample text in presentation using specified color.
/// </summary>
void HighlightText(string text, Color highlightColor);

/// <summary>
/// Highlight all matches of sample text in presentation using specified color.
/// </summary>
void HighlightText(string text, Color highlightColor, ITextSearchOptions options, IFindResultCallback callback);

/// <summary>
/// Highlight all matches of regular expression in presentation using specified color.
/// </summary>
void HighlightRegex(Regex regex, Color highlightColor, IFindResultCallback callback);
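
Since `HighlightRegex` takes a `Regex` rather than search options, the case-sensitivity and whole-word flags from `WordListClass` in the question would have to be encoded in the pattern itself. The following is a minimal sketch of one way to do that using only `System.Text.RegularExpressions`; `WordPatterns.Build` is a hypothetical helper, not an Aspose.Slides method.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Slides.
public static class WordPatterns
{
    // Builds a Regex that matches a literal word, honoring the two flags
    // carried by WordListClass in the question.
    public static Regex Build(string word, bool caseSensitive, bool wholeWord)
    {
        // Escape the word so characters such as '.' or '+' are matched literally.
        string pattern = Regex.Escape(word);

        // \b requires a word boundary on each side of the match.
        if (wholeWord)
            pattern = @"\b" + pattern + @"\b";

        RegexOptions options = caseSensitive ? RegexOptions.None : RegexOptions.IgnoreCase;
        return new Regex(pattern, options);
    }
}
```

Assuming the signature quoted above ships unchanged, the result could be passed as `pres.HighlightRegex(WordPatterns.Build("Lorem", true, true), Color.Yellow, callback);`.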

The issues you found earlier (filed as SLIDESNET-44473) have been fixed in Aspose.Slides for .NET 24.6 (ZIP, MSI, NuGet, Cross-platform).
You can check all fixes on the Release Notes page.
You can also find the latest version of our library on the Product Download page.