Find and Replace Strikethrough and Underline

Hello,

Our company is working on a project where we need to read a Word document and detect the following:

  1. Find all text fragments in paragraphs that contain a strikethrough
  2. Find all text fragments in paragraphs that contain an underline

When the text within a paragraph contains a strikethrough, we need to remove the text completely.
When the text within a paragraph contains an underline, we just need to remove the underline and keep the text.

The result will then be saved to another Word document. Here is an example of what we are trying to achieve. The original document will look something like this. Note that this editor has no option for strikethrough or underline, so I used the markers [Strikethrough] and [Underline] to show where each one starts and ends.

(1) [Strikethrough] Applications for licensure must meet the prerequisites [Strikethrough] for and pass the Foundations of Oriental Medicine, Acupuncture with Point Location, and Biomedicine examinations required for certification in acupuncture

(2) [Underline] Applicants for licensure must pass the examination in clean needle technique administered by the Council of Colleges for Acupuncture and Oriental Herbal Medicine, or its successor. [Underline]

The end result would look something like this:

(1) for and pass the Foundations of Oriental Medicine, Acupuncture with Point Location, and Biomedicine examinations required for certification in acupuncture

(2) Applicants for licensure must pass the examination in clean needle technique administered by the Council of Colleges for Acupuncture and Oriental Herbal Medicine, or its successor.
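
In plain-text terms, the transformation can be sketched as below. This is only an illustration that works on the bracket markers used in this post, not on real Word formatting; the class name MetatagCleaner and the method clean are hypothetical and are not part of any library.

```java
import java.util.regex.Pattern;

// Hypothetical helper: works on the [Strikethrough]/[Underline] markers from this post,
// not on an actual Word document.
final class MetatagCleaner {
    // A pair of [Strikethrough] markers together with everything between them.
    private static final Pattern STRUCK =
            Pattern.compile("\\[Strikethrough\\].*?\\[Strikethrough\\]", Pattern.DOTALL);
    // A single [Underline] marker; the text between the markers is kept.
    private static final Pattern UNDERLINE_MARKER = Pattern.compile("\\[Underline\\]");

    static String clean(String text) {
        // Struck-through text is dropped completely.
        String withoutStruck = STRUCK.matcher(text).replaceAll("");
        // Underlined text stays; only the markers go away.
        String withoutMarkers = UNDERLINE_MARKER.matcher(withoutStruck).replaceAll("");
        // Collapse the extra whitespace that the removed markers leave behind.
        return withoutMarkers.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(clean(
                "(1) [Strikethrough] Applications for licensure must meet the prerequisites "
                + "[Strikethrough] for and pass the examinations"));
        // prints "(1) for and pass the examinations"
        System.out.println(clean(
                "(2) [Underline] Applicants must pass the examination. [Underline]"));
        // prints "(2) Applicants must pass the examination."
    }
}
```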

I tried doing this with the following code, but I am not sure if there is a more sophisticated way to detect both styles or font effects and apply the changes as needed. Any help would be appreciated:

Document doc = new Document("Strikethrough.docx");
Document cloneDoc = (Document)doc.deepClone(true);
doc.joinRunsWithSameFormatting();

for (Run run : (Iterable<Run>)doc.getChildNodes(NodeType.RUN, true))
{

    if (run.getFont().getUnderline() == Underline.SINGLE)
    {
        System.out.println("UNDERLINE =>" + run.getText());
        FindReplaceOptions options = new FindReplaceOptions();
        options.getApplyFont().setUnderline(Underline.NONE);
        cloneDoc.getRange().replace(run.getText(), run.getText(), options);
    }

    if (run.getFont().getStrikeThrough())
    {
        System.out.println("STRIKETHROUGH =>" + run.getText());
        cloneDoc.getRange().replace(run.getText(), "");
    }
}
cloneDoc.save("output.docx");

@mmetterle Your code is correct, but Find/Replace is not required here. You can change the properties of the detected Run nodes, or remove the nodes, directly. Please see the following code:

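import com.aspose.words.Document;
import com.aspose.words.Node;
import com.aspose.words.NodeType;
import com.aspose.words.Run;
import com.aspose.words.Underline;

Document doc = new Document("C:\\Temp\\in.docx");
// Merge adjacent runs with identical formatting so each fragment is one Run.
doc.joinRunsWithSameFormatting();

// Iterate over a snapshot (toArray) so removing runs does not disturb the loop.
for (Node node : doc.getChildNodes(NodeType.RUN, true).toArray())
{
    Run run = (Run)node;

    // Any underline style counts, not only Underline.SINGLE: keep the text, drop the underline.
    if (run.getFont().getUnderline() != Underline.NONE)
    {
        System.out.println("UNDERLINE =>" + run.getText());
        run.getFont().setUnderline(Underline.NONE);
    }

    // Struck-through text is removed from the document entirely.
    if (run.getFont().getStrikeThrough())
    {
        System.out.println("STRIKETHROUGH =>" + run.getText());
        run.remove();
    }
}
doc.save("C:\\Temp\\out.docx");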

Awesome, I appreciate the very quick feedback on this, Alexey! I’ll try that. I assume we can do the same for detecting highlighting and changing the highlight color? I’ll experiment a bit as well.

@mmetterle Sure, you can use code like this to detect highlighted text:

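// Goes inside the loop over runs shown earlier; Color is java.awt.Color.
if (run.getFont().getHighlightColor().getRGB() != 0)
{
    System.out.println("HIGHLIGHTED =>" + run.getText());
    // Change the highlight to opaque red.
    run.getFont().setHighlightColor(new Color(255, 0, 0, 255));
}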
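
The getRGB() != 0 test above can be tried on its own with plain java.awt.Color values. The sketch below assumes that a run without highlighting reports a fully transparent color (all channels zero), which is what the check implies; HighlightCheck and isHighlighted are hypothetical names used only for this illustration.

```java
import java.awt.Color;

// Hypothetical helper mirroring the getRGB() != 0 check on plain java.awt.Color values.
final class HighlightCheck {
    // getRGB() packs alpha, red, green and blue into one int, so it is 0 only
    // when every channel, including alpha, is 0.
    static boolean isHighlighted(Color highlight) {
        return highlight.getRGB() != 0;
    }

    public static void main(String[] args) {
        Color none = new Color(0, 0, 0, 0);    // assumed "no highlight" value
        Color red = new Color(255, 0, 0, 255); // opaque red, as set in the snippet above
        System.out.println(isHighlighted(none));        // prints "false"
        System.out.println(isHighlighted(red));         // prints "true"
        System.out.println(isHighlighted(Color.BLACK)); // prints "true" (alpha is 255)
    }
}
```

Note that opaque black still counts as highlighted, because its alpha channel is not zero.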

Awesome, thank you so much! Our company is evaluating products to use for this purpose and so far we have been trying to do the same thing with docx4j. I am not entirely sure if that library is capable of doing the same thing easily, but so far this product is looking promising.

@mmetterle Please feel free to ask if you have any other questions. We will be glad to assist you.

I appreciate that Alexey! I will definitely do that. Thanks again for everything!
