How to distinguish text in paragraphs from text in a shape?

I have a document that contains two paragraphs with the same text content: “C5df: sss”.
One of the paragraphs is inside a shape.
c1diff.docx (14.8 KB)

How do I determine which paragraph is inside the shape?

I tried:

if (paragraph.GetAncestor(NodeType.Shape) == null)
{
//do something
}

But this does not tell the two paragraphs apart.

@quanghieumylo, to detect whether the paragraph is inside a shape, you can check whether the parent node of that paragraph is of NodeType.Shape:

Document document = new Document("c1diff.docx");
foreach (Paragraph para in document.GetChildNodes(NodeType.Paragraph, true)) {
    string text = para.GetText();

    if ((text.Contains("C5df: sss") && (para.ParentNode.NodeType == NodeType.Shape))) {
        Console.WriteLine($"{text.Trim()} is inside Shape");
    } else {
        Console.WriteLine(text.Trim());
    }
}

Hey, man, are you sure about that? I can’t use your code: my program gives an error. I use Aspose.Words for .NET v23.5.0, and it has no API like the one in your code.
I also tried:
paragraph.ParentNode.ParentNode.NodeType, which returns “row”
paragraph.ParentNode.NodeType, which returns “cell”

P.S.:
I discovered that for each inline shape, Aspose.Words reports two paragraphs:

  • one paragraph that contains the shape
  • a second paragraph that is inside the shape

With the code you sent (modified to paragraph.ParentNode.NodeType):

  • the paragraph that contains the shape -> returns “cell”
  • the paragraph that is inside the shape -> returns “shape”

That means the paragraph that contains the shape looks just like a paragraph outside the shape, and I still can’t tell them apart.

@denis.shvydkiy, please help me!

@quanghieumylo, sorry, I posted the code for Aspose.Words for Java. It has been replaced with the .NET version.

Paragraph.GetText returns the text of the paragraph together with the text of all its child nodes, including any shape it contains. That’s why you get the same text twice.

So a solution based on GetText is not the best one. Could you please elaborate on what you are trying to achieve? I will try to find another solution.

Thank you for listening and helping me.
I want to find all paragraphs that contain “C5df: sss”, but only consider paragraphs that are directly in table.rows[1].cells[1].

For example, paragraphs inside a shape in table.rows[1].cells[1] should not be considered.
For the file that I submitted, I need to detect that in table.rows[1].cells[1] only paragraph 1 qualifies; paragraphs 2 and 3 do not (because they are inside a shape).

@quanghieumylo, could you check if this code produces the desired results:

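using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Replacing;
using Aspose.Words.Tables;

string pattern = "C5df: sss";

Document document = new Document("c1diff.docx");

// Run a find-only pass: the callback records matches and skips every replacement.
SearchOnlyCallback searchCallback = new SearchOnlyCallback();
FindReplaceOptions searchOptions = new FindReplaceOptions
{
    ReplacingCallback = searchCallback
};

document.Range.Replace(pattern, "", searchOptions);

// Print the matching paragraphs that sit directly in a table cell.
foreach (Paragraph para in searchCallback.Occurrences)
    Console.WriteLine(para.GetText());

internal class SearchOnlyCallback : IReplacingCallback
{
    public ReplaceAction Replacing(ReplacingArgs args)
    {
        // The node that contains the beginning of the match.
        Node node = args.MatchNode;

        // The paragraph that directly owns the matched node, if any.
        Paragraph? parentParagraph = node.ParentNode.NodeType == NodeType.Paragraph ? (Paragraph)node.ParentNode : null;
        // The cell that directly owns that paragraph; null when the paragraph is inside a shape.
        Cell? parentCell = parentParagraph != null && parentParagraph.ParentNode.NodeType == NodeType.Cell ? (Cell)parentParagraph.ParentNode : null;

        if (parentCell != null)
            mOccurrences.Add((Paragraph)node.ParentNode);

        // Leave the document unchanged.
        return ReplaceAction.Skip;
    }

    public List<Paragraph> Occurrences {
        get { return mOccurrences; }
    }

    private List<Paragraph> mOccurrences = new List<Paragraph>();
}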

You did it with the whole document:

document.Range.Replace(pattern, "", searchOptions);

Can I do the same with a method that returns true or false for each paragraph?
E.g.:

        private bool check(Paragraph paragraph)
        {
            //code
            return true; // or false
        }

@quanghieumylo, after the search is done and the paragraphs that contain “C5df: sss” directly in a cell have been collected, the program can iterate over all paragraphs, and the check function can simply test whether a paragraph is present in searchCallback.Occurrences:

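using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Replacing;
using Aspose.Words.Tables;

string pattern = "C5df: sss";

Document document = new Document("c1diff.docx");

// Run a find-only pass: the callback records matches and skips every replacement.
SearchOnlyCallback searchCallback = new SearchOnlyCallback();
FindReplaceOptions searchOptions = new FindReplaceOptions
{
    ReplacingCallback = searchCallback
};

document.Range.Replace(pattern, "", searchOptions);

// Report the check result for every paragraph in the document, including those inside shapes.
int i = 0;
foreach (Paragraph para in document.GetChildNodes(NodeType.Paragraph, true))
{
    Console.WriteLine($"para: {i}, parent: {para.ParentNode}, check result: {Check(para)}");
    i++;
}

// True only for paragraphs recorded by the callback.
bool Check(Paragraph para)
{
    return searchCallback.Occurrences.Contains(para);
}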


internal class SearchOnlyCallback : IReplacingCallback
{
    public ReplaceAction Replacing(ReplacingArgs args)
    {
        Node node = args.MatchNode;

        Paragraph? parentParagraph = node.ParentNode.NodeType == NodeType.Paragraph? (Paragraph)node.ParentNode : null;
        Cell? parentCell = parentParagraph != null && parentParagraph.ParentNode.NodeType == NodeType.Cell ? (Cell)parentParagraph.ParentNode : null;

        if (parentCell != null)
            mOccurrences.Add((Paragraph)node.ParentNode);

        return ReplaceAction.Skip;
    }

    public List<Paragraph> Occurrences {
        get { return mOccurrences; }
    }

    private List<Paragraph> mOccurrences = new List<Paragraph>();
}
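
If a check that works on one paragraph at a time, without a prior document-wide search, is preferred, a rough alternative is to build the text from the paragraph’s immediate Run children only, so that text nested inside a shape is ignored. This is only a sketch: it assumes the searched text is stored in plain runs directly under the paragraph (not inside fields or other inline containers), and ParagraphChecks and IsDirectMatchInCell are hypothetical names, not part of Aspose.Words.

```csharp
using System.Text;
using Aspose.Words;
using Aspose.Words.Tables;

internal static class ParagraphChecks
{
    // Hypothetical helper: true when the paragraph sits directly in targetCell
    // and its own runs (not text nested in shapes) contain the pattern.
    public static bool IsDirectMatchInCell(Paragraph para, Cell targetCell, string pattern)
    {
        // A paragraph inside a shape has the shape as its parent, so it is rejected here.
        if (!ReferenceEquals(para.ParentNode, targetCell))
            return false;

        // isDeep = false: only immediate child runs are visited,
        // so runs that belong to a nested shape are skipped.
        StringBuilder ownText = new StringBuilder();
        foreach (Run run in para.GetChildNodes(NodeType.Run, false))
            ownText.Append(run.Text);

        return ownText.ToString().Contains(pattern);
    }
}
```

The target cell could be obtained with something like `Table table = (Table)document.GetChild(NodeType.Table, 0, true); Cell cell = table.Rows[1].Cells[1];`. Which table to use is an assumption here, and note that the Rows and Cells indices are zero-based.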

Wow great. Thank you very much :smiling_face_with_three_hearts: