Check Consistent Usage of Comma, Colon, and Semicolon Punctuation in Word DOCX Documents & Bulleted List Items using C# .NET

I have a Word document, and I need to verify that punctuation (spaces, commas, colons, semicolons) is used consistently throughout the document, especially in bulleted text and list items.

@knr,

You can determine the number of times a punctuation character appears in the entire Word document and inside list items by using the following C# code:

using System;
using Aspose.Words;

Document doc = new Document(@"E:\Temp\LOREM IPSUM.docx");

// Punctuation characters to count.
char[] punctuations = new char[] { ' ', ':', ',', ';' };

// Count occurrences of each character in the plain text of the whole document.
string textOfDocument = doc.ToString(SaveFormat.Text);
foreach (char ch in punctuations)
{
    string[] splits = textOfDocument.Split(ch);
    Console.WriteLine(ch + " --> " + (splits.Length - 1));
}

Console.WriteLine("-------------------------");
// Only the first character (space) is counted here; loop over punctuations to count the others.
char aChar = punctuations[0];
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    if (para.IsListItem)
    {
        string textOfListItem = para.ToString(SaveFormat.Text);
        // Print a short preview; guard against list items shorter than 10 characters.
        Console.WriteLine(textOfListItem.Substring(0, Math.Min(10, textOfListItem.Length)));
        string[] splits = textOfListItem.Split(aChar);
        Console.WriteLine(aChar + " --> " + (splits.Length - 1));
    }
}
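
Counts alone do not show whether list items are punctuated consistently. One possible reading of "consistent" is that every list item ends with the same character. The following is a minimal sketch under that assumption; the class and method names (ListItemEndings, TallyEndings) are hypothetical, and the method works on plain strings, so you would feed it the para.ToString(SaveFormat.Text) value of each list item:

```csharp
using System;
using System.Collections.Generic;

public static class ListItemEndings
{
    // Tallies the final non-whitespace character of each list item's text.
    // Items that are empty after trimming are counted under '\0'.
    public static Dictionary<char, int> TallyEndings(IEnumerable<string> listItemTexts)
    {
        var tally = new Dictionary<char, int>();
        foreach (string text in listItemTexts)
        {
            // Drop the trailing paragraph break and any trailing spaces.
            string trimmed = text.TrimEnd();
            char last = trimmed.Length > 0 ? trimmed[trimmed.Length - 1] : '\0';
            tally[last] = tally.TryGetValue(last, out int n) ? n + 1 : 1;
        }
        return tally;
    }
}
```

If the returned dictionary has more than one key, the list items do not all end the same way, and the counts show which ending is the odd one out.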

Sample test document: Lorem ipsum.zip (19.8 KB)

Thank you for providing the information.

But I need to verify whether the paragraphs are formatted properly.

Please see the paragraph below. In it, every dot (.) or comma (,) is followed by a space. I need to check, across the entire document, whether a space follows each dot (.) or comma (,).

Ex:
Specimens for fungal culture and other relevant laboratory studies (including histopathology) to isolate and identify causative organism(s) should be obtained prior to initiating antifungal therapy. Therapy may be instituted before the results of the cultures and other laboratory studies are known. However, once these results become available, antifungal therapy should be adjusted accordingly.

@knr,

You can build logic on the following code to meet this requirement:

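using System;
using System.Collections;
using Aspose.Words;

Document doc = new Document("E:\\temp\\in.docx");

// Paragraphs where every comma is followed by a space, and those where it is not.
ArrayList listPassed = new ArrayList();
ArrayList listFailed = new ArrayList();
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    // Replacing a string with itself leaves the text unchanged;
    // the return value is the number of replacements made.
    int commaCount = para.Range.Replace(",", ",");
    int commaSpaceCount = para.Range.Replace(", ", ", ");

    if (commaCount == commaSpaceCount)
        // Console.WriteLine("Passed: We have a Space after every COMMA character");
        listPassed.Add(para);
    else
        listFailed.Add(para);

    ////to verify DOTs
    //// Note: a dot that ends the paragraph has no space after it,
    //// so this comparison would fail for such paragraphs.
    //int dotCount = para.Range.Replace(".", ".");
    //int dotSpaceCount = para.Range.Replace(". ", ". ");

    //if (dotCount == dotSpaceCount)
    //    Console.WriteLine("Passed: We have a Space after every DOT character");
}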
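
If you also need to know where the space is missing, a regular expression over the paragraph text is one alternative to comparing counts. This is only a sketch, not part of the Aspose.Words API: the class and method names (PunctuationSpacing, FindMissingSpaces) are hypothetical, and it assumes you pass in plain text such as para.ToString(SaveFormat.Text). A dot or comma at the end of the text is not flagged, but decimals such as 19.8 are, so adjust the pattern to suit your documents:

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PunctuationSpacing
{
    // Matches a dot or comma that is immediately followed by a non-whitespace character.
    // A dot or comma at the very end of the text, or before a line break, is not flagged.
    private static readonly Regex MissingSpace = new Regex(@"[.,](?=\S)");

    // Returns the zero-based position of every dot or comma not followed by whitespace.
    public static List<int> FindMissingSpaces(string text)
    {
        var positions = new List<int>();
        foreach (Match m in MissingSpace.Matches(text))
        {
            positions.Add(m.Index);
        }
        return positions;
    }
}
```

A paragraph passes when the returned list is empty; otherwise the positions point at the offending characters in that paragraph's text.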