Read list labels from a Word document using Aspose.Words for .NET

Hi Team,
How can I read bulleted list items and carriage return characters from a Word document?
If I use the built-in MS Word bullet style, I cannot find the bullet in the run text.
How can I read the bullets programmatically in ASP.NET?
Example text:
The Page Header contains the following:

• Operator: Operator
• Timebase: _DIONEX_HPLC_PHX
• Sequence: CARAWAY LC 20190523
• Page 1-1
• Today’s date in mm/dd/yyyy format
• Time printed in HH:MM <AM/PM>

Please find attached an image of the sample Word document text for your reference.

Sample Word image.png (122.6 KB)

@thiru1711

Please use the Paragraph.IsListItem property to check whether a paragraph is an item in a bulleted or numbered list.

To get the string representation of the list label, use the ListLabel.LabelString property. Please check the following code example. Hope this helps.

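using System;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Lists;

// MyDir is a placeholder for the folder that contains the input document.
Document doc = new Document(MyDir + "input.docx");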
doc.UpdateListLabels();
int listParaCount = 1;

foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true).OfType<Paragraph>())
{
    // Check whether the paragraph is an item in a bulleted or numbered list.
    if (paragraph.ListFormat.IsListItem)
    {
        Console.WriteLine("Paragraph #{0}", listParaCount);

        // This is the text we get when this node is exported to plain text format.
        // The list labels are not included in this text output. Trim any paragraph formatting characters.
        String paragraphText = paragraph.ToString(SaveFormat.Text).Trim();
        Console.WriteLine("Exported Text: " + paragraphText);

        ListLabel label = paragraph.ListLabel;
        // LabelValue is the position of the paragraph within the current level of the list.
        // In a multilevel list, it is the position on that particular level.
        Console.WriteLine("Numerical Id: " + label.LabelValue);

        // Combine them together to include the list label with the text in the output.
        Console.WriteLine("List label combined with text: " + label.LabelString + " " + paragraphText);

        listParaCount++;
    }
}
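
For bulleted lists, the label string may come back as a character from a symbol font rather than a printable bullet. The helper below is a minimal sketch, not part of Aspose.Words: it assumes that such labels arrive as Unicode private-use characters (for example U+F0B7 for the Symbol font bullet) and replaces them with "•". The class name ListLabelText and the replacement policy are hypothetical; numbered labels such as "1." pass through unchanged.

```csharp
using System;

// Hypothetical helper, not an Aspose.Words API.
public static class ListLabelText
{
    // Assumption: a label made up only of private-use characters (U+E000..U+F8FF)
    // is a symbol-font bullet, so it is replaced with a plain "•" (U+2022).
    public static string Normalize(string label)
    {
        if (string.IsNullOrEmpty(label))
            return string.Empty;

        foreach (char c in label)
        {
            // Any character outside the private-use area means this is a
            // readable label (for example "1." or "a)"), so keep it as is.
            if (c < '\uE000' || c > '\uF8FF')
                return label;
        }

        return "\u2022";
    }

    // Joins the normalized label and the trimmed paragraph text with one space.
    public static string Combine(string label, string paragraphText)
    {
        string normalized = Normalize(label);
        string text = (paragraphText ?? string.Empty).Trim();
        return normalized.Length == 0 ? text : normalized + " " + text;
    }
}
```

Inside the loop above it could be called as ListLabelText.Combine(label.LabelString, paragraphText).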

Hi @tahir.manzoor,
How can I read bulleted list text from a table in a Word document using Aspose.Words for .NET?

How can I read the text cell by cell, together with its style formatting? I need to insert it into a database table, such as a SQL Server table.

Please find the sample document for your reference: Sample Document 1.zip (35.5 KB)

@thiru1711

Please use the following code example to get the desired output.

Document document = new Document(MyDir + "Sample Template.docx");
document.UpdateListLabels();
foreach (Table table in document.GetChildNodes(NodeType.Table, true))
{
    foreach (Cell cell in table.GetChildNodes(NodeType.Cell, true))
    {
        foreach (Paragraph paragraph in cell.Paragraphs)
        {
            if (paragraph.IsListItem)
            {
                Console.WriteLine(paragraph.ListLabel.LabelValue);
                Console.WriteLine(paragraph.ToString(SaveFormat.Text));
                Console.WriteLine("----------------------------------");
            }
        }
    }
}
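
Since the question also asks for the text cell by cell, here is a rough sketch of how each cell could be collected with its table, row, and column index before it is written to a database. CellRecord and TableReader are hypothetical names introduced for illustration; the sketch only uses the Rows, Cells, Paragraphs, IsListItem, ListLabel.LabelString, and ToString(SaveFormat.Text) members. It has not been run against the attached sample document.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using Aspose.Words;
using Aspose.Words.Tables;

// Hypothetical container for one table cell; not part of Aspose.Words.
public sealed class CellRecord
{
    public int TableIndex { get; set; }
    public int RowIndex { get; set; }
    public int ColumnIndex { get; set; }
    public string Text { get; set; }
}

public static class TableReader
{
    // Returns one record per cell, with list labels prepended to list paragraphs.
    public static List<CellRecord> ReadCells(Document document)
    {
        document.UpdateListLabels();
        var records = new List<CellRecord>();
        int tableIndex = 0;

        // Note: passing true also visits nested tables, which get their own index.
        foreach (Table table in document.GetChildNodes(NodeType.Table, true))
        {
            for (int r = 0; r < table.Rows.Count; r++)
            {
                Row row = table.Rows[r];
                for (int c = 0; c < row.Cells.Count; c++)
                {
                    var text = new StringBuilder();
                    foreach (Paragraph paragraph in row.Cells[c].Paragraphs)
                    {
                        // Plain text export omits the label, so add it back by hand.
                        if (paragraph.IsListItem)
                            text.Append(paragraph.ListLabel.LabelString).Append(' ');
                        text.AppendLine(paragraph.ToString(SaveFormat.Text).Trim());
                    }

                    records.Add(new CellRecord
                    {
                        TableIndex = tableIndex,
                        RowIndex = r,
                        ColumnIndex = c,
                        Text = text.ToString().TrimEnd()
                    });
                }
            }
            tableIndex++;
        }

        return records;
    }
}
```

Each record could then be passed to a parameterized INSERT command for the SQL Server table. Note that this keeps only the text and list labels, not the style formatting.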

Hi @tahir.manzoor,
Thanks for the reply. Reading bulleted and numbered text from a table in a Word document works fine with Aspose.Words for .NET.

However, text that contains symbols and special characters does not work. How can I read symbols and special characters from a table in a Word document using Aspose.Words for .NET?

Please find the sample document for your reference: Sample Doc1.zip (24.7 KB)

Sample screenshots: Sample image.png (109.8 KB)

Below is the sample code we are using:

Aspose.Words.Document document = new Aspose.Words.Document(@"D:\FileTest\Special Character123.docx");
document.UpdateListLabels();

foreach (Aspose.Words.Tables.Table table in document.GetChildNodes(Aspose.Words.NodeType.Table, true))
{
    foreach (Aspose.Words.Tables.Row row in table.GetChildNodes(Aspose.Words.NodeType.Row, true))
    {
        foreach (Aspose.Words.Tables.Cell cell in row.GetChildNodes(Aspose.Words.NodeType.Cell, true))
        {
            var text = cell.ToString(Aspose.Words.SaveFormat.Text);
        }
    }
}

@thiru1711

Please note that Aspose.Words mimics the behavior of MS Word. If you export bulleted and numbered lists to TXT format using MS Word, you will get the same output.

Could you please ZIP and attach your expected TXT output here for our reference? We will then provide you with more information about your query.

@tahir.manzoor
How can I read symbols and special characters from a table in a Word document using Aspose.Words for .NET?

Please find the sample document and the expected output document for your reference: Sample Doc and Expected output doc.zip (467.0 KB)

@thiru1711

If you are using an old version of Aspose.Words, we suggest that you use the latest version, Aspose.Words for .NET 20.2.

Moreover, what you are seeing is the expected behavior of Aspose.Words. You shared the expected output as a Word document, but the plain text format does not support the rich content that MS Word supports. To check this behavior, copy the content of each table cell and paste it into Notepad.
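
To see what is lost in the plain text export, one option is to look at the runs inside each cell together with their font. This is only a diagnostic sketch: RunInspector is a hypothetical name, and it assumes the symbols in the sample document are stored as ordinary runs formatted with a symbol font, which has not been verified against the attached file.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Tables;

// Hypothetical diagnostic helper; not part of Aspose.Words.
public static class RunInspector
{
    // Prints every run in every table cell with its font name and code points.
    public static void DumpCellRuns(Document document)
    {
        foreach (Cell cell in document.GetChildNodes(NodeType.Cell, true))
        {
            foreach (Run run in cell.GetChildNodes(NodeType.Run, true))
            {
                Console.WriteLine("Font: {0}, Bold: {1}, Text: {2}",
                    run.Font.Name, run.Font.Bold, run.Text);

                // Code points in the private-use range (U+E000..U+F8FF) usually
                // only render correctly in the font named above.
                foreach (char ch in run.Text)
                    Console.WriteLine("  U+{0:X4}", (int)ch);
            }
        }
    }
}
```

If a character only makes sense together with its font, the font name would have to be stored alongside the text, because a plain string cannot carry it.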

Hope this answers your query. Please let us know if you have any more queries.