MS Word - Is it possible to insert a bookmark programmatically (C#) in a selected row or position?

I would like to know whether Aspose Docs or GroupDocs.Metadata can be used to insert a bookmark programmatically (C#) at a selected row or position in a Word document.

I checked this: Working with Bookmarks in C# | Aspose.Words for .NET

Our requirement is to remove the rows of a table that have a 0 value in all columns. For example, in this table we would want to programmatically remove the last row, which has a 0 value in both columns.

Income 2021 2020 <- table header
Total real estate income 1000 2000 <- row with values
Revenue from export 0 0 <- row with zero values that we want to remove

@qlnsubhash Sure, you can use the DocumentBuilder.MoveToXXX methods to move the document builder cursor, and then the DocumentBuilder.StartBookmark and DocumentBuilder.EndBookmark methods to insert a bookmark.
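For the bookmark part, a minimal sketch might look like the following. It assumes the target is a table cell in the first section of the document; the class name `BookmarkSketch`, the helper `BookmarkCellStart`, and the bookmark name used below are made up for illustration and are not part of Aspose.Words.

```csharp
using Aspose.Words;

public static class BookmarkSketch
{
    // Hypothetical helper (not part of Aspose.Words): inserts an empty,
    // position-marking bookmark at the start of the given table cell.
    public static void BookmarkCellStart(Document doc, int tableIndex,
        int rowIndex, int columnIndex, string bookmarkName)
    {
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Move the cursor to the first character of the target cell.
        // tableIndex is relative to the builder's current section,
        // which is the first section for a newly created builder.
        builder.MoveToCell(tableIndex, rowIndex, columnIndex, 0);

        // Start and end the bookmark at the same position, so it marks
        // a location rather than wrapping any content.
        builder.StartBookmark(bookmarkName);
        builder.EndBookmark(bookmarkName);
    }
}
```

With this in place, `BookmarkSketch.BookmarkCellStart(doc, 0, 2, 0, "ZeroRow")` would mark the third row of the first table. Bookmark names should be unique within the document.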

Regarding removing rows with zero values, you can use code like this:

Document doc = new Document(@"C:\Temp\in.docx");

// Get table.
Table table = doc.FirstSection.Body.Tables[0];

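// Remove rows with zero values. Iterate over a snapshot (ToArray) so that
// removing a row does not affect the enumeration of the live collection.
foreach (Row r in table.Rows.ToArray())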
{
    bool hasZeroValuesOnly = true;
    foreach (Cell c in r.Cells)
        hasZeroValuesOnly &= c.ToString(SaveFormat.Text).Trim().Equals("0");

    if (hasZeroValuesOnly)
        r.Remove();
}

doc.Save(@"C:\Temp\out.docx");

Here are the input and output documents: in.docx (13.3 KB) out.docx (10.3 KB)

In this logic we loop through the table rows and cells. This would be a costly operation if there are hundreds of rows to loop through.
Instead, would the following be possible?

  1. If the 0-value rows (all columns have a 0 value) are marked with metadata (for example, a token like “<<zero_val>>”), would it be possible to call a function that deletes all rows carrying this metadata? That way it would be a single call, with no need to loop through all the rows and cells.

Please let me know. Thank you so much for your response; I look forward to hearing from you.

@qlnsubhash You can use LINQ syntax to achieve this. For example, see the following code, which does the same as the code suggested in the previous answer:

Document doc = new Document(@"C:\Temp\in.docx");

// Remove all rows with zero values using LINQ
doc.GetChildNodes(NodeType.Row, true).Cast<Row>() // Get all rows from the document.
    .Where(r => r.Cells.All(c => c.ToString(SaveFormat.Text).Trim().Equals("0"))) // Select rows with zero values.
    .ToList().ForEach(r => r.Remove()); // Remove selected rows.

doc.Save("C:\\Temp\\out.docx");

If the zero-value rows in your input document have some metadata, you can use another condition to select them.
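As an illustration of such a condition, here is a rough sketch that selects rows by a marker token instead of by cell values. It assumes the token is literally present somewhere in the row text, as in the “<<zero_val>>” example from the question; the class and method names are hypothetical.

```csharp
using System.Linq;
using Aspose.Words;
using Aspose.Words.Tables;

public static class TokenRowRemover
{
    // Assumed marker text, taken from the example in the question.
    public const string Token = "<<zero_val>>";

    // Hypothetical helper: removes every row whose text contains the token
    // and returns the number of rows removed.
    public static int RemoveMarkedRows(Document doc, string token = Token)
    {
        // Materialize the matches first (ToList), then remove them, so the
        // live node collection is not modified while it is being enumerated.
        var marked = doc.GetChildNodes(NodeType.Row, true).Cast<Row>()
            .Where(r => r.GetText().Contains(token))
            .ToList();

        marked.ForEach(r => r.Remove());
        return marked.Count;
    }
}
```

Note that the rows are still enumerated internally; the token only replaces the per-cell check. GetText also includes the text of nested tables, so an outer row that contains a marked nested row would match as well.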

I made this modification (c.PreviousSibling != null) in the foreach loop to skip the first cell, which contains the description.

Aspose.Words.Document docu = new Aspose.Words.Document(@"in.docx");

foreach (Aspose.Words.Tables.Table table1 in docu.GetChildNodes(Aspose.Words.NodeType.Table, true))
{
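    // Remove rows with zero values. Iterate over a snapshot (ToArray) so that
    // removing a row does not affect the enumeration of the live collection.
    foreach (Row r in table1.Rows.ToArray())
    {
        bool hasZeroValuesOnly = true;
        foreach (Cell c in r.Cells)
            if (c.PreviousSibling != null) // Skip the first column, which contains the description.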
            {
                hasZeroValuesOnly &= c.ToString(SaveFormat.Text).Trim().Equals("0");
            }

        if (hasZeroValuesOnly)
            r.Remove();
    }
}

docu.Save(@"C:\Temp\out.docx");

I have attached in.docx and out.docx.
in.docx (16.4 KB)
out.docx (22.2 KB)

I have the following questions.

  1. In the LINQ query you provided, would it be possible to add the condition
    (c.PreviousSibling != null) to skip the first column of each row when checking for 0 values?
  2. Between foreach and LINQ, is there any performance benefit to either one when there are 1000 or more rows to check?
  3. Other than these two approaches (foreach and LINQ), is there a different approach, such as the one in the question below, that could make the operation faster?
    -> 3a. If the 0-value rows (all columns have a 0 value) are marked with metadata (for example, a token like “<<zero_val>>”), would it be possible to call a function, similar to find and replace, that deletes all rows carrying this metadata? That way it would be a single call, with no need to loop through all the rows and cells.

@qlnsubhash

  1. Sure, you can add such a condition to the LINQ query. You can use Except to achieve this. For example, see the following code:
Document doc = new Document(@"C:\Temp\in.docx");

// Remove all rows with zero values using LINQ
doc.GetChildNodes(NodeType.Row, true).Cast<Row>() // Get all rows from the document.
    .Where(r => r.Cells.Except(new Node[] { r.FirstChild }).All(c => c.ToString(SaveFormat.Text).Trim().Equals("0"))) // Select rows with zero values.
    .ToList().ForEach(r => r.Remove()); // Remove selected rows.

doc.Save("C:\\Temp\\out.docx");
  2. You should compare performance with your own test documents to select the approach that works better for your scenario.

  3. In any case, the rows with the metadata still have to be found, so I do not think a find/replace approach or similar would give you better performance.
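One way to run such a comparison is sketched below, under the assumption that a simple Stopwatch measurement on a representative document is good enough. All names in `RowRemovalTiming` are hypothetical helpers. Unlike the snippets above, `IsZeroRow` also requires the row to have more than one cell, which is an added assumption so that single-cell rows are not treated as zero rows.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Tables;

public static class RowRemovalTiming
{
    // True when the row has value cells and every cell after the first
    // (the description column) contains exactly "0".
    public static bool IsZeroRow(Row row) =>
        row.Cells.Count > 1 &&
        row.Cells.Cast<Cell>().Skip(1)
            .All(c => c.ToString(SaveFormat.Text).Trim() == "0");

    // foreach variant.
    public static void RemoveWithForeach(Document doc)
    {
        // ToArray takes a snapshot, so rows can be removed while looping.
        foreach (Node node in doc.GetChildNodes(NodeType.Row, true).ToArray())
        {
            Row row = (Row)node;
            if (IsZeroRow(row))
                row.Remove();
        }
    }

    // LINQ variant.
    public static void RemoveWithLinq(Document doc)
    {
        doc.GetChildNodes(NodeType.Row, true).Cast<Row>()
            .Where(IsZeroRow)
            .ToList()
            .ForEach(r => r.Remove());
    }

    // Runs the removal on a fresh copy of the document, so every approach
    // starts from the same input, and returns the elapsed milliseconds.
    public static long MeasureMs(Document source, Action<Document> removal)
    {
        Document copy = source.Clone();
        Stopwatch sw = Stopwatch.StartNew();
        removal(copy);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

For example, `RowRemovalTiming.MeasureMs(doc, RowRemovalTiming.RemoveWithLinq)` returns the time for the LINQ variant. Repeating each measurement several times and ignoring the first run should reduce the effect of JIT warm-up.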

To check and compare the performance of LINQ and foreach, I ran into a limit of 3 pages when trying the code, due to the trial limitation. Is there any way to overcome this limitation? Our reports may have 30 to 100 pages.
We are also currently using another library and want to compare its performance with that of Aspose.Words. Please let me know.

Thank you for your continued help.

@qlnsubhash You can request a temporary 30-day license to test Aspose.Words without the evaluation version limitations. Once you get the temporary license, you should apply it using code like this:

Aspose.Words.License lic = new Aspose.Words.License();
lic.SetLicense(@"C:\Temp\Aspose.Words.NET.lic");

See our documentation for more information.

Also, there is one more approach you can use to remove rows with zero values. You can use DocumentVisitor. For example see the following code:

Document doc = new Document(@"C:\Temp\in.docx");
doc.Accept(new ZeroRowsRemover());
doc.Save(@"C:\Temp\out.docx");

private class ZeroRowsRemover : DocumentVisitor
{
    public override VisitorAction VisitCellEnd(Cell cell)
    {
        if (!cell.IsFirstCell)
            mIsZeroRow &= cell.ToString(SaveFormat.Text).Trim().Equals("0");

        return VisitorAction.Continue;
    }

    public override VisitorAction VisitRowEnd(Row row)
    {
        if (mIsZeroRow)
            row.Remove();

        mIsZeroRow = true;

        return VisitorAction.Continue;
    }

    private bool mIsZeroRow = true;
}