Removing content between two tags throws exception using C#

Hello, I am using a temp license for our client.

I'm trying to delete all content between two tags, from the start tag to the end tag, in C# .NET.

I'm following these steps (as seen in "Remove text between two words"):

  1. Implement IReplacingCallback interface and use Range.Replace method to find the text “<text1_start>”.
  2. In IReplacingCallback.Replacing, insert the BookmarkStart node of a bookmark, e.g. bookmark1, at the position of the matched node.
  3. Similarly, find the text “<text1_end>” and insert the BookmarkEnd node for the same bookmark.
  4. Use Bookmark.Text to set the bookmark text to an empty string, which removes the contents of the bookmark.
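
The four steps above can be sketched roughly as follows. This is a minimal sketch, not the code from the linked answer: the class name `BookmarkMarker`, the file names, and the assumption that the document contains a single start/end tag pair are all hypothetical. It places the bookmark node at the beginning of each match, so the end tag text itself stays in the document and would still need to be removed separately.

```csharp
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

// Hypothetical callback: inserts a BookmarkStart or BookmarkEnd node
// at the position where each match begins.
class BookmarkMarker : IReplacingCallback
{
    private readonly string _name;
    private readonly bool _isStart;

    public BookmarkMarker(string name, bool isStart)
    {
        _name = name;
        _isStart = isStart;
    }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // MatchNode is the node that contains the beginning of the match.
        Run run = e.MatchNode as Run;
        if (run == null || run.ParentNode == null)
            return ReplaceAction.Skip;

        // If the match starts in the middle of the run, split the run
        // so that the match begins at the start of its own run.
        int offset = e.MatchOffset;
        if (offset > 0 && offset < run.Text.Length)
        {
            Run tail = (Run)run.Clone(true);
            tail.Text = run.Text.Substring(offset);
            run.Text = run.Text.Substring(0, offset);
            run.ParentNode.InsertAfter(tail, run);
            run = tail;
        }

        Node marker = _isStart
            ? (Node)new BookmarkStart(run.Document, _name)
            : new BookmarkEnd(run.Document, _name);
        run.ParentNode.InsertBefore(marker, run);

        // Leave the matched text untouched; only the bookmark node is added.
        return ReplaceAction.Skip;
    }
}

static class Demo
{
    static void Main()
    {
        Document doc = new Document("input.docx"); // hypothetical file name
        FindReplaceOptions options = new FindReplaceOptions();
        options.Direction = FindReplaceDirection.Backward;

        // Steps 1 and 2: mark the start tag.
        options.ReplacingCallback = new BookmarkMarker("bookmark1", true);
        doc.Range.Replace(new Regex(Regex.Escape("<text1_start>")), "", options);

        // Step 3: mark the end tag.
        options.ReplacingCallback = new BookmarkMarker("bookmark1", false);
        doc.Range.Replace(new Regex(Regex.Escape("<text1_end>")), "", options);

        // Step 4: empty the bookmark to delete everything inside it.
        Bookmark bookmark = doc.Range.Bookmarks["bookmark1"];
        if (bookmark != null)
            bookmark.Text = string.Empty;

        doc.Save("output.docx"); // hypothetical file name
    }
}
```

With several tag pairs in one document, each pair would probably need its own bookmark name or a pass like the one in the accepted solution further down the thread.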

I can already find a text and insert a BookmarkStart using what I found at https://docs.aspose.com/words/net/find-and-replace/

The problem is that when I run a second search (for the end tag, in order to insert the BookmarkEnd), it throws an exception: “startIndex cannot be larger than length of string”

It is thrown in the SplitRun function, at “run.Text.Substring(0, position)”.
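
For context, `String.Substring` raises this message when the start index is greater than the string length, so one plausible cause (not confirmed in this thread) is that SplitRun receives a position beyond the end of the run's text. The sketch below is a guarded variant of such a helper. The name `SplitRunSafe` and the choice to return null when no split is needed are assumptions, not part of the Aspose.Words API.

```csharp
using System;
using Aspose.Words;

static class RunSplitter
{
    // Hypothetical guarded helper: splits 'run' at 'position' and returns the
    // new run holding the text after the split point, or null when the
    // position does not fall strictly inside the run's text.
    public static Run SplitRunSafe(Run run, int position)
    {
        if (run == null)
            throw new ArgumentNullException(nameof(run));

        string text = run.Text;

        // Without this check, Substring(position) throws
        // ArgumentOutOfRangeException when position > text.Length.
        if (run.ParentNode == null || position <= 0 || position >= text.Length)
            return null;

        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = text.Substring(position);
        run.Text = text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
```

A caller would treat a null result as "nothing to split" and continue with the original run.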

I searched a lot for this problem and only found https://forum.aspose.com/t/range-replace-question/122304, which is obsolete.

Can you please help me?

Thank you

@Rodes

To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input Word document.
  • A standalone console application (source code without compilation errors) that helps us reproduce your problem on our end.

As soon as you have these resources ready, we will start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Hello @tahir.manzoor, and many thanks for your quick answer.

I have uploaded a zip with a project, a test project, and a test document.

This document has #~A~# as the start tag and #~F_A~# as the end tag, where the tag name is ‘A’.

When I search for words ‘inside’ the text, it finds them correctly. The problem appears when I search for a word (any word) at the end of a paragraph.

Is it possible that it doesn't handle runs well, so it throws an exception when it doesn't find any more text?

Thanks

@Rodes

We have not found any ZIP file in this forum thread. Please ZIP and attach your document and code example for testing. Thanks for your cooperation.

Sorry, the file is too big and I can't upload it here. I uploaded it to my Google Drive; here is the link: https://drive.google.com/open?id=179yh3MRXB7MsrfDWTrYfWd4bETH3D2Ca

I found the problem: when I search for a word that is at the end of a paragraph, that is, text at the end of a node, it doesn't handle the runs correctly.

Hello again, I finally solved the problem. Here is the zip in case it helps you: https://drive.google.com/open?id=179yh3MRXB7MsrfDWTrYfWd4bETH3D2Ca

Summary:
I insert a bookmark start at every match of the start tag and a bookmark end at every match of the end tag. This bookmark is called MARCADOR. Afterwards I loop over every bookmark in the document with a foreach and delete those with this name.

    void borraEtiqueta(string et)
    {
        DocumentBuilder builder = new DocumentBuilder(doc);

        FindReplaceOptions options = new FindReplaceOptions();
        options.ReplacingCallback = new Reemplazador(builder);
        options.Direction = FindReplaceDirection.Backward;

        // Insert a bookmark start at every opening tag and a bookmark end at every closing tag.
        Regex rg = new Regex("#~" + et + "~#", RegexOptions.IgnoreCase);
        doc.Range.Replace(rg, "inicio", options);
        Regex rg2 = new Regex("#~/" + et + "~#", RegexOptions.IgnoreCase);
        doc.Range.Replace(rg2, "fin", options);

        // Delete every bookmark that was created, together with its content.
        foreach (Bookmark bookmark in doc.Range.Bookmarks)
        {
            if (bookmark.Name == "MARCADOR")
            {
                bookmark.Text = string.Empty;
                bookmark.Remove();
            }
        }

    }

In Reemplazador (the IReplacingCallback implementation), at the end of the Replacing method:

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        // Collapse the whole match into the first run and remove the other runs.
        ((Run)runs[0]).Text = e.Match.Value;
        for (int i = 1; i < runs.Count; i++)
        {
            ((Run)runs[i]).Remove();
        }

        // Runs removed above have no parent, so the null checks below skip them.
        foreach (Run run in runs)
        {
            if (e.Replacement == "inicio")
            {
                // Opening tag: start the bookmark at an empty run placed before the match.
                Run emptyRun = (Run)run.Clone(true);
                emptyRun.Text = "";
                if (run.ParentNode != null)
                {
                    run.ParentNode.InsertBefore(emptyRun, run);
                    builder.MoveTo(emptyRun);
                    builder.StartBookmark("MARCADOR");
                }
            }
            if (e.Replacement == "fin")
            {
                // Closing tag: end the bookmark at an empty run placed after the match.
                Run emptyRun = (Run)run.Clone(true);
                emptyRun.Text = "";
                if (run.ParentNode != null)
                {
                    run.ParentNode.InsertAfter(emptyRun, run);
                    builder.MoveTo(emptyRun);
                    builder.EndBookmark("MARCADOR");
                }
            }
        }

        // Keep the matched text; only the bookmark nodes are added here.
        return ReplaceAction.Skip;
    }

Thank you, and greetings from Spain!

@Rodes

It is nice to hear that you have found a solution to your query. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help you.

Tahir, I downgraded my Aspose.Words NuGet package to v17.3 (the version my client uses) to prevent compatibility issues, and my solution for deleting content between tags stopped working: it didn't delete the last tag. I guess it is all about how runs are handled.

I upgraded again to the latest version (20.x) and it works perfectly again.

Here is my question: I suppose my client's license is valid for this version too (I asked about it by support mail), but I'm afraid that some other functionality will stop working. Will everything be compatible with this new version?

Thanks in advance

@Rodes

You can check the expiry date of your license by opening the license file in Notepad. You will see the following tag in your license file:

<SubscriptionExpiry>20200925</SubscriptionExpiry>

It means that you can upgrade for free to any version of Aspose.Words published before 09/25/2020.
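
The same check can also be done in code. The sketch below reads the `SubscriptionExpiry` element from the license XML with System.Xml.Linq and parses it as a `yyyyMMdd` date. The class and method names are hypothetical, and it assumes the tag appears exactly as shown above; it does not use any Aspose.Words API.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Xml.Linq;

static class LicenseExpiry
{
    // Hypothetical helper: returns the subscription expiry date found in the
    // license XML text, or null when the SubscriptionExpiry tag is missing.
    public static DateTime? ReadSubscriptionExpiry(string licenseXml)
    {
        XDocument xml = XDocument.Parse(licenseXml);
        XElement element = xml.Descendants("SubscriptionExpiry").FirstOrDefault();
        if (element == null)
            return null;

        // The value is assumed to use the yyyyMMdd format, e.g. 20200925.
        return DateTime.ParseExact(
            element.Value.Trim(), "yyyyMMdd", CultureInfo.InvariantCulture);
    }
}
```

For example, passing the contents of the license file (read with `File.ReadAllText` from wherever the file is stored) gives a date that can be compared with the release date of the Aspose.Words version you plan to use.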

You are using an older version of Aspose.Words. If you are using any deprecated APIs, please update your code according to the latest APIs. We suggest you check the API changes here:
Release notes of Aspose.Words for .NET