Step 1: Input Document
Inputfile.zip (19.3 KB)
Tag details:
Start tag: $$$abcstart{
End tag: }abcend$$$
Step 2: Find the start and end tags and delete all content from the start tag through the end tag
Expected output: Outputfile.zip (18.4 KB)
In your case, we suggest that you bookmark the content that you want to delete. You can use the following steps to achieve your requirement.
Hope this helps you.
Source.zip (82.5 KB)
Please check the attachment for the sample code, input, current output, and expected output.
I tried adding a bookmark, but it is not working. Can you please help with sample code?
The following code example shows how to bookmark the desired content and remove it. Hope this helps you.
// Requires: using System; using System.Collections; using Aspose.Words; using Aspose.Words.Replacing;
string StartTag = @"$abcstart{";
string EndTag = @"}abcend$";
Document doc = new Document(MyDir + "RegaxInputFile.docx"); // Size 22k

FindReplaceOptions options = new FindReplaceOptions();
// First pass: insert a BookmarkStart at the start tag.
options.ReplacingCallback = new FindAndInsertBookmark("bookmark", true);
options.Direction = FindReplaceDirection.Backward;
options.MatchCase = false;
doc.Range.Replace(StartTag, "", options);

// Second pass: insert the matching BookmarkEnd at the end tag.
options.ReplacingCallback = new FindAndInsertBookmark("bookmark", false);
doc.Range.Replace(EndTag, "", options);
doc.UpdatePageLayout();

// Clearing the bookmark text deletes the content enclosed by the bookmark.
Bookmark bookmark = doc.Range.Bookmarks["bookmark"];
bookmark.Text = "";
doc.Save(MyDir + "20.6.docx");
public class FindAndInsertBookmark : IReplacingCallback
{
    string bmname;
    Boolean isStart;
    DocumentBuilder builder;

    public FindAndInsertBookmark(string bmname, Boolean isStart)
    {
        this.bmname = bmname;
        this.isStart = isStart;
    }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        if (builder == null)
            builder = new DocumentBuilder((Document)currentNode.Document);

        // The first (and maybe the only) run can contain text before the match;
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;
            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        if (isStart)
        {
            // Place the bookmark start before the first run of the match.
            Run run = (Run)runs[0];
            run.ParentNode.InsertBefore(new BookmarkStart(run.Document, bmname), run);
        }
        else
        {
            // Place the bookmark end after the first run of the match.
            Run run = (Run)runs[0];
            run.ParentNode.InsertAfter(new BookmarkEnd(run.Document, bmname), run);
        }

        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.Skip;
    }

    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
Thanks a lot. It works for one start and end tag, but it does not support duplicates. Is that possible?
In the input document, a block like $abcstart{ some text table content }abcend$ appears several times with the same tags:
$abcstart{ some text table content }abcend$
$abcstart{ some text table content }abcend$
$abcstart{ some text table content }abcend$
I am trying to delete one set of tags where the same start and end tags are repeated. Please suggest a code sample.
The current code's output is also not the expected output.
It should delete the whole of $abcstart{ some text table content }abcend$,
but it deletes only up to the final $, which is still left in the document.
Updated code: Source.zip (82.6 KB)
We have modified the code according to your new requirement. We have attached the output document for your kind reference.
20.6.zip (24.9 KB)
string StartTag = @"$abcstart{";
string EndTag = @"}abcend$";
Document doc = new Document(MyDir + "RegaxInputFile.docx");

FindReplaceOptions options = new FindReplaceOptions();
// First pass: insert a numbered BookmarkStart at every start tag.
options.ReplacingCallback = new FindAndInsertBookmark("bookmark", true);
options.Direction = FindReplaceDirection.Backward;
options.MatchCase = false;
doc.Range.Replace(StartTag, "", options);

// Second pass: insert the matching numbered BookmarkEnd at every end tag.
options.ReplacingCallback = new FindAndInsertBookmark("bookmark", false);
doc.Range.Replace(EndTag, "", options);
doc.UpdatePageLayout();

// Clear the text of every bookmark in the document.
foreach (Bookmark bookmark in doc.Range.Bookmarks)
    bookmark.Text = "";
doc.Save(MyDir + "20.6.docx");
public class FindAndInsertBookmark : IReplacingCallback
{
    string bmname;
    int i = 1; // Suffix that makes each bookmark name unique: bookmark1, bookmark2, ...
    Boolean isStart;
    DocumentBuilder builder;

    public FindAndInsertBookmark(string bmname, Boolean isStart)
    {
        this.bmname = bmname;
        this.isStart = isStart;
    }

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;
        if (builder == null)
            builder = new DocumentBuilder((Document)currentNode.Document);

        // The first (and maybe the only) run can contain text before the match;
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;
            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        if (isStart)
        {
            // Place the numbered bookmark start before the first run of the match.
            Run run = (Run)runs[0];
            run.ParentNode.InsertBefore(new BookmarkStart(run.Document, bmname + i), run);
            i++;
        }
        else
        {
            // Place the numbered bookmark end after the first run of the match.
            Run run = (Run)runs[0];
            run.ParentNode.InsertAfter(new BookmarkEnd(run.Document, bmname + i), run);
            i++;
        }

        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.Skip;
    }

    /// <summary>
    /// Splits text of the specified run into two runs.
    /// Inserts the new run just after the specified run.
    /// </summary>
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}
The code is not working in all cases.
Sorry for the confusion. I have mentioned the same tags that I am using. Please check the attachment.
Code sample: Source.zip (70.6 KB)
I am trying the following scenarios:
Case 1. Delete tag string[0], which has 4 duplicates in the input file. Currently only 2 of the 4 duplicates are deleted. To replicate, use only string[0].
Case 2. Delete the data for two different tags: string[0] (with duplicates) and string[1] (without duplicates). This is not working; some text at the end is still not deleted. To replicate, use string[0] and string[1].
It seems that you are using an old version of Aspose.Words. We have tested the scenario using the latest version of Aspose.Words for .NET 20.6 and could not reproduce the issue. Please check the attached output document. 20.6 (2).zip (22.4 KB)
Please update the if/else code snippet in IReplacingCallback.Replacing as follows:
if (isStart)
{
    Run run = (Run)runs[0];
    run.ParentNode.InsertBefore(new BookmarkStart(run.Document, bmname + i), run);
    i++;
}
else
{
    // Use the last run of the match so that the whole end tag falls inside the bookmark.
    Run run = (Run)runs[runs.Count - 1];
    run.ParentNode.InsertAfter(new BookmarkEnd(run.Document, bmname + i), run);
    i++;
}
In your use cases, you need to bookmark the content and delete it using the Bookmark.Text property. So, please make sure that you are inserting the BookmarkStart and BookmarkEnd nodes correctly. Please use the same approach shared in the above code examples to delete the content.
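The reason for placing the BookmarkEnd after the last matched run can be seen in a small stand-alone sketch. It does not use Aspose.Words: it models a paragraph as a plain list of strings, and the run layout (the end tag split into "}abcend" and "$") is an assumption chosen to mirror the leftover "$" reported earlier in the thread. EndTagPlacementDemo and DeleteRange are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EndTagPlacementDemo
{
    // Removes the runs from startIndex through endIndex (inclusive) and joins
    // what is left. This stands in for clearing the text of a bookmark that
    // starts before runs[startIndex] and ends after runs[endIndex].
    public static string DeleteRange(List<string> runs, int startIndex, int endIndex)
    {
        var kept = new List<string>();
        for (int i = 0; i < runs.Count; i++)
        {
            if (i < startIndex || i > endIndex)
                kept.Add(runs[i]);
        }
        return string.Concat(kept);
    }

    public static void Main()
    {
        // Assumed layout: the end tag "}abcend$" is stored in two runs (indexes 3 and 4).
        var runs = new List<string>
        {
            "before ", "$abcstart{", " some text ", "}abcend", "$", "after"
        };

        // Bookmark end after the FIRST run of the end tag: the "$" survives.
        Console.WriteLine(DeleteRange(runs, 1, 3)); // prints "before $after"

        // Bookmark end after the LAST run of the end tag: the whole tag is removed.
        Console.WriteLine(DeleteRange(runs, 1, 4)); // prints "before after"
    }
}
```

When the end tag happens to sit in a single run, both placements give the same result, which may be why the problem only shows up in some documents.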
I updated to the above code and to version 20.6.
Case 1: multiple tags are deleted perfectly.
Case 2: not working. Please check the attachment: Source.zip (72.3 KB)
The attachment that you shared also has tag string[1].
I am trying to delete string[0], [1], and [3] in Case 2.
It is nice to hear from you that the code works for this case.
You are iterating over the tags and saving the document inside the for loop. Please save the document outside the loop.
For the second and third tags, the bookmarks are not added to the document because each initialization of FindAndInsertBookmark resets the variable i to 1, so the same bookmark names are generated again and the earlier bookmarks are replaced.
To achieve your requirement, we suggest the following solution.
Hope this helps you.
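To illustrate the naming collision described above, here is a minimal stand-alone sketch. It is not the solution posted in the thread and does not use Aspose.Words. BookmarkNamer is a hypothetical stand-in for the counter inside FindAndInsertBookmark, and the per-tag prefix ("bookmark0_", "bookmark1_", ...) is an assumed naming scheme, one of several ways to keep names unique.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the counter in FindAndInsertBookmark:
// produces prefix + 1, prefix + 2, ... and starts again at 1 for every new instance.
public class BookmarkNamer
{
    private readonly string prefix;
    private int i = 1;

    public BookmarkNamer(string prefix)
    {
        this.prefix = prefix;
    }

    public string Next()
    {
        return prefix + i++;
    }
}

public static class BookmarkNamingDemo
{
    // Generates the bookmark names for tagCount tags with matchesPerTag matches each.
    // A new namer is created per tag, mirroring a new callback instance per tag.
    // With perTagPrefix == false every tag reuses the prefix "bookmark".
    public static List<string> Names(int tagCount, int matchesPerTag, bool perTagPrefix)
    {
        var names = new List<string>();
        for (int t = 0; t < tagCount; t++)
        {
            string prefix = perTagPrefix ? "bookmark" + t + "_" : "bookmark";
            var namer = new BookmarkNamer(prefix);
            for (int m = 0; m < matchesPerTag; m++)
                names.Add(namer.Next());
        }
        return names;
    }

    public static void Main()
    {
        // Same prefix for every tag: the second tag repeats the first tag's names.
        Console.WriteLine(string.Join(", ", Names(2, 2, false)));
        // prints "bookmark1, bookmark2, bookmark1, bookmark2"

        // Assumed fix: a distinct prefix per tag keeps every name unique.
        Console.WriteLine(string.Join(", ", Names(2, 2, true)));
        // prints "bookmark0_1, bookmark0_2, bookmark1_1, bookmark1_2"
    }
}
```

In the thread's code this would correspond to passing a different bmname to the FindAndInsertBookmark constructor for each tag, using the same bmname for the start and end callbacks of that tag, and calling doc.Save once after the loop.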