Replicate Content between tags in Word

Corsearch_IT · September 22, 2017, 9:09pm

I have certain tags in a Word document.
These tags are basically a kind of bookmark marking the start and end of the content.
I want to replicate whatever content is present between these tags multiple times, depending on my collection.

I have created a sample document which shows two basic scenarios of the issue.
My start and end tags can be inside a single table, or around bullets with static text, or around any other content for that matter.
I want logic that copies the content between the tags and keeps repeating it.

I am not able to use the Mail Merge/LINQ Reporting Engine features, since I want to process some of the data internally with my own logic.
For example, in the table case, I want to merge ALL the cells into one depending on business criteria.

Can this be done?
Template_List.zip (13.0 KB)

tahir.manzoor · September 24, 2017, 2:37pm

@psluzhevsky,

Thanks for your inquiry. Yes, you can achieve your requirement using Aspose.Words.

For the case where the tags are inside a table, we suggest the following solution.

Implement the IReplacingCallback interface and use it to find [TableStart] and [TableEnd].
In IReplacingCallback.Replacing, clone the table row that contains these tags and add it to the existing table.
You can clone the row using the Row.Clone(true) method and add the row to the table using the Table.Rows.Add method. You can check whether these tags are inside a table using the Node.GetAncestor(NodeType.Table) method, which returns the first ancestor of the specified NodeType.
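
As a rough illustration of the table case, the sketch below uses the callback only to locate rows that contain the tag, and clones them after the search has finished so that the table is not changed while the search is still running. The class name TaggedRowRepeater, the method RepeatTaggedRows and the copies parameter are hypothetical names made up for this example; it also places each clone directly after its source row with InsertAfter rather than appending with Table.Rows.Add.

```csharp
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Replacing;
using Aspose.Words.Tables;

// Hypothetical helper for illustration; not part of Aspose.Words.
public class TaggedRowRepeater : IReplacingCallback
{
    private readonly List<Row> mRows = new List<Row>();

    // Called by Range.Replace for every match of the tag.
    public ReplaceAction Replacing(ReplacingArgs e)
    {
        // GetAncestor returns null when the match is not inside a table row.
        Row row = (Row)e.MatchNode.GetAncestor(NodeType.Row);
        if (row != null && !mRows.Contains(row))
            mRows.Add(row);

        return ReplaceAction.Skip; // leave the tag text unchanged
    }

    // Adds `copies` clones directly after every row that contains `tag`.
    // Returns the number of tagged rows that were found.
    public static int RepeatTaggedRows(Document doc, string tag, int copies)
    {
        TaggedRowRepeater finder = new TaggedRowRepeater();
        FindReplaceOptions options = new FindReplaceOptions();
        options.ReplacingCallback = finder;
        doc.Range.Replace(tag, string.Empty, options);

        foreach (Row row in finder.mRows)
            for (int i = 0; i < copies; i++)
                row.ParentTable.InsertAfter(row.Clone(true), row);

        return finder.mRows.Count;
    }
}
```

The cloned rows still contain the tags and any data placeholders, so a later pass would be needed to fill in values and strip the tags.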

Please refer to the following article.
Find and Replace

For content that is not in a table (for example, the bulleted list), we suggest the following solution.

Find the [TableStart] and [TableEnd] tags and add a bookmark at these tags.
Extract the contents between the bookmark start and end.
After extracting the contents, move the cursor to the desired location and insert them using DocumentBuilder.InsertDocument.

Please check the code of ExtractContent and GenerateDocument methods.
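
The first step (placing a bookmark at the tags) could be sketched roughly as follows. TagBookmarker and AddBookmark are hypothetical names for this example. The sketch makes a simplifying assumption: it puts the bookmark start before the run where the start tag begins and the bookmark end after the run where the end tag begins, so a tag that is split across several runs, or that shares a run with other text, would need more careful handling.

```csharp
using Aspose.Words;
using Aspose.Words.Replacing;

// Hypothetical helper for illustration; not part of Aspose.Words.
public class TagBookmarker : IReplacingCallback
{
    private Node mMatchNode;

    // Remembers the node where the first match begins and changes nothing.
    public ReplaceAction Replacing(ReplacingArgs e)
    {
        if (mMatchNode == null)
            mMatchNode = e.MatchNode;
        return ReplaceAction.Skip;
    }

    private static Node FindTagNode(Document doc, string tag)
    {
        TagBookmarker locator = new TagBookmarker();
        FindReplaceOptions options = new FindReplaceOptions();
        options.ReplacingCallback = locator;
        doc.Range.Replace(tag, string.Empty, options);
        return locator.mMatchNode; // null when the tag was not found
    }

    // Wraps the region from startTag to endTag in a bookmark called `name`.
    public static bool AddBookmark(Document doc, string startTag, string endTag, string name)
    {
        Node start = FindTagNode(doc, startTag);
        Node end = FindTagNode(doc, endTag);
        if (start == null || end == null)
            return false;

        start.ParentNode.InsertBefore(new BookmarkStart(doc, name), start);
        end.ParentNode.InsertAfter(new BookmarkEnd(doc, name), end);
        return true;
    }
}
```

The resulting BookmarkStart and BookmarkEnd nodes can then be handed to the ExtractContent helper mentioned above.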

Corsearch_IT · September 26, 2017, 3:16pm

For the bullet list, I tried your approach of creating a bookmark and replicating the content between the bookmark start and end with my collection data.
I am able to find the tags, create the bookmark, and extract the nodes for the section to be replicated.
But the remaining part is not working for me.

Here are the challenges:

I am not able to insert the extracted nodes back after each section with DocumentBuilder, since it throws an exception for certain types of nodes.
Also, these sections need to be replicated according to the collection. For example, if the collection has 5 records, there should be 5 sections in the output.
How can I replace my data tags with actual data in each replicated section before inserting its nodes back into the list?

For clarity, I have created a console application with my example and test collection data.
It includes the input template as well as the expected output document in the Templates folder.

Please take a look.
AsposeObjectConsole.zip (34.0 KB)

tahir.manzoor · September 26, 2017, 5:28pm

@psluzhevsky,

Thanks for your inquiry. In ReplaceTagsWithData, please replace the following code snippet

//2. For each Report, replicate Section
foreach (ReportData data in mReports)
{
	mBuilder.MoveTo(bmEnd);
	foreach (var node in extractedNodes)
	{
		mBuilder.InsertNode((Node)node);
	}
}

with

Document dstDoc = AsposeHelper.GenerateDocument(mDoc, extractedNodes);
// Write your code here to replace the tags 
mBuilder.MoveTo(bmEnd);
mBuilder.InsertDocument(dstDoc, ImportFormatMode.KeepSourceFormatting);

Corsearch_IT · September 26, 2017, 7:56pm

I tried your code, but there are issues in my scenario.

When I added this logic with my collection, I ran into the following issues:

Because of the bookmark end, my sections are being added in the opposite order.
There is a lot of extra space (empty lines) between sections.
The [[TableEnd]] tag appears at the end of all sections.
Basically, I want to remove the original section, including the [[TableStart]] and [[TableEnd]] tags, from the document once all the replicated sections are created. How can I do that?

I have attached the actual output file and the expected output file here so that you can see the difference.
The zip also contains the changed cs file.
Templates.zip (26.6 KB)

Here is my changed code:

ArrayList extractedNodes = AsposeHelper.ExtractContent(bmStart, bmEnd, true);

Document sectionDoc = AsposeHelper.GenerateDocument(mDoc, extractedNodes);

//Clean up Section by removing Start and End Tag
FindReplaceOptions options = new FindReplaceOptions(FindReplaceDirection.Forward);
sectionDoc.Range.Replace("[[TableStart]]", string.Empty, options);
sectionDoc.Range.Replace("[[TableEnd]]", string.Empty, options);

//2. For each Report, replicate Section
foreach (ReportData data in mReports)
{
	//Clone the Section so that it can be tag replaced before adding to Original Document
	Document clonedSectionDoc = sectionDoc.Clone();

	//3. For each tag in Replicated Section, replace them with actual data
	//For example, [[REG_NO]] will be replaced by data.RegNo and [[REG_DATE]] will be replaced by data.RegDate
	clonedSectionDoc.Range.Replace("[[REG_NO]]", data.RegNo, options);
	clonedSectionDoc.Range.Replace("[[REG_DATE]]", data.RegDate, options);

	mBuilder.MoveTo(bmEnd);
	mBuilder.InsertDocument(clonedSectionDoc, ImportFormatMode.KeepSourceFormatting);
}

tahir.manzoor · September 27, 2017, 6:48am

@psluzhevsky,

Thanks for your inquiry. In this case, we suggest removing the empty paragraphs at the start and end of the document after extracting the content. You may also extract only the contents between the [[TableStart]] and [[TableEnd]] tags to get the desired output.

Document sectionDoc = AsposeHelper.GenerateDocument(mDoc, extractedNodes);
            
//Clean up Section by removing Start and End Tag
FindReplaceOptions options = new FindReplaceOptions(FindReplaceDirection.Forward);
sectionDoc.Range.Replace("[[TableStart]]", string.Empty, options);
sectionDoc.Range.Replace("[[TableEnd]]", string.Empty, options);

if (sectionDoc.FirstSection.Body.FirstParagraph.ToString(SaveFormat.Text).Trim() == "")
    sectionDoc.FirstSection.Body.FirstParagraph.Remove();

if (sectionDoc.LastSection.Body.LastParagraph.ToString(SaveFormat.Text).Trim() == "")
    sectionDoc.LastSection.Body.LastParagraph.Remove();

           
//2. For each Report, replicate Section
foreach (ReportData data in mReports)

Corsearch_IT · September 27, 2017, 2:13pm

Thanks a lot.
This resolved the unnecessary spacing between sections.
You made an interesting point here.
How can I extract the content between [[TableStart]] and [[TableEnd]] while excluding the start/end tags themselves?
The last parameter (isInclusive) of the AsposeHelper.ExtractContent method should do the trick when set to false, but it is not working that way. It returns the same nodes, including the start/end tags, whether I pass true or false.
Am I missing something here?

Also, how can I insert sections AFTER the bookmark end?
Currently the sections are inserted before the [[TableEnd]] tag. I want them to be inserted AFTER the [[TableEnd]] tag.

tahir.manzoor · September 27, 2017, 5:12pm

@psluzhevsky,

Thanks for your inquiry.

Please use the following line of code to get the contents between [[TableStart]] and [[TableEnd]].

ArrayList extractedNodes = AsposeHelper.ExtractContent(bmStart.ParentNode.NextSibling, bmEnd.ParentNode.PreviousSibling, true);

Hope the above line of code answers your query. If you still face problems, please share some more detail about this query. We will then provide you with more information.
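
On the question of inserting after the bookmark end, one option that may be worth trying is DocumentBuilder.MoveToBookmark with isStart set to false and isAfter set to true, which places the cursor just after the bookmark end. The sketch below is only an illustration with plain text; BookmarkInserter and InsertAfterBookmark are hypothetical names. Because every insert lands at the same fixed point, the items are processed in reverse so that they end up in their original order.

```csharp
using System.Collections.Generic;
using Aspose.Words;

// Hypothetical helper for illustration; not part of Aspose.Words.
public static class BookmarkInserter
{
    // Writes each item immediately after the end of the named bookmark,
    // keeping the items in list order. Returns false if the bookmark is missing.
    public static bool InsertAfterBookmark(Document doc, string bookmarkName, IList<string> items)
    {
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Each insert goes to the same position, so walk the list backwards.
        for (int i = items.Count - 1; i >= 0; i--)
        {
            // isStart: false selects the bookmark end; isAfter: true puts the cursor after it.
            if (!builder.MoveToBookmark(bookmarkName, false, true))
                return false;

            builder.Write(items[i]);
        }
        return true;
    }
}
```

The same reverse-order idea should carry over to InsertDocument with one cloned section per record, although that combination is not exercised in this sketch.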

Corsearch_IT · September 27, 2017, 6:15pm

Sorry, but this solution does not work consistently.
It works well if my [[TableStart]] and [[TableEnd]] tags are on separate lines.
But if my tags are adjacent to the content itself, it does not give the desired output.
For example, if I change my section as below, it messes up the output. I have also attached the changed input document here.
Template_List_INPUT.zip (12.7 KB)

Second Example with Some static text with bullets and static text:
[[TableStart]]US State
• Latest Data Updation is done with [[REG_NO]]
Tested OK for [[REG_DATE]]. Also QA tested the scenario[[TableEnd]]
This is additional piece of information and out of Section so it should not be replicated.

tahir.manzoor · September 28, 2017, 6:22am

@psluzhevsky,

Thanks for your inquiry.

In this case, you can check whether the parent of bmStart (a Paragraph node) contains any content other than [[TableStart]]. If the Paragraph node contains other content, please pass bmStart itself to the AsposeHelper.ExtractContent method. You can get the text of the Paragraph node using the Paragraph.ToString(SaveFormat.Text) method. Please do the same for the bmEnd node.
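
A minimal sketch of that check might look like the following. TagParagraphCheck, ContainsOnlyTag and ChooseStartNode are hypothetical names, and the sketch assumes that bmStart is a BookmarkStart node whose parent is a Paragraph. List labels or other generated text in the exported paragraph text may need extra handling that is not shown here.

```csharp
using Aspose.Words;

// Hypothetical helper for illustration; not part of Aspose.Words.
public static class TagParagraphCheck
{
    // True when the paragraph's text is exactly the tag, ignoring surrounding whitespace.
    public static bool ContainsOnlyTag(Paragraph para, string tag)
    {
        string text = para.ToString(SaveFormat.Text).Trim();
        return text == tag;
    }

    // Picks the node to start extraction from:
    // the next sibling when the paragraph holds only the tag, otherwise bmStart itself.
    public static Node ChooseStartNode(BookmarkStart bmStart, string tag)
    {
        Paragraph para = (Paragraph)bmStart.ParentNode; // assumption: parent is a Paragraph
        return ContainsOnlyTag(para, tag) ? para.NextSibling : bmStart;
    }
}
```

A mirrored method for bmEnd would return the previous sibling when the end paragraph holds only the [[TableEnd]] tag.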

Corsearch_IT · September 28, 2017, 3:28pm

I tried this approach, but it is also not working as expected.
In this case, it keeps the [[TableStart]] node in the extracted nodes list, which is not correct.
However, I kept my code that removes (cleans) the [[TableStart]] and [[TableEnd]] tags from my sectionDoc object.
That way I can keep it working in this scenario.

Here is the remaining part where I need your help:

I want to remove the original section, including the [[TableStart]] and [[TableEnd]] tags.
I tried the Node.Remove method, but it throws an exception.
Also, my section always gets added before [[TableEnd]].
I could not figure out how to add it AFTER that tag.

Can you provide code for this section deletion?
I have attached the changed code file and the input/output templates again here.
Templates-Sep28.zip (27.4 KB)

tahir.manzoor · September 28, 2017, 5:27pm

@psluzhevsky,

Thanks for your inquiry. Please use the following code before saving the document in the GenerateReport method to get the desired output. Hope this helps you.

ExtractTags();
           
ReplaceTagsWithData();

mDoc.Range.Replace("[[TableStart]]", string.Empty, new FindReplaceOptions());
mDoc.Range.Replace("[[TableEnd]]", string.Empty, new FindReplaceOptions());

List list = null;
Paragraph para = (Paragraph)mDoc.Range.Bookmarks["TableBookmark"].BookmarkEnd.ParentNode;
if (para.IsListItem)
{
    list = para.ListFormat.List;
}
mDoc.Range.Bookmarks["TableBookmark"].Text = "";

if (list != null)
    ((Paragraph)mDoc.Range.Bookmarks["TableBookmark"].BookmarkEnd.ParentNode).ListFormat.List = list;

mDoc.Save(output);

Corsearch_IT · September 28, 2017, 9:22pm

I tried this solution, and it works fine only for the second case, where my [[TableStart]] and [[TableEnd]] tags are adjacent to the content.
It does not work for the first case, where my tags are on separate lines.

For clarity, I have modified my input document so that it shows both cases.
Template_List_INPUT-BothCases.zip (12.8 KB)

Please take a look and update.

tahir.manzoor · September 29, 2017, 4:24am

@psluzhevsky,

Thanks for your inquiry. The code works fine for both cases. Please check the attached input and output documents.
Docs.zip (73.9 KB)

If both cases are in the same document, you need to change the code accordingly. In this case, you need to create two bookmarks, e.g. TableBookmark1 and TableBookmark2. You need to extract the contents for both cases and set each bookmark's text to an empty string before saving the document. We have attached the modified code for this case.
POCExample.zip (2.2 KB)
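
The attached code is not reproduced here. As a rough sketch of the last step only (emptying several bookmarks before saving), something along these lines could be used. BookmarkCleaner and ClearBookmarks are hypothetical names, and the sketch leaves out the list-format handling shown in the earlier reply.

```csharp
using Aspose.Words;

// Hypothetical helper for illustration; not part of Aspose.Words.
public static class BookmarkCleaner
{
    // Sets the text of every named bookmark that exists to an empty string.
    // Returns how many bookmarks were cleared.
    public static int ClearBookmarks(Document doc, params string[] names)
    {
        int cleared = 0;
        foreach (string name in names)
        {
            Bookmark bookmark = doc.Range.Bookmarks[name]; // null when the name is not found
            if (bookmark == null)
                continue;

            bookmark.Text = string.Empty;
            cleared++;
        }
        return cleared;
    }
}
```

A call such as BookmarkCleaner.ClearBookmarks(mDoc, "TableBookmark1", "TableBookmark2") would then run just before mDoc.Save(output).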

nangiladna · October 7, 2017, 5:21am

First heard of this word program, happy to try

nangiladna · October 7, 2017, 5:24am

I learned something good. Do you have auto spell check?

nangiladna · October 7, 2017, 5:27am

I like to start making word program work

tahir.manzoor · October 7, 2017, 5:17pm

@nangiladna,

Thanks for your inquiry.

Please use Run.Font.NoProofing property. The value of this property is True when the formatted characters are not to be spell checked.
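
For instance, a small sketch that turns proofing off for every run in a document could look like this. ProofingHelper and DisableProofing are hypothetical names used only for this example.

```csharp
using Aspose.Words;

// Hypothetical helper for illustration; not part of Aspose.Words.
public static class ProofingHelper
{
    // Marks every run in the document as "do not check spelling or grammar".
    // Returns the number of runs that were updated.
    public static int DisableProofing(Document doc)
    {
        int count = 0;
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
        {
            run.Font.NoProofing = true;
            count++;
        }
        return count;
    }
}
```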

Could you please share some more detail about this query? We will then provide you more information on this along with code.