Find text and replace it with Bookmark using Java

Hi,

Please tell me whether my requirement can be met using Aspose.Words for Java. I have a string that contains a line, e.g. “The potato price is ~priceValue~”. In that string I would like to replace “~priceValue~” with a bookmark. After replacing “~priceValue~” with a bookmark, I will write the string, concatenated with the bookmark, to a Word document. Is this possible? If yes, please tell me how to achieve it.

Thanks
karthikeyan

Hi Karthikeyan,

Thanks for your query. Please use the following code snippet for your requirement. I hope this helps. Please let us know if you have any more queries.

public void ReplaceTextWithBookMark() throws Exception
{
    Document doc = new Document("D:\\in.docx");
    Pattern regex = Pattern.compile("~priceValue~", Pattern.CASE_INSENSITIVE);
    doc.getRange().replace(regex, new MyReplaceEvaluator(), true);
    doc.save("D:\\AsposeOut.docx");
}

private class MyReplaceEvaluator implements IReplacingCallback
{
    /**
     * This is called during a replace operation each time a match is found.
     * This method inserts a bookmark at the match position and removes the matched text.
     */
    public int replacing(ReplacingArgs e) throws Exception
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.getMatchNode();
        // The first (and maybe the only) run can contain text before the match;
        // in this case it is necessary to split the run.
        if (e.getMatchOffset() > 0)
            currentNode = splitRun((Run) currentNode, e.getMatchOffset());
        // This list is used to store all nodes of the match.
        ArrayList runs = new ArrayList();
        // Find all runs that contain parts of the match string.
        int remainingLength = e.getMatch().group().length();
        while ((remainingLength > 0) && (currentNode != null) && (currentNode.getText().length() <= remainingLength))
        {
            runs.add(currentNode);
            remainingLength = remainingLength - currentNode.getText().length();
            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.getNextSibling();
            } while ((currentNode != null) && (currentNode.getNodeType() != NodeType.RUN));
        }
        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            splitRun((Run) currentNode, remainingLength);
            runs.add(currentNode);
        }
        Document doc = (Document) e.getMatchNode().getParentNode().getDocument();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.moveTo(currentNode);
        builder.startBookmark("MyBookmark");
        builder.write("Text inside a bookmark.");
        builder.endBookmark("MyBookmark");
        // Now remove all runs in the sequence.
        for (Run run : (Iterable<Run>) runs)
        {
            run.remove();
        }
        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.SKIP;
    }

    // Splits the run at the given position and returns the new run that holds the text after it.
    private Run splitRun(Run run, int position) throws Exception
    {
        Run afterRun = (Run) run.deepClone(true);
        afterRun.setText(run.getText().substring(position));
        run.setText(run.getText().substring(0, position));
        run.getParentNode().insertAfter(afterRun, run);
        return afterRun;
    }
}

Hi Tahir,

Thanks for the code, but I have already tried it. Maybe I need to explain my requirement more clearly. The line “The potato price is ~priceValue~” comes from a database table column and is stored in a string. In that string I have to replace ~priceValue~ with a bookmark, and then write the resulting string, with the bookmark, to the document. That is the requirement. The code you sent, however, replaces ~priceValue~ with a bookmark only when the line “The potato price is ~priceValue~” has already been written to a document, and we would also need to know which bookmarks in the document correspond to the values from the database. Only then could we use the code you sent.

My feeling is that replacing part of a string with a bookmark is not possible. If it is possible, please advise.

Please also tell me whether the scenario below is achievable.

Suppose I have a document with multiple lines, some of which contain ~price1~, ~price2~ and so on. For example: The price of tomato is ~price1~. The price of the onion is ~price2~. The price of the potato is ~price3~. I have to read these lines one by one from the document, find the index of the ~ symbol and then the index of the next ~, so that I get the bookmark name, price1. In the same way I have to parse each line and replace ~price1~ with a bookmark at the position where ~price1~ appears in the document. Every keyword enclosed in ~ symbols has to be replaced with a bookmark across the whole document, and empty lines must be skipped. Please help me with how to achieve this.
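The index-based scan described here can be tried on plain strings before any document is involved. The following is a minimal sketch in plain Java with no Aspose.Words calls; the class name TildeScanner and the method namesInLine are made up for illustration, and the sample lines are the ones from the post.

```java
import java.util.ArrayList;
import java.util.List;

class TildeScanner {
    // Returns the names found between pairs of '~' in one line, in order of appearance.
    static List<String> namesInLine(String line) {
        List<String> names = new ArrayList<>();
        int open = line.indexOf('~');
        while (open >= 0) {
            int close = line.indexOf('~', open + 1);
            if (close < 0) {
                break; // unmatched opening tilde: nothing more to extract
            }
            String name = line.substring(open + 1, close);
            if (!name.isEmpty()) {
                names.add(name);
            }
            // Continue after the closing tilde so that pairs do not overlap.
            open = line.indexOf('~', close + 1);
        }
        return names;
    }

    public static void main(String[] args) {
        String[] lines = {
            "The price of tomato is ~price1~.",
            "",
            "The price of the onion is ~price2~.",
            "The price of the potato is ~price3~."
        };
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip empty lines
            }
            System.out.println(namesInLine(line));
        }
    }
}
```

For the sample lines this prints [price1], [price2] and [price3], one per line; the empty line is skipped. Turning each name into an actual bookmark still has to be done on the document itself.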

Thanks
karthikeyan.

Hi Karthikeyan,

Please accept my apologies for the late response.

I have tried to understand your query and, based on my understanding, you need the following:

  1. You have a string “The price of tomato is ~price1~”, read from a database, where price1 is a bookmark in your document.
  2. You want to replace the bookmark price1 in your document with the text “The price of tomato is ~price1~”.
  3. After inserting the text “The price of tomato is ~price1~” at bookmark price1, you do not want any empty lines.

Please confirm the above scenario so that we can share the code with you. If I have misunderstood your query, please share some more information about your issue. We are keen to help you but need some more information from your side.

Hi Tahir,
Suppose I have a document (“Test.docx”) that contains a bookmark I created, named bookmark1. Through the application, I first open this document (Test.docx). After that, I fetch a value from a database column and store it in a Java string variable. For example:
String tempValue = "The potato price is ~priceValue~";

Next, I have to replace ~priceValue~ in the tempValue string with the bookmark notation. For example, the value of the string after replacement should be:
tempValue = "The potato price is "+[[bookmark Node]].

Please note carefully that I am replacing ~priceValue~ with the bookmark node in the string itself. After that, I write tempValue into the document near “bookmark1”. If you then open the document (Test.docx), it should contain: The potato price is (MS Word bookmark symbol here).

This is my requirement.

If anything about the requirement is still unclear, please tell me and I will explain.

Thanks & Regards
karthikeyan.

Hi Karthikeyan,

Thanks for sharing more information.

String tempValue = "The potato price is ~priceValue~";

If you concatenate this string with a bookmark node, the value of tempValue becomes: The potato price is Aspose.Words.BookmarkStart

You cannot achieve the above scenario by concatenating the string with a bookmark node and inserting that string into the document. In your case, it seems that you want to insert text at the position of a bookmark, e.g. bookmark1, and replace that bookmark with another bookmark node, e.g. [[bookmark Node]].
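The reason concatenation cannot work is general Java behaviour rather than anything specific to Aspose.Words: appending an object to a string appends the result of its toString() method, not the object itself. A minimal sketch, using a hypothetical stand-in class (FakeBookmarkStart is not an Aspose.Words type):

```java
class ConcatDemo {
    // Hypothetical stand-in for a document node; NOT an Aspose.Words class.
    static final class FakeBookmarkStart {
        private final String name;

        FakeBookmarkStart(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return "FakeBookmarkStart(" + name + ")";
        }
    }

    // String concatenation converts the object to text by calling toString().
    static String concat(String text, Object node) {
        return text + node;
    }

    public static void main(String[] args) {
        String tempValue = concat("The potato price is ", new FakeBookmarkStart("priceValue"));
        // The result is an ordinary string; the node itself is not carried along.
        System.out.println(tempValue);
    }
}
```

The string that comes out contains only the textual description of the object, so a bookmark has to be created in the document itself rather than in the string.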

You can achieve the required output using the following code sample. I hope this helps. Please let us know if you have any more queries.

Document doc = new Document("D:\\Data\\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
//Get the bookmark where you want to insert the text
Bookmark bm = doc.getRange().getBookmarks().get("BM1");
//You can change the name of bookmark
bm.setName("BM1_Changed");
//You can change the text of Bookmark
bm.setText("BookMark text Changed");
//Go to the bookmark node start
builder.moveTo(bm.getBookmarkStart());
//insert text, get from database
builder.write("The potato price is ");
doc.save("D:\\Data\\AsposeOut.docx");

Hi Tahir,

Thanks for your reply. Once again I need your help, with the scenario below.

Suppose a document contains the following sentences:

UK Gold Price is ~ukprice~.
(new line)
USA Gold Price is ~usaprice~.
(new line)
India Gold Price is ~indiaprice~.
(new line)

Step 1: I have to read the document line by line.
Step 2: Suppose the first line read is “UK Gold Price is ~ukprice~.” From this line I have to get the string “~ukprice~”.
Step 3: Then I have to move to the position of the string ~ukprice~ in the document and create a bookmark named ukprice, i.e. the string between the tilde (~) symbols is the bookmark name.

I have to read the lines one by one and replace each string enclosed in tildes (e.g. ~ukprice~) with a bookmark.

Very important: I have to skip empty lines. There may be one or many bookmark placeholders (i.e. strings enclosed in tildes, such as “~ukprice~”) in a single line.
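The matching part of these steps (several placeholders per line, empty lines skipped, name taken from between the tildes) can be sketched with java.util.regex alone. This is only an illustration on plain text, not on a Word document; the class name PlaceholderFinder and the method bookmarkNames are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderFinder {
    // Lazy quantifier so that each pair of tildes is matched separately;
    // group 1 is the text between the tildes.
    private static final Pattern PLACEHOLDER = Pattern.compile("~(.+?)~");

    // Collects the bookmark names from all non-empty lines of the text.
    static List<String> bookmarkNames(String text) {
        List<String> names = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip empty lines
            }
            Matcher matcher = PLACEHOLDER.matcher(line);
            while (matcher.find()) {
                names.add(matcher.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String text = "UK Gold Price is ~ukprice~.\n\n"
                + "USA Gold Price is ~usaprice~.\n\n"
                + "India Gold Price is ~indiaprice~.\n";
        System.out.println(bookmarkNames(text));
    }
}
```

For the sample text this prints [ukprice, usaprice, indiaprice]. In a document, the same pattern can be handed to a replace callback, which then creates the bookmark at the position of each match.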

Kindly help me with how to implement this.

Thanks
karthikeyan.

Hi Karthikeyan,

Thanks for sharing the information. I am working on your query and will update you asap.

Hi Karthikeyan,

Please use the following code snippet for your requirement. Please find the input and output documents in attachment.

Document doc = new Document("D:\\in.docx");
Pattern regex = Pattern.compile("~(.*?)~");
doc.getRange().replace(regex, new MyReplaceEvaluator(), true);
doc.save("D:\\AsposeOut.docx");
    private class MyReplaceEvaluator implements IReplacingCallback
    {
        /**
         * This is called during a replace operation each time a match is found.
         * This method replaces the matched text with a bookmark that contains it.
         */
        public int replacing(ReplacingArgs e) throws Exception
        {
            // This is a Run node that contains either the beginning or the complete match.
            Node currentNode = e.getMatchNode();
            // The first (and may be the only) run can contain text before the match,
            // in this case it is necessary to split the run.
            if (e.getMatchOffset() > 0)
                currentNode = splitRun((Run)currentNode, e.getMatchOffset());
            // This array is used to store all nodes of the match so that they can be removed later.
            ArrayList runs = new ArrayList();
            // Find all runs that contain parts of the match string.
            int remainingLength = e.getMatch().group().length();
            while (
                    (remainingLength > 0) &&
                            (currentNode != null) &&
                            (currentNode.getText().length() <= remainingLength))
            {
                runs.add(currentNode);
                remainingLength = remainingLength - currentNode.getText().length();
                // Select the next Run node.
                // Have to loop because there could be other nodes such as BookmarkStart etc.
                do
                {
                    currentNode = currentNode.getNextSibling();
                }
                while ((currentNode != null) && (currentNode.getNodeType() != NodeType.RUN));
            }
            // Split the last run that contains the match if there is any text left.
            if ((currentNode != null) && (remainingLength > 0))
            {
                splitRun((Run)currentNode, remainingLength);
                runs.add(currentNode);
            }
            Document doc = (Document)e.getMatchNode().getParentNode().getDocument();
            DocumentBuilder builder = new DocumentBuilder(doc);
            builder.moveTo(currentNode);
            // Use the text between the tildes (capture group 1) as the bookmark name.
            builder.startBookmark(e.getMatch().group(1));
            builder.write(e.getMatch().group(0));
            builder.endBookmark(e.getMatch().group(1));
            // Now remove all runs in the sequence.
            for (Run run : (Iterable<Run>) runs)
            {
                run.remove();
            }
            // Signal to the replace engine to do nothing because we have already done all what we wanted.
            return ReplaceAction.SKIP;
        }

        private Run splitRun(Run run, int position) throws Exception
        {
            Run afterRun = (Run)run.deepClone(true);
            afterRun.setText(run.getText().substring(position));
            run.setText(run.getText().substring((0), (0) + (position)));
            run.getParentNode().insertAfter(afterRun, run);
            return afterRun;
        }
    }

Hi Tahir,

First of all, sorry for the delayed reply. Today I tested the code, and it works fine.
I will write if I have any issues.

Thank you, Tahir.

Hi Karthikeyan,

Thanks for your feedback. Please let us know if you have any more queries. We are always glad to help you.

Hi Tahir,

Is there any way to avoid the builder.write statement between startBookmark and endBookmark?

Thanks
karthikeyan

Hi Karthikeyan,

Thanks for your query. Could you please share some more information about it?

Hi Tahir,

In the code, while creating the bookmark, we write some text between
startBookmark and endBookmark. For example:

builder.startBookmark("Price");
builder.write("priceValue"); // I don’t want priceValue to be written inside the bookmark.
builder.endBookmark("Price");

The above code creates a bookmark in the document as [priceValue]. I need to avoid the builder.write(“priceValue”) statement. I tried an empty string (builder.write("")), but I get an exception.

Thanks
karthikeyan

Hi Karthikeyan,

Thanks for sharing the information. In your scenario, you do not need to call write("") between BookmarkStart and BookmarkEnd. Please use the following code snippet to create a bookmark with empty text.

builder.startBookmark("Price");
builder.endBookmark("Price");

Hope this answers your query. Please let us know if you have any more queries.

Hi Tahir,

If I remove the line, I get an exception saying “Cannot remove because there is no parent”.

Please tell me how to overcome this.

Thanks
karthikeyan.

Hi Karthikeyan,

I have tested the code shared in this thread and reproduced the exception. It occurs in the for loop that removes the Run nodes. Please modify your code as shown below:

builder.startBookmark(e.getMatch().group(0));
builder.endBookmark(e.getMatch().group(0));
// Now remove all runs in the sequence.
for (Run run : (Iterable<Run>)runs)
{
    if (run.toTxt().equals(""))
        run.remove();
}

If you still face the problem, please share your document along with your code for investigation.

Hi Tahir,

As you requested, here is my main class.

public class Test
{
    public static void main(String[] args) throws Exception
    {
        Document doc = new Document("C:\\Temp\\Test.doc");
        Pattern regex = Pattern.compile("~(.*?)~");
        doc.getRange().replace(regex, new ReplacingCallbackImpl(), true);
        doc.save("C:\\Temp\\Test.doc");
    }
}

ReplacingCallbackImpl --> Implementation

package com.csdcsystems.amanda.common;

import java.util.ArrayList;

import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.IReplacingCallback;
import com.aspose.words.Node;
import com.aspose.words.ReplaceAction;
import com.aspose.words.ReplacingArgs;
import com.aspose.words.Run;

public class ReplacingCallbackImpl implements IReplacingCallback
{
    public int replacing(ReplacingArgs e) throws Exception {
        Node currentNode = e.getMatchNode();
        if (e.getMatchOffset() > 0)
            currentNode = splitRun((Run)currentNode, e.getMatchOffset());

        ArrayList runs = new ArrayList();
        int remainingLength = e.getMatch().group().length();

        while (
                (remainingLength > 0) &&
                        (currentNode != null) &&
                        (currentNode.getText().length() <= remainingLength)){

            runs.add(currentNode);
            remainingLength = remainingLength - currentNode.getText().length();
        }

        if ((currentNode != null) && (remainingLength > 0)){
            splitRun((Run)currentNode, remainingLength);
            runs.add(currentNode);
        }

        Document doc = (Document)e.getMatchNode().getParentNode().getDocument();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.moveTo(currentNode);
        String name = e.getMatch().group(0).substring(1,e.getMatch().group(0).length()-1);
        builder.startBookmark(name);
        // builder.write(e.getMatch().group(0));
        builder.endBookmark(name);

        for (Run run : (Iterable<Run>) runs){
            if(run.getText().equals(""))
                run.remove();
        }
        return ReplaceAction.SKIP;
    }

    private static Run splitRun(Run run, int position) throws Exception {
        Run afterRun = (Run)run.deepClone(true);
        afterRun.setText(run.getText().substring(position));
        run.setText(run.getText().substring((0), (0) + (position)));
        run.getParentNode().insertAfter(afterRun, run);
        return afterRun;
    }
}

Please help with this. I have also attached the document to this post.

Thanks
karthikeyan. R

Hi Karthikeyan,

Thanks for sharing the information. Please use the following code snippet and find input and output documents in attachment.

public class ReplacingCallbackImpl implements IReplacingCallback
{
    public int replacing(ReplacingArgs e) throws Exception {
        Node currentNode = e.getMatchNode();
        if (e.getMatchOffset() > 0)
            currentNode = splitRun((Run)currentNode, e.getMatchOffset());
        ArrayList runs = new ArrayList();
        int remainingLength = e.getMatch().group().length();
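        while (
                (remainingLength > 0) &&
                        (currentNode != null) &&
                        (currentNode.getText().length() <= remainingLength)){
            runs.add(currentNode);
            remainingLength = remainingLength - currentNode.getText().length();
            // Advance to the next Run; other node types such as BookmarkStart are skipped.
            do {
                currentNode = currentNode.getNextSibling();
            } while ((currentNode != null) && (currentNode.getNodeType() != NodeType.RUN));
        }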
        if ((currentNode != null) && (remainingLength > 0)){
            splitRun((Run)currentNode, remainingLength);
            runs.add(currentNode);
        }
        Document doc = (Document)e.getMatchNode().getParentNode().getDocument();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.moveTo(currentNode);
        String name = e.getMatch().group(0).substring(1,e.getMatch().group(0).length()-1);
        builder.startBookmark(name);
        // builder.write(e.getMatch().group(0));
        builder.endBookmark(name);
        String txt = e.getMatch().group(0).toString().trim();
        matchtxt.add(txt);
        // for (Run run : (Iterable) runs){
        // run.remove(); 
        //} 
        return ReplaceAction.SKIP;
    }
    private Run splitRun(Run run, int position) throws Exception {
        Run afterRun = (Run)run.deepClone(true);
        afterRun.setText(run.getText().substring(position));
        run.setText(run.getText().substring((0), (0) + (position)));
        run.getParentNode().insertAfter(afterRun, run);
        return afterRun;
    }
}
// matchtxt must be accessible from ReplacingCallbackImpl (for example, as a field of the enclosing class).
ArrayList matchtxt = new ArrayList();
Document doc = new Document("D:\\in.docx");
Pattern regex = Pattern.compile("~(.*?)~");
doc.getRange().replace(regex, new ReplacingCallbackImpl(), true);
// Remove the placeholder text that the callback left in place.
for (int i = 0; i < matchtxt.size(); i++)
{
    doc.getRange().replace(matchtxt.get(i).toString(), "", false, false);
}
doc.save("D:\\AsposeOut.docx");

Please let us know if you have any more queries.

Hello, I have similar functionality that needs to replace text with a bookmark. I’m using Aspose.Words for Android via Java. The code works fine, except that in certain scenarios, depending on the text length, it fails with the error below. One of the sample texts I’ve used is “%SIGNATUREBG%Hellohi%SIGNATUREBG%”:
“java.lang.StringIndexOutOfBoundsException: length=13; index=20”

The exception is thrown on this line of code:

afterRun.text = run.text.substring(position)

The first replacement works fine. The exception is thrown on the second replacement.

Here are the code snippets in Kotlin:

val signatureBookmarkOptions = FindReplaceOptions()
signatureBookmarkOptions.direction = FindReplaceDirection.FORWARD
signatureBookmarkOptions.replacingCallback = ReplacingCallbackImpl()
 
templateDocument.range.replace("%SIGNATUREBG%",
    "",
    signatureBookmarkOptions,
)
ReplacingCallbackImpl
internal class ReplacingCallbackImpl : IReplacingCallback {
    var i = 0
 
    @Throws(java.lang.Exception::class)
    override fun replacing(e: ReplacingArgs): Int {
        var currentNode = e.matchNode
        if (e.matchOffset > 0) {
            currentNode = splitRun(currentNode as Run, e.matchOffset)
        }
        val runs = ArrayList<Any>()
 
        // Find all runs that contain parts of the match string.
        var remainingLength = e.match.group().length
        while (remainingLength > 0 && currentNode != null &&
            currentNode.text.length <= remainingLength) {
            runs.add(currentNode)
            remainingLength -= currentNode.text.length
 
            do {
                currentNode = currentNode!!.nextSibling
            } while (currentNode != null && currentNode.nodeType !== NodeType.RUN)
        }
 
        if (currentNode != null && remainingLength > 0) {
            splitRun(currentNode as Run, remainingLength)
            runs.add(currentNode)
        }
        val builder = DocumentBuilder(e.matchNode.document as Document)
        builder.moveTo(runs[0] as Run)
        val name = e.match.group(0).substring(1, e.match.group(0).length - 1) + i++
        builder.startBookmark(name)
        builder.endBookmark(name)
        for (run in runs as Iterable<Run>) {
            run.remove()
        }
        return ReplaceAction.SKIP
    }
 
    @Throws(java.lang.Exception::class)
    private fun splitRun(run: Run, position: Int): Run? {
        val afterRun = run.deepClone(true) as Run
        afterRun.text = run.text.substring(position)
        run.text = run.text.substring(0, position)
        run.parentNode.insertAfter(afterRun, run)
        return afterRun
    }
}

What I’m trying to achieve is to convert “%SIGNATUREBG%Hellohi%SIGNATUREBG%” to “Hellohi”, with two bookmarks inserted, one before and one after “Hellohi”.
Please advise, thank you :).
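The numbers in the exception message can be checked with plain string arithmetic. “%SIGNATUREBG%” is 13 characters long, and the second marker starts at index 20 of the sample text (13 + 7 for “Hellohi”). This is consistent with the second call applying an offset of 20 to a run whose text is only 13 characters long, although that reading is an inference from the message, not something confirmed in this thread. The sketch below is in Java, reproduces only the substring logic of splitRun on strings, and uses a made-up class name (SplitOffsetDemo) with a bounds check added for illustration.

```java
class SplitOffsetDemo {
    // Mirrors the string arithmetic of splitRun: returns {textBefore, textAfter}.
    // The bounds check is an addition for this example; the posted splitRun has none.
    static String[] split(String text, int position) {
        if (position < 0 || position > text.length()) {
            throw new IllegalArgumentException(
                    "offset " + position + " is outside text of length " + text.length());
        }
        return new String[] { text.substring(0, position), text.substring(position) };
    }

    public static void main(String[] args) {
        String marker = "%SIGNATUREBG%";
        String original = marker + "Hellohi" + marker;

        System.out.println(marker.length());             // 13
        System.out.println(original.indexOf(marker, 1)); // 20

        // Splitting the full text at the end of the first marker is fine.
        System.out.println(split(original, 13)[1]);

        // Applying the second offset to a 13-character text cannot work.
        try {
            split(marker, 20);
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

If this reading is right, one thing to check in the callback is whether the node and offset reported for the second match still describe the run as it was before the first call split and removed runs.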