Find text and replace it with an RTF document using Java

Thamizh · June 5, 2012, 9:38am

Hi,

Thanks for your reply.

Regarding your previous mail, I have attached the sample code and the input, RTF and output documents for your reference.

Please see the red lines I have marked in the output document (Output.docx) after inserting the RTF document (bw_raw.rtf) into the original document (Input.docx).

I tried to replace the text “bw_raw_rpt” in Input.docx with the contents of the RTF document (bw_raw.rtf). But as you can see in the output document (Output.docx), the contents of the RTF document have been mingled (collapsed) with the previous and next paragraphs.

I want a clean output: the RTF document should be inserted in place of that text without disturbing the previous or next paragraph, and it should not collapse into the previous or next page.

Please do the needful.

Thanks,

Thamizh

tahir.manzoor · June 6, 2012, 6:15am

Hi Thamizh,

Thanks for your query. I have worked with your shared document and found that this is not an issue. Please note that the Aspose.Words layout engine tries to mimic the way Microsoft Word’s layout engine works.

Please copy the contents of the RTF file and paste them under {bw_raw_rpt,rot=300L} in the Input.docx file. You will get the same output.

If you use an INCLUDETEXT field or insert text from the file using MS Word, you will also get the same output. However, I am working with your documents and will let you know about a solution as soon as possible.

Hope this answers your query. Please let us know if you have any more queries.

Thamizh · June 7, 2012, 4:05am

I want to insert the RTF document by replacing the text using Aspose.Words for Java.

Then why should I copy the contents of the file and paste them into the input document? My task is to insert the file using Java; that is why we use Aspose.

It is clear that the inserted document is interfering with the previous and next paragraphs.

Please investigate this and get back to me as soon as possible.

I need your immediate help.

Thanks,

Thamizh

tahir.manzoor · June 8, 2012, 3:34am

Hi Thamizh,

Please accept my apologies for the inconvenience. This is not an issue with Aspose.Words; as I shared with you earlier, the Aspose.Words layout engine tries to mimic the way Microsoft Word’s layout engine works. I will try to find a solution for this scenario and update you as soon as possible.

Thamizh · June 12, 2012, 4:40am

May I know the status of the above query? Please investigate with the documents and reply.

tahir.manzoor · June 13, 2012, 4:46am

Hi Thamizh,

Please accept my apologies for the late response.

I have worked with your DOCX and RTF documents in detail. The contents of the DOCX are messed up after inserting the RTF because of floating text frames. As I shared with you earlier, this is not an issue. Please check my reply at the following link.

https://forum.aspose.com/t/57478

We apologize for the inconvenience.

Thamizh · June 13, 2012, 7:09am

Sir, then what is the solution for this problem? I don't understand what floating text frames are. I want to replace the particular text without disturbing the previous or next paragraphs or pages; the document should just be inserted without any other changes to the DOCX. Can you assist me with some code clarifications, or give any other sample code for inserting the document without this problem?

tahir.manzoor · June 13, 2012, 8:08am

Hi Thamizh,

Your RTF file contains text in the form of text frames. You can select text frames and change their position. Please see the attached images (Floating_Text_Frame_1.png and Floating_Text_Frame_2.png).

Please note that the Aspose.Words layout engine tries to mimic the way Microsoft Word’s layout engine works. Please open your input DOCX file in MS Word and put the cursor under {bw_raw_rpt,rot=300L}. Select the option “insert text from file” from the Insert tab. Insert your RTF file and see MS Word’s behavior.

Aspose.Words and MS Word have the same behavior in your scenario. Please use an RTF file without text frames to insert into the DOCX file.

Hope this answers your query. Please let us know if you have any more queries.

Thamizh · June 13, 2012, 9:02am

I understand the problem now. Thank you for the effort you have taken to resolve my issue.

However, I cannot change the frame positions manually in the DOCX file, since I have to insert the RTF into the DOCX input file through Java code using Aspose.

Is it possible to add page breaks at the start and end of the RTF document and then insert the RTF file into the DOCX, so that I get the required output? When I tried to insert page breaks using builder.insertBreak(), it did not work. I have tried every way I could think of and could not stop the mess-up.

Can you check my code and tell me whether I can insert the RTF document with page breaks at the start and end?

I am fully depending on you to resolve this issue.

Regards,

Thamizh

tahir.manzoor · June 14, 2012, 2:06am

Hi Thamizh,

Inserting a page break alone does not solve this issue. You can solve it by placing an empty document before the RTF, that is, by appending the RTF to a new empty document and inserting the result. Please see the code below:

// Requires: import com.aspose.words.*; import java.util.ArrayList;
private class MyReplaceEvaluatorRTF implements IReplacingCallback
{
    /**
     * This is called during a replace operation each time a match is found.
     * This method inserts the RTF document at the match and removes the matched text.
     */
    public int replacing(ReplacingArgs e) throws Exception
    {
        // Append the RTF to a new, empty document so that empty content precedes the RTF.
        Document docEmpty = new Document();
        Document rtfDoc = new Document("D:\\bw_raw.rtf");
        docEmpty.appendDocument(rtfDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);

        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.getMatchNode();

        // The first (and maybe the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.getMatchOffset() > 0)
            currentNode = splitRun((Run)currentNode, e.getMatchOffset());

        // This list stores all runs of the match so that they can be removed later.
        ArrayList<Run> runs = new ArrayList<Run>();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.getMatch().group().length();
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.getText().length() <= remainingLength))
        {
            runs.add((Run)currentNode);
            remainingLength = remainingLength - currentNode.getText().length();

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.getNextSibling();
            }
            while ((currentNode != null) && (currentNode.getNodeType() != NodeType.RUN));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            splitRun((Run)currentNode, remainingLength);
            runs.add((Run)currentNode);
        }

        Document doc = (Document)e.getMatchNode().getParentNode().getDocument();
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Insert a page break at the node that follows the match, then insert the document.
        // insertDocument is a helper method that is not shown in this post.
        builder.moveTo(currentNode);
        builder.insertBreak(BreakType.PAGE_BREAK);
        insertDocument((Paragraph)currentNode.getParentNode(), docEmpty);

        // Now remove all runs in the sequence.
        for (Run run : runs)
        {
            run.remove();
        }

        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.SKIP;
    }

    /**
     * Splits the text of the given run at the given position and returns the new run holding the text after it.
     */
    private Run splitRun(Run run, int position) throws Exception
    {
        Run afterRun = (Run)run.deepClone(true);
        afterRun.setText(run.getText().substring(position));
        run.setText(run.getText().substring(0, position));
        run.getParentNode().insertAfter(afterRun, run);
        return afterRun;
    }
}
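
The run-splitting part of the callback can be easier to follow in isolation. The following is only a rough sketch in plain Java with no Aspose types: each String in the list stands in for the text of one Run node, and the class and method names (RunSplitSketch, splitRun, removeMatch) are hypothetical, chosen just for this illustration. It shows how a match that spans several runs is cut out by splitting the first and last runs at the match boundaries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the callback's logic: Strings play the role of Run nodes.
class RunSplitSketch {

    // Splits runs.get(index) at position; returns the index of the part after the split.
    public static int splitRun(List<String> runs, int index, int position) {
        String text = runs.get(index);
        runs.set(index, text.substring(0, position));
        runs.add(index + 1, text.substring(position));
        return index + 1;
    }

    // Removes a match that starts matchOffset characters into runs.get(startIndex)
    // and is matchLength characters long. Returns the index where a replacement
    // could be inserted.
    public static int removeMatch(List<String> runs, int startIndex,
                                  int matchOffset, int matchLength) {
        int current = startIndex;
        // Text before the match stays in its own run.
        if (matchOffset > 0) {
            current = splitRun(runs, current, matchOffset);
        }
        int first = current;
        int remaining = matchLength;
        // Consume every run that lies entirely inside the match.
        while (remaining > 0 && current < runs.size()
                && runs.get(current).length() <= remaining) {
            remaining -= runs.get(current).length();
            current++;
        }
        // The last run may contain text after the match, so split it too.
        if (remaining > 0 && current < runs.size()) {
            splitRun(runs, current, remaining);
            current++;
        }
        // Drop the runs that made up the match.
        runs.subList(first, current).clear();
        return first;
    }

    public static void main(String[] args) {
        List<String> runs = new ArrayList<>(Arrays.asList("See bw_", "raw_", "rpt here"));
        int at = removeMatch(runs, 0, 4, "bw_raw_rpt".length());
        runs.add(at, "<RTF content>"); // placeholder for the inserted document
        System.out.println(runs);
    }
}
```

In this sketch the placeholder ends up between the text before the match and the text after it, which is where the callback inserts the RTF content in the real document.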

Thamizh · June 20, 2012, 12:40am

Hi Tahir,

Thank you so much. At last it worked. This code is very useful for inserting the document without any distortion.

I really appreciate the support you have provided from the beginning. I won’t forget it. Thanks once again.

Regards,

Thamizh