Greetings, everyone.
I'm trying to perform text replacements, but I've run into a problem: if the captured text contains \n, \r, \t, or other special (escaped?) characters, the replacement doesn't work. For example:
paragraph.getRange().replace("PUBLIC \r\r", replacementText, options); // Does not work, even though paragraph.getText() = "PUBLIC \r\r"
How can I achieve this?
@MindControl
To handle special characters like \n, \r, and \t in text replacements using Aspose.Words, you need to ensure that the replacement method can recognize these characters correctly. The Range.replace method should be able to process these special characters if they are properly formatted in the search string.
Here’s a general approach you can take:
- Use Regular Expressions: If you are not already using regular expressions, consider enabling this feature. Regular expressions can help match special characters more effectively. You can set the FindReplaceOptions to use regular expressions.
- Escape Special Characters: When using special characters in your search string, ensure they are escaped properly. For example, you might need to use double backslashes (\\n, \\r, \\t) in your search string to represent these characters correctly.
- Example Code: Here’s an example of how you might implement this:
// Create FindReplaceOptions and enable regular expressions
FindReplaceOptions options = new FindReplaceOptions();
options.setUseRegularExpressions(true);
// Replace text with special characters
String searchText = "PUBLIC \\r\\r"; // Escape special characters
String replacementText = "NEW TEXT";
// Perform the replacement
paragraph.getRange().replace(searchText, replacementText, options);
By following these steps, you should be able to perform text replacements that include special characters without issues. If you continue to experience problems, double-check the formatting of your search string and ensure that the text you are trying to replace matches exactly what is in the document.
If you need further assistance, feel free to ask!
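Whether double backslashes help depends on whether the search string is treated as literal text or as a regular expression. The sketch below uses only java.util.regex, independent of Aspose.Words, to show the difference. The class and method names are made up for illustration, and it makes no claim about how Range.replace handles paragraph breaks internally.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Literal comparison: the search string must contain exactly the same characters.
    static boolean foundAsLiteral(String text, String search) {
        return text.contains(search);
    }

    // Regex comparison: escapes such as \r in the pattern are interpreted by the regex engine.
    static boolean foundAsRegex(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "PUBLIC \r\r";       // two real carriage returns
        String escaped = "PUBLIC \\r\\r";  // backslash followed by 'r', twice

        System.out.println(foundAsLiteral(text, escaped)); // false: the characters differ
        System.out.println(foundAsRegex(text, escaped));   // true: \r matches a carriage return

        // Pattern.quote turns arbitrary text into a pattern that matches it literally.
        System.out.println(foundAsRegex(text, Pattern.quote(text))); // true
    }
}
```

So "PUBLIC \\r\\r" can only ever match real carriage returns when it is compiled as a pattern; passed as a plain string it looks for a literal backslash followed by r.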
@MindControl To replace line and paragraph breaks you should use special metacharacters like &l and &p. Please see our documentation for more information:
https://reference.aspose.com/words/java/com.aspose.words/range/#replace-java.lang.String-java.lang.String
Thank you Alexey.
Is there some kind of lookup table for replacements?
For example, ‘\r’ → &p (paragraph break).
Also, I noticed that even \r\r works when replaced with just one &p (paragraph break).
@MindControl Actually, in MS Word documents there is no paragraph break character, since paragraphs are represented as separate nodes in the DOM. Could you please attach your input document and text you would like to replace? We will check them and provide you more information.
For example, I try to replace the text:
PUBLIC \rRESTRICTED\r\r
with
PUBLIC &pRESTRICTED&p
and it doesn’t work.
Also, how do I deal with vertical tabs (\x0b)?
Example:
Text \x0bSource\r
@MindControl Could you please attach your input and expected output documents so that we can test your scenario on our side?
input.docx (21.1 KB)
output.docx (21.1 KB)
Please review the input file and the expected output file. Ensure that the footers are taken into consideration.
@MindControl You can use code like this to get the expected output:
Document doc = new Document("C:\\Temp\\in.docx");
FindReplaceOptions opt = new FindReplaceOptions();
opt.setUseSubstitutions(true);
doc.getRange().replace(Pattern.compile("Source(\\d+)", Pattern.CASE_INSENSITIVE), "Replaced$1", opt);
doc.save("C:\\Temp\\out.docx");
Alexey, thanks for the code provided.
However, the issue isn’t with the numbers.
The problem is that I first extract the text from the paragraph using method:
paragraph.toString(SaveFormat.TEXT)
and it gets pulled in this form:
Source1. \x0bSource\r
;
SOURCE4 \r\nSOURCE3\r\n\r\n
Then, I try to use this text as a key for replacement, but it doesn’t work:
paragraph.getRange().replace("SOURCE4 \r\nSOURCE3\r\n\r\n", replacementText, options); // doesn't work
Following your advice, I replaced some of the escaped characters (\n, \r, etc.) with a paragraph break &p, and it works inconsistently:
paragraph.getRange().replace("SOURCE4 &pSOURCE3&p", replacementText, options); // works sometimes
I haven’t fully grasped the logic yet. Additionally, it’s unclear how to handle other special characters that appear in the text (\x0b).
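When a key taken from extracted text fails to match, it can help to print exactly which control characters the string contains before using it. A minimal sketch in plain Java; the class and method names are made up for illustration and are not part of Aspose.Words.

```java
class ShowControlChars {
    // Returns the input with control characters replaced by visible escapes.
    static String visible(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\r': out.append("\\r"); break;
                case '\n': out.append("\\n"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Any other control character is shown as a two-digit hex escape.
                        out.append(String.format("\\x%02x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The vertical tab (code 0x0b) and the carriage return become visible.
        System.out.println(visible("Source1. \u000bSource\r")); // prints Source1. \x0bSource\r
    }
}
```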
@MindControl \x0b is a soft line break (Shift+Enter); you can use the &l metacharacter to replace it.
Document doc = new Document("C:\\Temp\\in.docx");
doc.getRange().replace("Source1. &lSource2: 2024-9-19", "Replaced");
doc.save("C:\\Temp\\out.docx");
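The mappings mentioned in this thread (\r to &p, \x0b to &l) can be collected into a small helper that turns extracted text into a search string. This is only a sketch in plain Java: the class and method names are hypothetical, the treatment of \r\n as a single paragraph break and the use of && for a literal ampersand are assumptions that should be checked against the documentation, and rewriting the string does not by itself guarantee that the text will be found in the document.

```java
class MetaChars {
    // Rewrites raw control characters as the metacharacters used by Range.replace.
    static String toSearchPattern(String extracted) {
        return extracted
                .replace("&", "&&")        // assumption: "&&" stands for a literal ampersand
                .replace("\r\n", "&p")     // assumption: text export ends paragraphs with CR+LF
                .replace("\r", "&p")       // paragraph break
                .replace("\u000b", "&l");  // manual line break (Shift+Enter)
    }

    public static void main(String[] args) {
        System.out.println(toSearchPattern("Source1. \u000bSource\r"));        // prints Source1. &lSource&p
        System.out.println(toSearchPattern("SOURCE4 \r\nSOURCE3\r\n\r\n"));    // prints SOURCE4 &pSOURCE3&p&p
    }
}
```

The result could then be passed as the first argument of paragraph.getRange().replace(...) in place of the raw extracted text.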
Regarding the text in the footer: I am afraid there is no way to replace it as a whole, since the text is in two separate shapes.