Range.Replace inserting extra hex characters


#1

Hi,

Why does the Range.Replace method insert the hex character \x0000 between each character of the new string?

e.g.
Using Range.Replace("Cat", "Dog") and viewing the Word document in a hex editor before and after the change, I see "Cat" change to the equivalent of "\x0000 D \x0000 o \x0000 g \x0000".

This isn’t a problem if you are replacing text with text, because a user viewing the document in Word doesn’t notice the difference. However, I am trying to replace text with hex characters, and the extra \x0000 characters are causing me problems.

I am using an evaluation copy of Aspose.Word.

Thanks.
Chris


#2

Hi Chris,

Thank you for considering Aspose.

The point is that Aspose.Word writes all text as Unicode. Even if the document was not in Unicode, it gets written as Unicode.
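As a rough illustration of why a hex editor shows a zero byte next to each letter, the sketch below encodes the same string as ASCII and as 16-bit Unicode (UTF-16, little-endian) using only the standard .NET Encoding classes. It does not touch Aspose.Word at all; the class and method names (Utf16Demo, ToHex) are made up for the example.

```csharp
using System;
using System.Text;

public static class Utf16Demo
{
    // Returns the bytes of `text` in the given encoding as space-separated hex.
    public static string ToHex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text)).Replace("-", " ");
    }

    public static void Main()
    {
        // One byte per character.
        Console.WriteLine(ToHex("Dog", Encoding.ASCII));   // 44 6F 67
        // Two bytes per character; the second byte is zero for plain ASCII letters.
        Console.WriteLine(ToHex("Dog", Encoding.Unicode)); // 44 00 6F 00 67 00
    }
}
```

The zero bytes are simply the upper half of each 16-bit character, not extra characters inserted into the text.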


#3

Is it possible to change or, probably safer, to overload the Range.Replace method to make text insertion as Unicode optional?


#4

We are having trouble understanding why this would be necessary. Please tell us more about why Unicode versus non-Unicode matters in this case and why you need a hex viewer to examine the content of the document.


#5

I have a requirement to produce a web app that allows users to create template Word documents through the browser. Currently my web app marks the fields to be replaced by data from the data source using the {{…}} format. I have something working that replicates the behaviour of MailMerge.Execute to create my Word documents.

However, trying to replicate the RemoveEmptyParagraphs behaviour is getting messy, as there are numerous combinations the user can supply that I need to handle, and you can guarantee it won’t take long for someone to create a template document using fields in a way I hadn’t thought of.

So I decided to try and use Range.Replace to replace my {{…}} fields with real MailMerge fields.

e.g.
// Strip off the surrounding {{ }} characters of the {{…}} match to get the
// column name.
oldValue = myMatch.ToString();
colName = oldValue.Substring(2, oldValue.Length - 4);

// \x0013, \x0014 and \x0015 are the field start, separator and end characters.
// The backslash before * must be escaped inside a C# string literal.
newValue = "\x0013" + " MERGEFIELD " + colName + " \\* MERGEFORMAT " + "\x0014" + colName + "\x0015";

dstDoc.Range.Replace(oldValue, newValue, true);

Obviously this does not work since, as you kindly pointed out, Range.Replace writes the replacement text as Unicode. I was using the hex editor to try to understand what I should replace the {{…}} with to get a MailMerge field, and then to understand why it wasn’t working.
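For reference, the placeholder-stripping step can be sketched on its own with the standard .NET Regex class. This is only an illustration of the idea, not my actual app code: the pattern and the names PlaceholderDemo and ColumnNames are assumptions, and the capture group returns the same text as the Substring(2, Length - 4) call above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderDemo
{
    // Assumed pattern: "{{", then a name containing no braces, then "}}".
    private static readonly Regex Placeholder = new Regex(@"\{\{([^{}]+)\}\}");

    // Returns the column names found in `text`, in order of appearance.
    public static List<string> ColumnNames(string text)
    {
        var names = new List<string>();
        foreach (Match m in Placeholder.Matches(text))
        {
            // Group 1 is the match without its surrounding {{ }}.
            names.Add(m.Groups[1].Value);
        }
        return names;
    }

    public static void Main()
    {
        foreach (string name in ColumnNames("Dear {{FirstName}} {{LastName}},"))
        {
            Console.WriteLine(name);
        }
    }
}
```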

Chris


#6

The extra zero bytes and Unicode are not the reason it is not working.

It is not working because you cannot insert a field like this. A field in a document is more than just a string; it also has internal data structures associated with it.

At the moment the only way to insert a valid field into a document is to use DocumentBuilder.InsertField.

Sorry, it is hard or impossible to use Aspose.Word for your task right now. We are working hard to allow easy manipulation of document content, as in the MS Word object model, so hopefully it will be able to do what you want at some point in the future.

At the moment your only options in Aspose.Word are bookmarks, merge fields, MailMerge and DocumentBuilder.
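As a rough sketch of the DocumentBuilder route, the snippet below moves to a bookmark that already exists in the template and inserts a merge field there. Treat it as an outline under assumptions rather than tested code for your build: the class and method names follow the current Aspose.Words for .NET API (the Aspose.Words namespace and a Document constructor that takes a file path), which may differ in the evaluation copy you have, and the file names and the bookmark name "CustomerName" are hypothetical.

```csharp
using Aspose.Words; // assumption: namespace of the current Aspose.Words for .NET library

public static class InsertFieldSketch
{
    public static void Main()
    {
        // Hypothetical template that already contains a bookmark named "CustomerName".
        Document doc = new Document("Template.doc");
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Position the builder at the existing bookmark.
        builder.MoveToBookmark("CustomerName");

        // Insert a real field: the field code, then the placeholder result text.
        builder.InsertField("MERGEFIELD CustomerName \\* MERGEFORMAT", "«CustomerName»");

        doc.Save("TemplateWithFields.doc");
    }
}
```

Because InsertField creates the field's internal structures itself, the result is a field that MailMerge can process, which a plain text replacement cannot produce.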


#7

OK - thanks for your quick reply.

Can I do something similar with bookmarks? i.e. Range.Replace to replace {{…}} fields with bookmarks, then use DocumentBuilder.MoveToBookmark and DocumentBuilder.InsertField to insert the MailMerge fields. Or do bookmarks also require some internal data structures?

If not, I’ll just keep to what I have at the moment. Although not ideal, it does work.


#8

Unfortunately, bookmarks are represented only by internal structures, so it is impossible to use Range.Replace to insert them.