Backslash issue in reading merge fields

Hello,

When I use Aspose.Words with DocumentBuilder to read a .docx document containing merge fields, DocumentBuilder.getParagraph().getText() returns more text than it should if the merge field contains the backslash (“\”) character. Normally I get only the field, but in this case it reads about two rows at once.

So, if the merge field contains “\#1” (without the quotes), the DocumentBuilder reads more content, which breaks my business logic. If the merge field contains “#1”, everything works fine.
The problem is that, in my case, merge fields can contain regular expressions, and those use backslashes heavily.

I work with Java and Aspose.Words version 17.7, and the line that iterates over the document looks like this:

for (final Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true)),

where the doc is a com.aspose.words.Document object.

What could I do about this?

@z3n1th,

Have you tried the latest version of Aspose.Words for Java (19.9) on your end? If the problem remains, please ZIP and upload your input Word document and the piece of source code causing this problem here for testing. We will then investigate the issue on our end and provide you with more information.

It’s a larger project, and I’m not managing the dependencies for it, but the newest version I can test with is Aspose Words 18.5, and the problem is present in that version too.

@z3n1th,

Please provide a simplified input Word document and a standalone simple Console application (source code without compilation errors) that helps us reproduce your current problem on our end, and attach them here for testing. Please do not include Aspose.Words DLL files, to reduce the file size. As soon as you have these ready, we will start investigating your issue and provide you with more information. Thanks for your cooperation.

Hi,

I managed to make a simple app that shows the issue.
It looks like the issue here is not backslash dependent, unlike in the project I work on, but the problem presents similarly: when I position the DocumentBuilder on a merge field, it takes more text than just the merge field.
I commented out the lines that replace the merge fields and save the new file. If you uncomment them and run the main class, you’ll see that the text is inserted not where the respective merge fields are, but after the two lines. This is the actual problem I have: the replaced text is not positioned correctly, and I don’t know why, since in general this replacement works correctly.

In the console, on the lines starting with “Builder on field”, I would expect to see only the value of the field, not two lines of text.
backslash.zip (13.5 KB)

Later edit: it seems the issue is indeed not connected to backslashes; it appears without them too.
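
For anyone inspecting similar console output: the sketch below is one way to separate a field's code from its displayed result in a raw text string. It assumes the raw text brackets each field with the control characters U+0013 (field start), U+0014 (field separator) and U+0015 (field end), and that fields are not nested; the class and method names are made up for illustration and are not part of Aspose.Words. It is only a diagnostic aid and does not claim to explain the misplaced text described above.

```java
// Hypothetical helper, not part of Aspose.Words.
class FieldTextSplitter {
    // Assumed field control characters in raw document text.
    static final char FIELD_START = '\u0013';
    static final char FIELD_SEPARATOR = '\u0014';
    static final char FIELD_END = '\u0015';

    /** Returns the trimmed code of the first field in raw, or null if there is none. */
    static String firstFieldCode(String raw) {
        int start = raw.indexOf(FIELD_START);
        if (start < 0) return null;
        int sep = raw.indexOf(FIELD_SEPARATOR, start + 1);
        int end = raw.indexOf(FIELD_END, start + 1);
        if (end < 0) return null;
        // A field without a result has no separator before its end marker.
        int codeEnd = (sep >= 0 && sep < end) ? sep : end;
        return raw.substring(start + 1, codeEnd).trim();
    }

    /** Returns the displayed result of the first field in raw, or null if it has none. */
    static String firstFieldResult(String raw) {
        int start = raw.indexOf(FIELD_START);
        if (start < 0) return null;
        int sep = raw.indexOf(FIELD_SEPARATOR, start + 1);
        int end = raw.indexOf(FIELD_END, start + 1);
        if (sep < 0 || end < 0 || sep > end) return null;
        return raw.substring(sep + 1, end);
    }

    public static void main(String[] args) {
        // Sample text made up for this sketch.
        String raw = "Dear \u0013 MERGEFIELD Name \u0014\u00ABName\u00BB\u0015, welcome.";
        System.out.println(firstFieldCode(raw));
        System.out.println(firstFieldResult(raw));
    }
}
```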

@z3n1th,

Please check whether the following simple code helps you achieve what you are looking for:

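// Requires com.aspose.words.Document, Field, FieldMergeField and FieldType.
Document doc = new Document("E:\\temp\\backslash\\backslash.docx");

// Visit every field in the document and print the code of each merge field.
for (Field field : (Iterable<Field>) doc.getRange().getFields()) {
    if (field.getType() == FieldType.FIELD_MERGE_FIELD) {
        FieldMergeField mf = (FieldMergeField) field;
        System.out.println(mf.getFieldCode());
        // Uncomment to print the text the field currently displays instead.
        // System.out.println(mf.getDisplayResult());
    }
}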
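
If the field code string is then needed for business logic, the merge field name can be pulled out of it with plain Java. The following is a minimal sketch, assuming the code has the form ` MERGEFIELD name \* MERGEFORMAT `, that a name containing spaces or backslashes is wrapped in double quotes, and that a backslash inside the quotes escapes the next character. The class `MergeFieldNames` and the method `extractName` are hypothetical names for this example, not part of Aspose.Words.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words.
class MergeFieldNames {
    // Group 1: quoted name (may contain backslash escapes).
    // Group 2: unquoted name, which must not start with a backslash or quote,
    // so that a switch such as \* is never mistaken for the name.
    private static final Pattern MERGE_FIELD = Pattern.compile(
            "^\\s*MERGEFIELD\\s+(?:\"((?:[^\"\\\\]|\\\\.)*)\"|([^\\s\\\\\"]\\S*))",
            Pattern.CASE_INSENSITIVE);

    /** Returns the merge field name from a field code string, if one is found. */
    static Optional<String> extractName(String fieldCode) {
        Matcher m = MERGE_FIELD.matcher(fieldCode);
        if (!m.find()) {
            return Optional.empty();
        }
        String quoted = m.group(1);
        if (quoted != null) {
            // Assumed escaping rule: a backslash escapes the character after it.
            return Optional.of(quoted.replaceAll("\\\\(.)", "$1"));
        }
        return Optional.of(m.group(2));
    }

    public static void main(String[] args) {
        // Sample field codes made up for this sketch.
        System.out.println(extractName(" MERGEFIELD  CustomerName  \\* MERGEFORMAT "));
        System.out.println(extractName(" MERGEFIELD \"a\\\\d+\" "));
        System.out.println(extractName(" PAGE "));
    }
}
```

Returning an `Optional` keeps the caller from confusing "not a merge field" with an empty name; whether the escaping rule matches a given document should be checked against real field codes printed by the loop above.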