Run.remove() adds/leaves an empty line in some cases

Greetings,

We recently updated our outdated “Aspose.Words for Java” 17.4 (jdk16) to the latest version. So far we have been able to resolve every conflict except one, for which we request your assistance:

Summary of our use case:
We have an application that replaces text markers with data or functionality.
For example, when the document contains the string <<pagebreak>>, we delete this marker and replace it with a page break in Word. Other text markers may be replaced with text, tables, or graphics.
For this purpose we use the Document.getRange().replace(...) API to search for the text marker, and we implement our own replacement logic in IReplacingCallback.replacing(ReplacingArgs args).
At this point we have the text markers within the ReplacingArgs and load them into a List of com.aspose.words.Run objects. We move to the Run via the DocumentBuilder.move() API and insert the functionality (page break or text) before the Run. After inserting whatever the text marker is defined for, we delete the marker via Run.remove().

Error Description:
In our old Aspose version, Run.remove() deleted the text marker successfully. In the latest version it only works correctly when we replace the marker with text. When we replace it with a page break or an empty string, an empty line is added or remains. As a consequence, in our page break example there is now always an unwanted empty line at the beginning of the page.

We tried different Aspose versions and found that the behaviour changed with Aspose.Words for Java 21.11 and still persists in the latest version, 25.3.
As mentioned, we previously used version 17.4, which worked as intended.

V17 (everything working as intended):
Page Break example:
Input


Output

As you can see, the marker <<pagebreak>> is successfully replaced by a page break (Seitenumbruch in German).

Text Replace example
Input


Output

As you can see, the marker <<ReplaceTextExample>> is successfully replaced by text.

V21.11 and later (only Text Replace is correct):
Page Break example:
Input same as in V17
Output


Note the empty line that now appears at the beginning of the document. It should not be there.

Text Replace example
Input same as in V17
Output


Works correctly!

Steps for reproduction

I prepared an example program to allow you to reproduce and see the issue for yourself. Please have a look at the archive file “pageBreakBug.zip” (zipped without Aspose libraries, so you would need to add them yourself).
pageBreakBug.zip (24,7 KB)

You will find comments within the project that explain the test case. We have two input files, one for the page break replacement and one for the text replacement. Both files are read in, processed, and saved as new output files with the result. You should get the same results as shown in the images above.
Note that you have to comment in or comment out the appropriate method in lines 37 and 40, because only one works in each version.
The change in behaviour probably occurs in line 78, run.remove(), which adds or leaves a blank line in the latest versions.

Effect on our customers

Our customers provide us with predefined templates and expect the result to behave exactly as in Microsoft Word. The example input file I provided is a much-simplified version of an actual customer template and reproduces its issue. As soon as a report is generated with Aspose, the error described above occurs and has to be corrected manually every single time (which is not always possible, since in some cases the document is saved as .pdf). Since this is supposed to be an automated solution, delivering reports like this would cause a lot of customer dissatisfaction.

Solution or workaround possible?

Currently we are working on a workaround because we cannot deploy the Aspose update to production with this issue. Do you know of any “clean” solution that does not require too many lines of code? We are trying to delete the leftover line manually, but this is not easy since we cannot simply delete it in every case (e.g. in the text replace example it would be fatal). Is there perhaps a configuration option in the Range class or Run class that we can use to get the same behaviour as in V17?
Or is the described behaviour an actual bug in Aspose that I can expect to be corrected in a future build?

I would be grateful for ideas and a solution to my problem. Thank you in advance!

Best regards,
Pascal Seidel

BMS Corporate Solutions GmbH/Atruvia AG

@bmsAsposeUser

Hello Pascal,

Thank you for reaching out regarding the issue you’re experiencing with the Run.remove() method in the latest versions of Aspose.Words for Java. It seems that the behavior has changed since version 21.11, leading to the addition of an empty line when replacing certain markers, such as page breaks or empty strings.

Understanding the Issue

In your previous version (17.4), removing a Run object that contained a text marker worked as expected without leaving any empty lines. However, in the newer versions, it appears that when you replace a marker with a page break or an empty string, an unwanted empty line is introduced at the beginning of the document.

Possible Workarounds

While there may not be a direct configuration option to revert to the old behavior, here are a few potential workarounds you could consider:

  1. Manual Removal of Empty Lines: After performing the replacement, you could programmatically check for and remove any empty paragraphs that may have been introduced. This can be done by iterating through the paragraphs in the document and removing any that are empty.

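    Document doc = new Document("yourDocument.docx");
    // Iterate over a snapshot (toArray) so that removing nodes does not disturb the loop.
    for (Node node : doc.getChildNodes(NodeType.PARAGRAPH, true).toArray()) {
        Paragraph para = (Paragraph) node;
        // Caution: trim() also strips page-break characters, so a paragraph
        // that holds only a page break is removed as well.
        if (para.getText().trim().isEmpty()) {
            para.remove();
        }
    }
    doc.save("output.docx");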
    
  2. Adjusting the Replacement Logic: If you are using a custom IReplacingCallback, you might want to ensure that the logic correctly handles the case where a page break or empty string is being inserted. You could modify the callback to check the context and decide whether to remove the preceding empty line.

  3. Using Legacy Order: If applicable, you could try using the legacy order for replacements, which might help in maintaining the expected behavior. This can be set in the FindReplaceOptions.

    FindReplaceOptions options = new FindReplaceOptions();
    options.setUseLegacyOrder(true);
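
Regarding workaround 1 above: a trim()-based emptiness check is broader than it looks, because String.trim() strips every character up to U+0020, including the form feed that represents a page break. The following is a minimal sketch in plain Java (no Aspose classes) of a stricter check. The helper name isLeftoverEmptyParagraph is hypothetical, and the sketch assumes that paragraph text ends with the paragraph mark '\r' and that a page break appears as '\f'; any other character, including a form feed, makes the paragraph count as non-empty.

```java
public class BlankParagraphCheck {
    // Assumed control characters as they appear in paragraph text.
    static final char PAGE_BREAK = '\f';      // U+000C, form feed
    static final char PARAGRAPH_BREAK = '\r'; // U+000D, paragraph mark

    /**
     * Hypothetical helper: returns true only if the text consists of
     * nothing but spaces, tabs, and paragraph marks. A page break (or any
     * other character) makes the paragraph count as non-empty.
     */
    static boolean isLeftoverEmptyParagraph(String paragraphText) {
        for (int i = 0; i < paragraphText.length(); i++) {
            char c = paragraphText.charAt(i);
            if (c != ' ' && c != '\t' && c != PARAGRAPH_BREAK) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String onlyPageBreak = "" + PAGE_BREAK + PARAGRAPH_BREAK;
        // trim() removes the form feed too, so the naive check says "empty".
        System.out.println(onlyPageBreak.trim().isEmpty());
        // The stricter check keeps a paragraph that holds only a page break.
        System.out.println(isLeftoverEmptyParagraph(onlyPageBreak));
        // A paragraph that holds only its paragraph mark is a removal candidate.
        System.out.println(isLeftoverEmptyParagraph("\r"));
    }
}
```

In a real document this check would be applied to the text returned by Paragraph.getText(). Whether removing such a paragraph is safe still depends on the template, so treat this as a starting point rather than a complete fix.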
    

Reporting the Issue

Since this behavior change could be considered a regression, I recommend reporting it to Aspose’s support team if you haven’t already. Providing them with a minimal reproducible example, as you have done, will help them understand the issue better and potentially address it in future updates.

Conclusion

I hope these suggestions help you mitigate the issue until a more permanent solution is provided. If you have any further questions or need additional assistance, feel free to ask.

Best regards!

@bmsAsposeUser It looks like the behavior changed with WORDSNET-22834, which was included in version 21.11 of Aspose.Words. Aspose.Words now adds a paragraph break after the page break depending on the compatibility options, i.e. if the DocumentBuilder cursor is at the beginning of a paragraph and the SplitPgBreakAndParaMark compatibility option is set or the MS Word version is >= MS Word 2013.

You can use the following code:

localBuilder.write(ControlChar.PAGE_BREAK);

instead of

localBuilder.insertBreak(BreakType.PAGE_BREAK);

to restore the old behavior.
