ReplaceArgs.setReplacement("") is leaving blank line in document generation

@priyanka9 You should not use both, because ReplaceArgs.getMatchNode().getDocument().getRange().replace(arg,replacementText) changes the structure of the document, which might cause problems in further processing of the current replacement process.

Oh, OK, got it. Thank you, @alexey.noskov.


Hi @alexey.noskov ,

As I mentioned earlier, we are now facing a styling issue with data like the sample below when we use the BACKWARD direction.
Scenario : When we use find and replace in the BACKWARD direction with data like the sample below, the styling is missing for child nodes.
Inputs :
input.docx (45.5 KB)

This happens when the document has data like below (last section in input.docx).

Some more samples are attached for better understanding :

We take the styling from the {{ts.Annex1_TableStyle.docx}} argument, store it in a Document object, and apply that styling to all the arguments below it. This works fine in the FORWARD direction. In the BACKWARD direction, replacement starts from the bottom while the styling argument is at the top, so the styling is not applied and is lost completely. When we moved the styling argument to the bottom it worked as expected, but we cannot do that with production data.

Actual output :
actual_output.docx (50.7 KB)

Expected output :

We use the insertDocument method to apply the style to child nodes. Please refer to the attachment below for that code.

aspose_test.zip (200.1 KB)

Can you please check and suggest how we can fix this issue?

Thanks,
Priyanka.

Hi,

We would appreciate a quick response.

Thanks,
Priyanka.

@priyanka9 If it is critical to use the forward direction, please try replacing all placeholders in your document so that each one is represented as a single Run node. Please try the following code:

// Replace placeholders in the document so that each one is represented as a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "$0", tmpOptions);
                
FindReplaceOptions options = new FindReplaceOptions();
options.setDirection(FindReplaceDirection.FORWARD);
// Passing required args as per our project needs; you can ignore all these args.
options.setReplacingCallback(new ReplaceEvaluatorTest());
baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "", options);
FieldCollection fields = baseAsposeDocument.getRange().getFields();
baseAsposeDocument.save("C:\\Temp\\finalAsposeDoc.docx", SaveFormat.DOCX);

Hi @alexey.noskov ,

Thanks for your reply, but as I said earlier, we cannot make any changes to the document data since it comes from another system. Is there any other solution for this?

@priyanka9 I do not propose to modify your documents. I propose a little preprocessing of the template before building the final report, i.e. the following three lines of code:

// Replace placeholders in the document so that each one is represented as a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "$0", tmpOptions);

This code replaces the placeholders with themselves, and after such preprocessing each placeholder will be represented as a single Run node.
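To make the "replace with itself" idea concrete, here is a minimal sketch that uses only java.util.regex on a plain string. It is not Aspose.Words code and does not merge Run nodes; it only illustrates that "$0" stands for the whole match and that capture group 1 holds the name between the braces. The class name, the helper methods, and the {{name}} placeholder in the sample text are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderRegexDemo {
    // Same pattern as in the thread: "{{", any run of non-"}" characters, "}}".
    static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{([^}]*)\\}\\}");

    // "$0" refers to the whole match, so every placeholder is replaced by itself.
    static String replaceWithSelf(String text) {
        return PLACEHOLDER.matcher(text).replaceAll("$0");
    }

    // Collects the inner names (capture group 1) of all placeholders in the text.
    static List<String> placeholderNames(String text) {
        List<String> names = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(text);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical sample text; {{name}} is made up for the illustration.
        String sample = "Header {{ts.Annex1_TableStyle.docx}} body {{name}}";
        // The text is unchanged by the self-replacement.
        System.out.println(replaceWithSelf(sample).equals(sample));
        // The inner names are available through capture group 1.
        System.out.println(placeholderNames(sample));
    }
}
```

On a plain string the self-replacement is a no-op; in the document model, as described above, the useful side effect is that each placeholder ends up in a single Run.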

OK, thanks @alexey.noskov. Let me try and come back.


Hi @alexey.noskov ,

I tried the above code, but the issue is the same: styling is not applied to child nodes when we use the backward direction.

Can you please suggest a solution?

Thanks,
Priyanka.

@priyanka9 With the above suggested workaround you should use forward direction, not backward direction.

Hi @alexey.noskov ,

If we use the forward direction, the removeArgs logic causes an issue, as I mentioned earlier.

@priyanka9 Then maybe the easier approach for you will be to avoid using IReplacingCallback and instead process the runs with placeholders directly. The above-mentioned workaround makes each placeholder be represented as a single Run node, so you can use the following approach:

Pattern pattern = Pattern.compile("\\{\\{([^}]*)\\}\\}");
// Replace placeholders in the document to make them to be represented as a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
baseAsposeDocument.getRange().replace(pattern, "$0", tmpOptions);
// Find the runs with placeholders.
for (Run r : (Iterable<Run>)baseAsposeDocument.getChildNodes(NodeType.RUN, true))
{
    if (pattern.matcher(r.getText()).find())
    {
        // Process the matched Run as it is required.
    }
}

Hi @alexey.noskov ,

This approach won't work for our requirement, since the document will have multiple levels of inner documents, which contain different args such as documents, plain text, tables, styling args, etc.

The LINQ approach also won't suit our requirement. We tried a POC initially, and since it didn't suit our requirement we went with IReplacingCallback.

@priyanka9 This is actually the same approach, but processing of the matched tags is performed outside the IReplacingCallback.

Actually, the LINQ Reporting Engine can process sub-reports as required in your case. In your case you insert subdocuments, and these subdocuments can also contain tags that should be replaced. This can easily be done using the <<doc [document_expression] -build>> tag:
https://docs.aspose.com/words/java/inserting-documents-dynamically/

Find/Replace functionality is not actually designed for complex report generation or for processing the document while the replacement is in progress.

OK, let us go through the LINQ approach once and come back again. We are actually at the stage of moving to production, so changing the approach now is not a good idea and means rework for us, but if it resolves all our problems we will check and come back.
Can you just confirm the following for us:

  • Will it support our styling functionality as well? As I said, if the parent node is a styling argument, we need to take the styling from the parent node and apply it to all child nodes.

  • While replacing a conditional argument, if the argument does not satisfy the condition we need to remove it from its position in the document without breaking any styling. Is that possible with the LINQ approach?

  • As of now we are working with Word documents, but going forward we may have PDF, Excel, graphs, bar charts, etc. Will it support all of these?

Please let us know whether it will support all of the above requirements, and whether it is a good idea to change the approach now.

Thanks,
Priyanka.

@priyanka9

  1. You can use nested reports to achieve styling functionality. For example in your main template there will be tag like this:

<<doc [subreport] -build>>

where subreport is the document with styling information. subreport contains the similar syntax:

<<doc [actualdata]>>

where actualdata is a document with actual data.

  2. In LINQ Reporting Engine you can configure conditions as required.
    https://docs.aspose.com/words/java/using-conditional-blocks/

  3. I am afraid LINQ Reporting Engine is only available in Aspose.Words.

Regarding your current approach: I think the code complexity should be reduced to make the code less error-prone. Currently, you split nodes in IReplacingCallback. With the workaround proposed earlier, splitting nodes is not required, since after running the following code:

Pattern pattern = Pattern.compile("\\{\\{([^}]*)\\}\\}");
// Replace placeholders in the document to make them to be represented as a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
baseAsposeDocument.getRange().replace(pattern, "$0", tmpOptions);

the placeholders are already represented as a single node and you can simply use args.getMatchNode(). For example, the following code demonstrates the technique:

Document doc = new Document("C:\\Temp\\in.docx");

Pattern pattern = Pattern.compile("\\{\\{([^}]*)\\}\\}");
// Replace placeholders in the document to make them to be represented as a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
doc.getRange().replace(pattern, "$0", tmpOptions);

// Use actual find replace options.
FindReplaceOptions opt = new FindReplaceOptions();
opt.setDirection(FindReplaceDirection.FORWARD);
opt.setReplacingCallback(new ReplaceWithDocumentCallback());
doc.getRange().replace(pattern, "C:\\Temp\\test.docx", opt);

doc.save("C:\\Temp\\out.docx");

private static class ReplaceWithDocumentCallback implements IReplacingCallback
{
    @Override
    public int replacing(ReplacingArgs args) throws Exception {
            
        Node matchNode = args.getMatchNode();
        Document doc = (Document)matchNode.getDocument();
        DocumentBuilder builder = new DocumentBuilder(doc);
            
        // Move to match node and insert document.
        builder.moveTo(matchNode);
        builder.insertDocument(new Document(args.getReplacement()), ImportFormatMode.KEEP_SOURCE_FORMATTING);
            
        // Remove match node.
        matchNode.remove();
            
        return ReplaceAction.SKIP;
    }
}

If you use the old approach of splitting the matched nodes in IReplacingCallback without the proposed workaround, the code throws a NullPointerException, just like the problem you encountered. The exception is not thrown when the workaround is used, with either the new or the old approach. So to resolve your current problem with the FORWARD direction, you should simply use the proposed workaround. In addition, the implementation of IReplacingCallback can be simplified as proposed above.

Hi @alexey.noskov,

Thanks for the suggestion. We tried to make the same change, but the issue is not resolved. Please take a look at the sample project: we managed to reproduce the same error when removing the args using either matchNode.remove() or the removeArgs() method.

aspose_test.zip (142.2 KB)


The parent is null whenever we get this type of args.

@priyanka9 Thank you for the additional information, but as far as I can see you did not implement the suggested workaround in your code. Please modify your code like this:

public static void main(String[] args) {
    try {
        Document baseAsposeDocument = new Document("C:\\Temp\\tmp\\input1.docx");
        String regex = "\\{\\{([^}]*)\\}\\}";
        Pattern pattern = Pattern.compile(regex);
        String documentText = baseAsposeDocument.getText();
        Matcher matcher = pattern.matcher(documentText);
        boolean patternExists = matcher.find();
        while (patternExists) {

// THIS IS START OF THE WORKAROUND -------------------------------------------
            // Replace placeholders in the document to make them to be represented as a single run.
            FindReplaceOptions tmpOptions = new FindReplaceOptions();
            tmpOptions.setUseSubstitutions(true);
            baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "$0", tmpOptions);
// THIS IS END OF THE WORKAROUND -------------------------------------------

            FindReplaceOptions options = new FindReplaceOptions();
            options.setDirection(FindReplaceDirection.FORWARD);
            options.setUseSubstitutions(true);
            // passing required args as per our project need u can ignore all these args
            options.setReplacingCallback(new ReplaceEvaluatorTest());
            baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "", options);
            FieldCollection fields = baseAsposeDocument.getRange().getFields();
            baseAsposeDocument.save("C:\\Temp\\finalAsposeDoc.docx", SaveFormat.DOCX);
            for (Field field : fields) {
                if (field.getType() == FieldType.FIELD_HYPERLINK) {
                    field.unlink();
                }
            }
            baseAsposeDocument.save("C:\\Temp\\finalAsposeDoc.docx", SaveFormat.DOCX);
            documentText = baseAsposeDocument.getText();
            matcher = pattern.matcher(documentText);
            patternExists = matcher.find();

        }
        System.out.println("Aspose test_got_completed,Please verify the finalAsposeDoc.docx file by reloading the project");

    }  catch (Exception e) {
        throw new RuntimeException(e);
    }
}
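The outer while loop above keeps re-scanning the document until no placeholder is left, which is what lets placeholders inside inserted content be processed on a later pass. The sketch below models that loop on plain strings with java.util.regex only; it is not Aspose.Words code. The value map, the class and method names, and the pass limit are assumptions added for the illustration; the pass limit is not in the original code and is there only to show one way to avoid an endless loop if a placeholder keeps reappearing.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NestedPlaceholderLoopDemo {
    static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{([^}]*)\\}\\}");

    // One pass: replace every placeholder with its value; unknown placeholders are removed.
    static String replaceOnce(String text, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.getOrDefault(m.group(1), "");
            // quoteReplacement keeps "$" and "\" in the value from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Repeats passes until no placeholder is left, or gives up after maxPasses passes.
    static String expand(String text, Map<String, String> values, int maxPasses) {
        for (int pass = 0; pass < maxPasses && PLACEHOLDER.matcher(text).find(); pass++) {
            text = replaceOnce(text, values);
        }
        return text;
    }

    public static void main(String[] args) {
        // Hypothetical data: the value of "outer" itself contains another placeholder.
        Map<String, String> values = Map.of("outer", "[{{inner}}]", "inner", "text");
        System.out.println(expand("A {{outer}} B", values, 10));
    }
}
```

In this model the first pass turns "A {{outer}} B" into "A [{{inner}}] B" and the second pass resolves the inner placeholder.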

Hi @alexey.noskov ,
Sorry for the late response; I was working on some other things and am now back on this issue.
After trying a lot of variations, we are now testing with the code below, since it does not break the styling.
When an argument does not satisfy any condition, we need to remove it using the code below.

Paragraph  para = (Paragraph)ReplacingArgs.getMatchNode().getParentNode();
para.remove() ;

It works fine without breaking any styles in the FORWARD direction. The problem arises when two consecutive arguments do not satisfy the condition and both need to be removed from the document. In this scenario the above code throws an error like
“IllegalStateException cannot remove because there is no parent”
Ex :

It seems the two args are in the same paragraph.
So I changed the code as below.

Paragraph para = (Paragraph)ReplacingArgs.getMatchNode().getParentNode();
para.remove();
if (para != null && para.getParentNode() != null)
{ para.remove(); }
else
{
    Paragraph paragraph = (Paragraph)args.getMatchNode();
    if (paragraph != null)
    { NodeCollection<Node> nodes = paragraph.getDocument().getChildNodes(); nodes.remove(para); }
    else
    { System.out.println("Cannot remove the paragraph because it is null."); }
}
return ReplaceAction.SKIP;

But this code still throws an error. Can you please tell me how I can remove the paragraph when the parent node is null?
My goal is to remove the paragraph regardless of whether it has a parent or not.

@priyanka9 If the node’s parent node is null, this means that the node has already been removed or has not been inserted into the document model. So there is no way, and no point, to remove such a node, since it is not in the document. You should add a condition that checks whether the node’s parent node is not null before removing it.
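A minimal sketch of that check, assuming two placeholders that share one paragraph as described in the question. SimpleNode is a hypothetical stand-in written for this example; it is not the Aspose.Words Node class and only mimics the behaviour reported in the thread, where removing a node without a parent throws. In real code the same guard would presumably be a null check on getParentNode() before calling remove().

```java
import java.util.ArrayList;
import java.util.List;

class RemoveOnceDemo {
    // Hypothetical stand-in for a document node; not the Aspose.Words Node class.
    static class SimpleNode {
        final String name;
        SimpleNode parent;
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String name) {
            this.name = name;
        }

        // Attaches the child to this node and returns the child.
        SimpleNode add(SimpleNode child) {
            child.parent = this;
            children.add(child);
            return child;
        }

        // Mimics the failure reported in the thread: removing a detached node throws.
        void remove() {
            if (parent == null) {
                throw new IllegalStateException("Cannot remove because there is no parent.");
            }
            parent.children.remove(this);
            parent = null;
        }
    }

    // Removes the node only if it is still attached; returns whether anything was removed.
    static boolean removeIfAttached(SimpleNode node) {
        if (node == null || node.parent == null) {
            return false;
        }
        node.remove();
        return true;
    }

    public static void main(String[] args) {
        SimpleNode body = new SimpleNode("body");
        SimpleNode paragraph = body.add(new SimpleNode("paragraph"));
        SimpleNode first = paragraph.add(new SimpleNode("{{a}}"));
        SimpleNode second = paragraph.add(new SimpleNode("{{b}}"));

        // Both placeholders point at the same paragraph, so only the first call removes it.
        System.out.println(removeIfAttached(first.parent));
        System.out.println(removeIfAttached(second.parent));
    }
}
```

The second call is a no-op rather than an exception, because the shared paragraph was already detached by the first call.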