ReplacingArgs.setReplacement("") leaves a blank line in the generated document

Hi @alexey.noskov ,

I tried the above code, but the issue remains: styling is not applied to child nodes when we use the backward direction.

Can you please suggest a solution?

Thanks,
Priyanka.

@priyanka9 With the above suggested workaround you should use forward direction, not backward direction.

Hi @alexey.noskov ,

If we use the forward direction, the removeArgs logic causes the issue I mentioned earlier.

@priyanka9 Then maybe the easier approach for you is to avoid using IReplacingCallback and instead process the runs that contain placeholders. The above-mentioned workaround makes each placeholder be represented by a single Run node, so you can use the following approach:

Pattern pattern = Pattern.compile("\\{\\{([^}]*)\\}\\}");
// Replace each placeholder with itself so that it ends up in a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
baseAsposeDocument.getRange().replace(pattern, "$0", tmpOptions);
// Find the runs with placeholders.
for (Run r : (Iterable<Run>)baseAsposeDocument.getChildNodes(NodeType.RUN, true))
{
    if (pattern.matcher(r.getText()).find())
    {
        // Process the matched Run as it is required.
    }
}
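
As a rough illustration of what the pattern above matches, here is a plain-Java sketch with no Aspose.Words calls. The sample text and the class name PlaceholderPatternDemo are made up for illustration. It collects the names captured by group 1, which is the text you would inspect when deciding how to process a matched Run:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class PlaceholderPatternDemo {
    // Same regular expression as in the snippet above.
    static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{([^}]*)\\}\\}");

    // Returns the trimmed text found between "{{" and "}}" for every match.
    static List<String> placeholderNames(String text) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(text);
        while (matcher.find()) {
            names.add(matcher.group(1).trim());
        }
        return names;
    }

    public static void main(String[] args) {
        // Sample text is an assumption, not taken from the real template.
        System.out.println(placeholderNames("Dear {{customer}}, see {{ terms_doc }}."));
    }
}
```

Trimming the captured group is a design choice made here so that {{ name }} and {{name}} are treated the same; drop it if whitespace inside the braces is significant in your templates.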

Hi @alexey.noskov ,

This approach won't work for our requirement, since the document will have multiple levels of inner documents, which contain different args such as documents, plain text, tables, styling args, etc.

The LINQ approach also does not suit our requirement. We tried a POC with it initially, and since it did not suit our requirement, we went with IReplacingCallback.

@priyanka9 This is actually the same approach, but processing of the matched tags is performed outside the IReplacingCallback.

Actually, LINQ Reporting Engine can process subreports, as required in your case. You insert subdocuments, and these subdocuments can also contain tags that should be replaced. This can be done easily using the <<doc [document_expression] -build>> tag:
https://docs.aspose.com/words/java/inserting-documents-dynamically/

Find/Replace functionality is not really designed for complex report generation or for modifying the document during the replacing process.

OK, let us go through the LINQ approach once and come back. We are at the stage of moving to production, so changing the approach now is not a good idea and means rework for us, but if it resolves all our problems we will check it and come back.
Can you just confirm the following:

  • Will it support our styling functionality as well? As I said, if the parent node is a styling argument, we need to take the styling from the parent node and apply it to all child nodes.

  • When replacing a conditional argument, if the argument does not satisfy the condition, we need to remove that argument from its position in the document without breaking any styling. Is that possible with the LINQ approach?

  • For now we are working with Word documents, but going forward we may have PDF, Excel, graphs, bar charts, etc. Will it support all of these?

Please let us know whether it supports all of the above requirements, and whether it is a good idea to change the approach now.

Thanks,
Priyanka.

@priyanka9

  1. You can use nested reports to achieve the styling functionality. For example, your main template will contain a tag like this:

<<doc [subreport] -build>>

where subreport is the document with the styling information. subreport uses similar syntax:

<<doc [actualdata]>>

where actualdata is a document with the actual data.

  2. In LINQ Reporting Engine you can configure conditions as required.
    https://docs.aspose.com/words/java/using-conditional-blocks/

  3. I am afraid LINQ Reporting Engine is only available in Aspose.Words.

Regarding your current approach: I think the code complexity should be reduced to make the code less error-prone. Currently, you split nodes in IReplacingCallback. With the workaround proposed earlier, splitting nodes is not required, since after running the following code:

Pattern pattern = Pattern.compile("\\{\\{([^}]*)\\}\\}");
// Replace each placeholder with itself so that it ends up in a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
baseAsposeDocument.getRange().replace(pattern, "$0", tmpOptions);

Placeholders are already represented as a single node and you can simply use args.getMatchNode(). For example, the following code demonstrates the technique:

Document doc = new Document("C:\\Temp\\in.docx");

Pattern pattern = Pattern.compile("\\{\\{([^}]*)\\}\\}");
// Replace each placeholder with itself so that it ends up in a single run.
FindReplaceOptions tmpOptions = new FindReplaceOptions();
tmpOptions.setUseSubstitutions(true);
doc.getRange().replace(pattern, "$0", tmpOptions);

// Use actual find replace options.
FindReplaceOptions opt = new FindReplaceOptions();
opt.setDirection(FindReplaceDirection.FORWARD);
opt.setReplacingCallback(new ReplaceWithDocumentCallback());
doc.getRange().replace(pattern, "C:\\Temp\\test.docx", opt);

doc.save("C:\\Temp\\out.docx");

private static class ReplaceWithDocumentCallback implements IReplacingCallback
{
    @Override
    public int replacing(ReplacingArgs args) throws Exception {
            
        Node matchNode = args.getMatchNode();
        Document doc = (Document)matchNode.getDocument();
        DocumentBuilder builder = new DocumentBuilder(doc);
            
        // Move to match node and insert document.
        builder.moveTo(matchNode);
        builder.insertDocument(new Document(args.getReplacement()), ImportFormatMode.KEEP_SOURCE_FORMATTING);
            
        // Remove match node.
        matchNode.remove();
            
        return ReplaceAction.SKIP;
    }
}

If you use the old approach, splitting the matched nodes in IReplacingCallback without the proposed workaround, the code throws NullPointerException, which is the problem you have encountered. With the workaround, the exception is not thrown with either the new or the old approach. So, to resolve your current problem with the FORWARD direction, you should simply use the proposed workaround. In addition, the implementation of IReplacingCallback can be simplified as proposed above.

Hi @alexey.noskov,

Thanks for the suggestion. We tried to make the same change, but the issue is not resolved. Please take a look at the sample project; we managed to reproduce the same error when removing the args using either matchNode.remove() or the removeArgs() method.

aspose_test.zip (142.2 KB)


The parent is null whenever we get this type of args.

@priyanka9 Thank you for the additional information, but as far as I can see, you did not implement the suggested workaround in your code. Please modify your code like this:

public static void main(String[] args) {
    try {
        Document baseAsposeDocument = new Document("C:\\Temp\\tmp\\input1.docx");
        String regex = "\\{\\{([^}]*)\\}\\}";
        Pattern pattern = Pattern.compile(regex);
        String documentText = baseAsposeDocument.getText();
        Matcher matcher = pattern.matcher(documentText);
        boolean patternExists = matcher.find();
        while (patternExists) {

// THIS IS START OF THE WORKAROUND -------------------------------------------
            // Replace each placeholder with itself so that it ends up in a single run.
            FindReplaceOptions tmpOptions = new FindReplaceOptions();
            tmpOptions.setUseSubstitutions(true);
            baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "$0", tmpOptions);
// THIS IS END OF THE WORKAROUND -------------------------------------------

            FindReplaceOptions options = new FindReplaceOptions();
            options.setDirection(FindReplaceDirection.FORWARD);
            options.setUseSubstitutions(true);
            // Our project passes additional args to the callback here; they can be ignored.
            options.setReplacingCallback(new ReplaceEvaluatorTest());
            baseAsposeDocument.getRange().replace(Pattern.compile("\\{\\{([^}]*)\\}\\}"), "", options);
            FieldCollection fields = baseAsposeDocument.getRange().getFields();
            baseAsposeDocument.save("C:\\Temp\\finalAsposeDoc.docx", SaveFormat.DOCX);
            for (Field field : fields) {
                if (field.getType() == FieldType.FIELD_HYPERLINK) {
                    field.unlink();
                }
            }
            baseAsposeDocument.save("C:\\Temp\\finalAsposeDoc.docx", SaveFormat.DOCX);
            documentText = baseAsposeDocument.getText();
            matcher = pattern.matcher(documentText);
            patternExists = matcher.find();

        }
        System.out.println("Aspose test_got_completed,Please verify the finalAsposeDoc.docx file by reloading the project");

    }  catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Hi @alexey.noskov ,
Sorry for the late response; I was working on other things and am now back on this issue.
After trying a lot of variations, we are finally testing with the code below, since it does not break styling.
When an argument does not satisfy any condition, we need to remove it, using the code below.

Paragraph para = (Paragraph)args.getMatchNode().getParentNode();
para.remove();

It works fine without breaking any styles with the FORWARD direction. The problem is that when two consecutive arguments do not satisfy the condition, both need to be removed from the document, and in this scenario the above code throws an error like
"IllegalStateException cannot remove because there is no parent"
I think the two args are in the same paragraph.
So I changed the code as below.

Paragraph para = (Paragraph)args.getMatchNode().getParentNode();
para.remove();
if (para != null && para.getParentNode() != null) {
    para.remove();
} else {
    Paragraph paragraph = (Paragraph)args.getMatchNode();
    if (paragraph != null) {
        NodeCollection<Node> nodes = paragraph.getDocument().getChildNodes();
        nodes.remove(para);
    } else {
        System.out.println("Cannot remove the paragraph because it is null.");
    }
}
return ReplaceAction.SKIP;

But this code also throws an error. Can you please tell me how I can remove the paragraph when the parent node is null?
My goal is to remove the paragraph regardless of whether it has a parent or not.

@priyanka9 If the node's parent node is null, this means that the node is already removed or was never inserted into the document model. So there is no way, and no point, to remove such a node, since it is not in the document. You should add a condition to check that the node's parent node is not null before removing it.
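
The guard described above can be illustrated with a toy model. The ToyNode class below is a hypothetical stand-in written only for this sketch; it is not an Aspose.Words class, and it merely mimics the reported behaviour of throwing IllegalStateException when a detached node is removed:

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a document tree node; NOT an Aspose.Words class.
class ToyNode {
    private ToyNode parent;
    private final List<ToyNode> children = new ArrayList<>();

    // Attaches the child to this node and returns the child.
    ToyNode append(ToyNode child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    int childCount() {
        return children.size();
    }

    // Mimics the reported behaviour: removing a detached node is an error.
    void remove() {
        if (parent == null) {
            throw new IllegalStateException("Cannot remove because there is no parent.");
        }
        parent.children.remove(this);
        parent = null;
    }

    // Guarded removal: does nothing when the node is already detached.
    boolean removeIfAttached() {
        if (parent == null) {
            return false;
        }
        remove();
        return true;
    }

    public static void main(String[] args) {
        ToyNode body = new ToyNode();
        ToyNode paragraph = body.append(new ToyNode());
        System.out.println(paragraph.removeIfAttached()); // first call detaches the node
        System.out.println(paragraph.removeIfAttached()); // second call is a no-op
    }
}
```

In the real callback, the equivalent guard would be the null check on getParentNode() before calling remove(), as described above.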

Yes, but when I added that condition, the current paragraph that needs to be removed still exists.

@priyanka9 If the node’s parent node is null, this means that node is already removed. So I do not think this condition causes the problem.

Yes, you are correct. After adding the check if (parentNode != null) para.remove(); it no longer gives that error, but it removes the entire paragraph, which has multiple expressions, so all of them are removed before they are resolved.

For example: in the screenshot below, the two highlighted arguments come from the same paragraph, and when the Aspose find-and-replace algorithm runs for the first argument, the one in the first curly braces ({{ }}), it tries to remove both.

@priyanka9 Could you please create a simple console application that will allow us to reproduce the problem and provide the required documents? We will check the issue and provide you more information.

OK, sure, I will come back with a sample project.

Hi @alexey.noskov ,

I have created a sample project for the above scenario, where two args are in the same paragraph. When we try to replace the first arg and call the paragraph-removal code, it removes the entire paragraph, so the next argument also gets removed.
Sample project:

aspose_test.zip (297.8 KB)

Input document used in the project:
out-772.docx (51.5 KB)

The issue occurs when there are two args in the same paragraph.

Please check and help us; I really need a way to fix this issue. I would appreciate a quick response with a fix.

Note: if a paragraph in the document contains one argument, it works fine. But when a paragraph has multiple arguments, the issue occurs because we removed the entire paragraph at the first argument.

Thanks,
Priyanka.

@priyanka9 Thank you for the additional information. In your code you remove a paragraph that contains several matches, so when the second match is processed you are trying to remove the same paragraph again. If you simply need to remove the placeholder, you should set the replacement to an empty string instead of removing the whole paragraph.

@Override
public int replacing(ReplacingArgs args) throws Exception {
    String arg = args.getMatch().group();
    arg = removeBrackets(arg);
    // Assume the args in the list do not satisfy the condition and must be removed.
    if(list.contains(arg)){
        args.setReplacement("");
        return ReplaceAction.REPLACE;
    } else {
        args.setReplacement("ReplacemntValue is:"+arg);
        return ReplaceAction.REPLACE;
    }
}

Hi @alexey.noskov ,

I think our conversation has come back to where it started. As mentioned in the title of this post, args.setReplacement("") leaves an empty paragraph in the document.

For now we set an empty replacement only to avoid the above error, but that leaves an empty paragraph that we do not want to show in the document. So instead of setting an empty replacement, we need to remove that match.

I just want a solution for the case where a paragraph contains multiple matches: as I replace each match, if any match does not satisfy the condition, I need to remove that match from the paragraph and proceed to the next match.
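
One possible direction for the requirement above, sketched in plain Java on strings rather than on Aspose.Words nodes: replace every match in a paragraph first, and only afterwards drop the paragraph if a placeholder was removed from it and nothing but whitespace is left. The paragraph list, the unsatisfied set, the "Value of" replacement text, and the class name are all illustrative assumptions, not part of the sample project:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; models paragraphs as plain strings for illustration.
class BlankParagraphCleanupDemo {
    static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{([^}]*)\\}\\}");

    // Resolves every placeholder in every paragraph, then drops only those
    // paragraphs that became blank because a placeholder was removed.
    static List<String> resolve(List<String> paragraphs, Set<String> unsatisfied) {
        List<String> result = new ArrayList<>();
        for (String paragraph : paragraphs) {
            Matcher matcher = PLACEHOLDER.matcher(paragraph);
            StringBuffer resolved = new StringBuffer();
            boolean removedAny = false;
            while (matcher.find()) {
                String name = matcher.group(1).trim();
                if (unsatisfied.contains(name)) {
                    // Condition not satisfied: remove this match only, keep going.
                    matcher.appendReplacement(resolved, "");
                    removedAny = true;
                } else {
                    // Placeholder value is made up for the sketch.
                    matcher.appendReplacement(resolved, Matcher.quoteReplacement("Value of " + name));
                }
            }
            matcher.appendTail(resolved);
            String text = resolved.toString();
            // Paragraphs that were blank in the input are kept untouched.
            if (removedAny && text.trim().isEmpty()) {
                continue;
            }
            result.add(text);
        }
        return result;
    }
}
```

The point of the design is the ordering: no paragraph is removed while its matches are still being processed, so a second placeholder in the same paragraph is never lost. In Aspose.Words terms this would roughly correspond to remembering the parent paragraphs of removed matches inside the callback and deleting the ones that end up empty after replace() has finished; that mapping is an assumption and is not shown here.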