Aspose document creation logic is taking more than one minute when we have multiple replacement tokens

Hi,
We are new to Aspose and are running a POC to check whether it meets our requirements; if it does, we plan to buy a licensed version. We are looking at Aspose.Words for Java to generate documents and convert them to PDF. Our requirement is to replace tokens in a Word document with content from a database. The replacement tokens are in the format “{r*}”. A token may resolve to another document, a regular expression, an if/else condition, and so on, and a child document may itself contain further child documents. We used IReplacingCallback to write a custom replace method and are able to replace the token content, but we are facing two main issues.

  1. We are not able to preserve the styling of child documents.
  2. Generating the final document, after replacing all child documents and their replacement tokens, takes more than 1 minute.

Can you please help us reduce the response time?
I have gone through the link below and also implemented mail merge, but that takes about the same time (50 s to 1 min).

Document.Range.Replace is incredibly slow - Free Support Forum - aspose.com

The sample base document looks like the one below. For each child document we need to get the content from the DB, and that content may again contain multiple replacement tokens.

@arunavayyala Find/Replace is not the most efficient way to fill a template with data. I would suggest you take a look at the LINQ Reporting Engine. In this case you do not need to implement custom callbacks to insert documents into your template; you can use the LINQ syntax to insert documents. Also, the switch <<doc [document_expression] -build>> allows the inserted document to be filled with data if required.

If the LINQ Reporting Engine approach is not acceptable for you, please attach your sample template, documents and code that will allow us to test the scenario on our side. We will check it and provide you with more information.

Hi @alexey.noskov,

Thanks for your response. We are not familiar with the LINQ Reporting Engine approach. We will read the documentation and come back to you. Does the LINQ Reporting approach support only document replacement? As I mentioned earlier, in our case a replacement token can be a document, plain text, or a conditional expression. Moreover, how can we achieve recursion without calling the replace method in this case?

Sharing sample template again for reference.
image.png (107.6 KB)
BaseTemplate.docx (175.1 KB)

Thanks,
Aruna Vayyala.

@arunavayyala

Sure, you can insert simple text, documents, images or charts using LINQ Reporting engine. Please see our documentation for more information:
https://docs.aspose.com/words/java/outputting-expression-results/
You can also output sequential data:
https://docs.aspose.com/words/java/outputting-sequential-data/
And use conditional blocks:
https://docs.aspose.com/words/java/using-conditional-blocks/

As I mentioned, when a document is inserted with <<doc [document_expression] -build>>, the -build switch causes the inserted document to be checked against template syntax and populated with data.

Hi @alexey.noskov,
Can you please help me get all hyperlinks from the baseLayout document using the LINQ approach? All our replacement links are inside curly braces, like {{ replacing text/doc/condition }}, and they are dynamic. Since we are new to LINQ, even after reading multiple links I am still not able to get started. The syntax is a bit different and I am not able to understand it.

Below is our existing approach:

  1. Read the BaseLayout content from the DB and create baseDocument.
  2. Get all replacement links that need to be replaced with dynamic content, using a regular expression.
  3. Iterate over each link using the replacing method, get the content of each dynamic link, and replace it in baseLayout.
  4. If a replacement link is itself a document, call the IReplacingCallback replace method recursively to replace the nested documents' content and hyperlinks.
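
To make the four steps above concrete, here is a minimal sketch of the same recursive resolution on plain strings, using only the Java standard library. It is not Aspose code and does not preserve formatting. The class name RecursiveTokenResolver and the loader function (a stand-in for the DB lookup) are hypothetical. The sketch also caches resolved content and rejects circular references; both are additions of this sketch, not something described in the thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: resolves {{...}} tokens recursively on plain text.
final class RecursiveTokenResolver {
    // Same pattern as the one used in the thread to find {{...}} tokens.
    private static final Pattern TOKEN = Pattern.compile("\\{\\{([^}]*)\\}\\}");

    // Stand-in for the DB lookup: token text -> raw content, or null if unknown.
    private final Function<String, String> loader;
    // Assumption of this sketch: each token is loaded and resolved only once.
    private final Map<String, String> resolved = new HashMap<>();

    RecursiveTokenResolver(Function<String, String> loader) {
        this.loader = loader;
    }

    String resolve(String text) {
        return resolve(text, new ArrayDeque<>());
    }

    private String resolve(String text, Deque<String> inProgress) {
        Matcher m = TOKEN.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            out.append(resolveToken(m.group(1).trim(), inProgress));
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    private String resolveToken(String name, Deque<String> inProgress) {
        String cached = resolved.get(name);
        if (cached != null) {
            return cached;
        }
        if (inProgress.contains(name)) {
            throw new IllegalStateException("Circular reference: " + name);
        }
        String raw = loader.apply(name);
        if (raw == null) {
            return "{{" + name + "}}"; // unknown token: leave it untouched
        }
        inProgress.push(name);
        String result = resolve(raw, inProgress); // child content may hold more tokens
        inProgress.pop();
        resolved.put(name, result);
        return result;
    }
}
```

For example, with a map-backed loader (`new RecursiveTokenResolver(map::get)`), a child document whose content contains further tokens is expanded to any depth in a single pass over the base text, and each distinct token is looked up once.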

screenshot-9 (1).png (73.2 KB)

screenshot-10.png (111.9 KB)

@arunavayyala You should use the following LINQ syntax to insert hyperlinks into the document:
<<link [uri_or_bookmark_expression] [display_text_expression]>>
Please see our documentation for more information:
https://docs.aspose.com/words/java/inserting-hyperlinks-dynamically/

For example, see the following code:

import com.aspose.words.*;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Build a basic template. In a real case the template is loaded from a file.
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Insert a link placeholder.
builder.write("<<link [url] [text]>>");

// Some data source.
String jsonData = "{ url : \"https://www.aspose.com\", text : \"Link To Aspose Website\" }";
JsonDataSource data = new JsonDataSource(new ByteArrayInputStream(jsonData.getBytes(StandardCharsets.UTF_8)));

// Build a report.
ReportingEngine engine = new ReportingEngine();
engine.buildReport(doc, data);

doc.save("C:\\Temp\\out.docx");

@alexey.noskov,
Thanks for the code snippet, but we are still not able to proceed. Can you please review my use case and let us know the best solution?

I have a base template document with expressions like the ones below. We get all these expressions by splitting the baseTemplate content with a regular expression and iterating over them one by one. If an expression is a document, we query the DB with the document name to get the document content, check whether the child document's content contains any expressions, resolve those expressions, and replace them with the actual content using find and replace (IReplacingCallback).
Expressions inside baseTemplate and inside a child document look like this:

{{[child1.docx]}}  
{{[PRRRequired==true]}} 
{{[child2.docx]}}
{{[variable#PRRRequired!=null]}}

With the LINQ approach, I am stuck on how to get all expressions from baseTemplate, how to iterate over them one by one, and how to write my own methods to resolve them. Here is a sample snippet of what we are doing currently.

  1. Get baselayout expressions and invoke IReplacingCallback recursively to resolve multiple expressions.
Document baseTemplate = new Document(byteArrayInputStream);
//baseTemplate.joinRunsWithSameFormatting();
//DocumentBuilder documentBuilder = new DocumentBuilder(baseTemplate);
baseTemplate.save("myBaseLayout.docx", SaveFormat.DOCX);
// Define the regex pattern that matches {{...}} expressions.
String regex = "\\{\\{([^}]*)\\}\\}";
Pattern pattern = Pattern.compile(regex);
String documentText = baseTemplate.getText();
Matcher matcher = pattern.matcher(documentText);
boolean patternExists = matcher.find();
// Repeat until no {{...}} expression is left, because inserted content may contain new expressions.
while (patternExists)
{
    FindReplaceOptions options = new FindReplaceOptions();
    options.setDirection(FindReplaceDirection.FORWARD);
    options.setReplacingCallback(new ReplaceEvaluator(documentConfigDetailRepository, json, baseTemplate, counterparty, entity, layout, jdbcTemplate));
    baseTemplate.getRange().replace(pattern, "", options);
    baseTemplate.save("out.docx", SaveFormat.DOCX);
    // Replace hyperlink fields with their result text.
    FieldCollection fields = baseTemplate.getRange().getFields();
    for (Field field : fields)
    {
        if (field.getType() == FieldType.FIELD_HYPERLINK)
        {
            field.unlink();
        }
    }
    baseTemplate.save("out.docx", SaveFormat.DOCX);
    // Check whether the replaced content introduced new expressions.
    documentText = baseTemplate.getText();
    matcher = pattern.matcher(documentText);
    patternExists = matcher.find();
}

Inside IReplacingCallback :

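public int replacing(ReplacingArgs args) {
    // Text between {{ and }} (group 1 of the pattern above).
    String token = args.getMatch().group(1);
    if (token.contains("#")) {
        // Invoke the custom method to check the condition; if it is true, call the DB to get the content and replace.
    } else if (token.contains(".docx")) {
        // DB call with the document name to get the content and replace.
    }
    return ReplaceAction.REPLACE;
}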

Note: the content of child.docx may itself contain multiple expressions and inner documents, and before replacing a child document we need to resolve all of its expressions. These expressions may again refer to child documents, so the number of sub-document levels is not fixed; it depends on the content of each document.

@arunavayyala You can use Conditional Blocks right in your template and LINQ Reporting Engine will check the condition for you. The syntax is pretty simple:

<<if [conditional_expression1]>>
template_option1
<<elseif [conditional_expression2]>>
template_option2
...
<<else>>
default_template_option
<</if>>

So it is not required to iterate expressions in the template but simply configure the conditional blocks in the template and LINQ Reporting Engine will do the rest. What you are doing looks like your own implementation of the reporting engine.

@alexey.noskov Thank you for your reply.
In our case we don’t have conditional expressions in this format. We have our own syntax in our document. All expressions to be replaced are in curly braces.
Our document can have the following expressions:

  1. {{child1.docx}}
    if there is an expression like this, it needs to be replaced with the specific child document and the child document can have expressions in curly braces too.
  2. {{[Currency]}}
    if there is an expression like this, it needs to be replaced with the value of currency.
  3. {{[ts.ProductRiskRating.docx#PRRRequired=="Yes"]}}
    if there is a condition like this, the child document must be inserted in place of the placeholder only if PRRRequired=="Yes". Here the # indicates that a conditional expression follows.
  4. {{Spec~Cond5=="Single"?"Single": "Basket"}}
    if else is represented like this.

Can we use LINQ approach to

  1. identify the expressions in curly braces recursively, even after child documents are inserted.
  2. replace these expressions according to the different syntax?

@arunavayyala

No, unfortunately, LINQ Reporting Engine cannot recognize curly brackets syntax. LINQ reporting template syntax is described in our documentation.

You can use Find/Replace functionality to replace curly braces syntax with LINQ reporting template syntax. But anyway it is better to inspect the resulting template manually to make sure the syntax is converted properly.

  1. {{child1.docx}} should be replaced with <<doc [child1.docx]>>
  2. {{[Currency]}} should be replaced with <<[Currency]>>
  3. {{[ts.ProductRiskRating.docx#PRRRequired=="Yes"]}} should be replaced with <<if [PRRRequired=="Yes"]>><<doc [ts.ProductRiskRating.docx]>><</if>>
  4. {{Spec~Cond5=="Single"?"Single": "Basket"}} should be replaced with <<if [Cond5=="Single"]>>Single<<else>>Basket<</if>>
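
The four mappings above can be sketched as a plain string conversion. This is only an illustration using java.util.regex. The class name CurlyToLinqConverter and its methods are hypothetical, it covers only the four forms listed, and it works on text rather than on a Word document, so it is not a substitute for Find/Replace on the template or for inspecting the converted template manually.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites {{...}} tokens into LINQ Reporting template tags.
final class CurlyToLinqConverter {
    // Same token pattern as the regex used earlier in the thread.
    private static final Pattern TOKEN = Pattern.compile("\\{\\{([^}]*)\\}\\}");
    // Form 3: [document#condition]
    private static final Pattern CONDITIONAL_DOC =
            Pattern.compile("\\[([^#\\]]+)#([^\\]]+)\\]");
    // Form 4: prefix~condition?"a": "b"
    private static final Pattern TERNARY =
            Pattern.compile("[^~]*~(.+?)\\?\\s*\"([^\"]*)\"\\s*:\\s*\"([^\"]*)\"");
    // Form 2: [Name]
    private static final Pattern VALUE = Pattern.compile("\\[([^\\]]+)\\]");

    // Converts the text between {{ and }} into the matching LINQ tag.
    static String convertToken(String body) {
        String t = body.trim();
        Matcher m = CONDITIONAL_DOC.matcher(t);
        if (m.matches()) {
            return "<<if [" + m.group(2) + "]>><<doc [" + m.group(1) + "]>><</if>>";
        }
        m = TERNARY.matcher(t);
        if (m.matches()) {
            return "<<if [" + m.group(1) + "]>>" + m.group(2)
                    + "<<else>>" + m.group(3) + "<</if>>";
        }
        m = VALUE.matcher(t);
        if (m.matches()) {
            return "<<[" + m.group(1) + "]>>";
        }
        if (t.endsWith(".docx")) {
            return "<<doc [" + t + "]>>"; // Form 1: bare document name
        }
        return "{{" + body + "}}"; // unrecognized form: leave it untouched
    }

    // Converts every {{...}} token found in the given text.
    static String convert(String text) {
        Matcher m = TOKEN.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(convertToken(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling CurlyToLinqConverter.convert on a string that contains the four sample tokens yields the four LINQ forms listed above. Tokens typed in Word may contain curly quotes instead of straight ones, which this sketch does not normalize.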