High Memory Utilization During Document Comparison

Description:
We are experiencing a significant issue with high memory utilization when performing document comparisons using Aspose.Words. The memory usage spikes to 15-16 GB, causing the process to stall. This occurs specifically when comparing two documents, each containing 150-200 pages with numerous changes.

Aspose version: 22.4.0

Code Example:

public Document Compare(Document original, Document modified)
{
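    // Accept all revisions in both documents; Compare requires that
    // neither document contains revisions before it is called
    original.AcceptAllRevisions();
    modified.AcceptAllRevisions();

    // Compare the documents and store the changes as revisions in the original document
    // (Author and compareOptions are defined elsewhere in our code)
    original.Compare(modified, Author, DateTime.Now, compareOptions);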

    return original;
}

Steps to Reproduce:

  1. Create a document comparison scenario with two documents, each containing 150-200 pages and numerous changes.
  2. Call the Compare method as shown in the code example.
  3. Observe the memory utilization during the comparison process.
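For step 3, one way to capture the numbers is to wrap the call in a small probe. The following is only a rough sketch using standard .NET APIs; the `MemoryProbe` class and its `Measure` method are hypothetical names introduced here for illustration, not part of Aspose.Words. Note that the managed-heap delta is read after the action finishes, so it does not capture the peak during the call, and the peak working set covers the whole process since it started.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not an Aspose.Words API): measures memory around an arbitrary action.
public static class MemoryProbe
{
    // Returns the change in managed heap size across the action and the
    // process-wide peak working set (both in bytes) read after the action completes.
    public static (long managedDeltaBytes, long peakWorkingSetBytes) Measure(Action action)
    {
        // Settle the managed heap so the baseline is not inflated by garbage.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        long before = GC.GetTotalMemory(forceFullCollection: true);

        action();

        // Do not force a collection here: we want what the action left allocated.
        long after = GC.GetTotalMemory(forceFullCollection: false);

        using (Process process = Process.GetCurrentProcess())
        {
            process.Refresh();
            return (after - before, process.PeakWorkingSet64);
        }
    }
}
```

With the method from the code example, this could be called as `var result = MemoryProbe.Measure(() => Compare(original, modified));` and the two values logged in MB.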

Observations:

  • The memory usage reaches 15-16 GB when performing the compare operation.
  • Removing the compare line significantly reduces memory usage to around 500-600 MB.
  • Adding the following comparison options reduces memory usage to 10-12 GB but does not resolve the issue entirely:
new CompareOptions
{
    IgnoreFormatting = true,
    IgnoreCaseChanges = true,
    IgnoreComments = true,
    IgnoreTables = true,
    IgnoreFields = true,
    IgnoreFootnotes = true,
    IgnoreTextboxes = true,
    IgnoreHeadersAndFooters = true
};

Request:
We need assistance in optimizing memory usage during document comparison. Any guidance on reducing memory consumption, or alternative approaches for handling large documents with numerous changes, would be highly appreciated.

@ashiqshanavas Could you please attach the problematic documents here for testing? We will check the issue and provide you more information.

GeneratedDocument.docx (363.5 KB)

GeneratedDocument2.docx (364.0 KB)

We actually encountered the issue with real documents, which are confidential and cannot be shared, but we are able to replicate the issue with the two documents above.

@ashiqshanavas I do not see high memory usage when comparing the attached documents. However, the comparison takes about 3 minutes on my side, which is too slow.

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-27121

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@alexey.noskov Just wanted to check which version of Aspose you were using while testing this.

We are currently using Aspose version: 22.4.0

@ashiqshanavas I used the latest 24.6 version of Aspose.Words for testing.

@alexey.noskov thanks for confirming.

I was able to generate most of the documents with this new version.

But for one of the documents, I am getting the error below. Do you know what could be the issue here?

System.InvalidOperationException: NC sync failed.
at oP.n(Int32 d)
at oP.c()
at VC.v()
at VC.d(Node d, Node v, RP c)
at cC.d()
at cC.d(Document d, Document v, CompareOptions c)
at Aspose.Words.Document.Compare(Document document, String author, DateTime dateTime, CompareOptions options)

@ashiqshanavas Could you please provide your document and code here for testing?

@ashiqshanavas Please try the latest version of Aspose.Words. Although we are still unable to compare at MS Word speed, comparison performance has improved significantly for huge documents.