Infinite Hang on Document Conversion

Aspose Words Layout Engine Infinite Hang Bug Report

Issue Summary

Product: Aspose.Words for Java
Version: 26.3
Issue: Layout engine enters infinite loop during getPageCount(), updateFields(), or save() operations
Impact: Document conversion hangs indefinitely (tested > 1 hour)

Reproduction

Reproduction Document

File: stuck.docx
stuck.docx (50.6 KB)

Size: 44KB
Pages: 1 page
Note: Customer-sensitive information has been redacted from this document

Document Characteristics

Based on comparison with a working copy:

  • Style Collection: Only 17 styles (normal documents have 30-40+)
  • Structure: 1 section, 2 tables, 68 paragraphs, 3 shapes
  • History: Document accumulated 53 revisions before sanitization
  • Likely Issue: Corrupted internal structure from accumulated revisions
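The style count above can be cross-checked without invoking the layout engine, because a DOCX file is a ZIP package whose style definitions live in word/styles.xml. The following is a minimal sketch under stated assumptions: DocxPartCounter, countElements, and zipOf are hypothetical helper names, not part of Aspose.Words, and the number of raw w:style elements may differ from what Aspose reports through its own style collection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not an Aspose API): counts elements in one part of a DOCX package.
final class DocxPartCounter {
    // WordprocessingML main namespace used by word/styles.xml and word/document.xml.
    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Counts w:localName elements in the given part; returns -1 if the part is absent. */
    static int countElements(InputStream docx, String partName, String localName) {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                if (!entry.getName().equals(partName)) {
                    continue;
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                // Uploaded files are untrusted input, so refuse DTDs.
                factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
                byte[] xml = zip.readAllBytes(); // reads only the current entry
                Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
                return dom.getElementsByTagNameNS(W_NS, localName).getLength();
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (Exception e) { // ParserConfigurationException or SAXException
            throw new IllegalStateException("Cannot parse " + partName, e);
        }
    }

    /** Builds a one-entry ZIP in memory; used here only to demonstrate the counter. */
    static byte[] zipOf(String partName, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(partName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String styles = "<w:styles xmlns:w=\"" + W_NS + "\">"
                + "<w:style w:styleId=\"Normal\"/><w:style w:styleId=\"Heading1\"/></w:styles>";
        byte[] docx = zipOf("word/styles.xml", styles);
        // Prints 2 for this synthetic package.
        System.out.println(countElements(new ByteArrayInputStream(docx), "word/styles.xml", "style"));
    }
}
```

For the real file, the input stream would come from something like Files.newInputStream(Path.of("stuck.docx")). The same helper can count "tbl" or "p" elements in word/document.xml to compare against the structure figures listed above.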

Reproduction Code

import com.aspose.words.Document;
import com.aspose.words.LoadOptions;
import com.aspose.words.PdfSaveOptions;

public class ReproduceLayoutHang {
    public static void main(String[] args) throws Exception {
        // Load document
        LoadOptions loadOptions = new LoadOptions();
        Document doc = new Document("stuck.docx", loadOptions);

        System.out.println("Document loaded successfully");

        // ANY of these operations will hang indefinitely:

        // Option 1: getPageCount() hangs
        int pageCount = doc.getPageCount(); // HANGS HERE

        // Option 2: updateFields() hangs
        // doc.updateFields(); // HANGS HERE

        // Option 3: save() hangs
        // PdfSaveOptions options = new PdfSaveOptions();
        // doc.save("output.pdf", options); // HANGS HERE
    }
}

Expected Behavior

  • getPageCount() should return quickly (< 1 second for a 1-page document)
  • updateFields() should complete in < 5 seconds
  • save() should complete in < 10 seconds

Actual Behavior

  • All layout-triggering operations hang indefinitely
  • No exceptions thrown
  • No error messages
  • CPU usage remains steady, which suggests the engine is stuck in a loop rather than just running slowly
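One way to back up the "stuck in a loop" observation is to sample the worker thread while it hangs: a thread that stays RUNNABLE inside the same call path points to a loop, whereas BLOCKED or WAITING would point to a lock or I/O wait. The sketch below is only an illustration. HangSampler is a hypothetical helper, and the spinning Runnable in main is a stand-in for a call such as doc.getPageCount(). Running jstack against the hung JVM gives the same information without any code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper (not an Aspose API).
final class HangSampler {
    /**
     * Starts the task on a daemon thread and, while it is still alive after each
     * interval, records the thread state plus its top stack frame.
     */
    static List<String> sample(Runnable task, int samples, long intervalMillis) {
        Thread worker = new Thread(task, "layout-under-test");
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        List<String> observations = new ArrayList<>();
        try {
            for (int i = 0; i < samples; i++) {
                worker.join(intervalMillis);
                if (!worker.isAlive()) {
                    break; // task finished, nothing left to observe
                }
                StackTraceElement[] stack = worker.getStackTrace();
                String top = stack.length > 0 ? stack[0].toString() : "<no frames>";
                observations.add(worker.getState() + " at " + top);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observations;
    }

    public static void main(String[] args) {
        // Stand-in for the hanging layout call: a loop that never terminates.
        Runnable stuck = () -> {
            while (true) {
                Thread.onSpinWait();
            }
        };
        sample(stuck, 3, 200).forEach(System.out::println);
    }
}
```

Attaching a few such samples (or a full thread dump) to the report would show which internal layout method keeps reappearing at the top of the stack.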

Analysis

Root Cause (Suspected)

The document appears to have a corrupted internal structure that creates circular dependencies in the layout engine:

  1. Style collection corruption: Only 17 styles vs 30-40+ in normal documents
  2. Accumulated revision history: 53 revisions may have caused internal inconsistencies
  3. Table metadata corruption: The document has 2 tables, and removing them makes the hang disappear

Diagnostic Tests Performed

We systematically tested various hypotheses:

Test                         | Result                   | Conclusion
Remove all shapes            | :x: Still hangs          | Shapes not the issue
Remove all tables            | :white_check_mark: Works | Tables involved but…
Modify table properties      | :x: Still hangs          | Can’t fix by changing table settings
Copy content to new document | :white_check_mark: Works | Internal structure is corrupted

Workaround Found

Solution: Copy all document content to a new document via importNode():

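import com.aspose.words.Document;
import com.aspose.words.ImportFormatMode;
import com.aspose.words.LoadOptions;
import com.aspose.words.PdfSaveOptions;
import com.aspose.words.Section;

LoadOptions loadOptions = new LoadOptions();
PdfSaveOptions saveOptions = new PdfSaveOptions();

Document problematicDoc = new Document("problematic.docx", loadOptions);

// Create a new document and remove its default empty section
Document newDoc = new Document();
newDoc.removeAllChildren();

// Copy all sections (deep clone)
for (int i = 0; i < problematicDoc.getSections().getCount(); i++) {
    Section section = problematicDoc.getSections().get(i);
    Section importedSection = (Section) newDoc.importNode(
        section,
        true,  // import child nodes as well (deep clone)
        ImportFormatMode.KEEP_SOURCE_FORMATTING
    );
    newDoc.appendChild(importedSection);
}

// New document converts successfully!
int pageCount = newDoc.getPageCount(); // Works!
newDoc.save("output.pdf", saveOptions); // Works!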

This workaround:

  • :white_check_mark: Successfully resolves the hang
  • :white_check_mark: Preserves all user-visible content
  • :white_check_mark: Strips corrupted internal metadata

Request to Aspose

Primary Request

Please investigate why this document causes an infinite loop in the layout engine and provide a fix in a future release.

Questions

  1. Is there internal diagnostic tooling to identify the specific corrupt structure?
  2. Can Aspose provide a document “sanitization” utility to fix such corruption?
  3. Are there known issues with documents that accumulate many revisions?
  4. Can the layout engine be made more resilient to corrupted internal structures?

Suggested Fix Approaches

  1. Add timeout protection to layout engine operations
  2. Add circular dependency detection to prevent infinite loops
  3. Improve error reporting when corrupt structures are detected
  4. Document validation utility to identify corruption before conversion
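Until the engine itself offers timeout protection, a caller-side guard can at least stop a conversion request from blocking forever. The following is a rough sketch, not a recommended Aspose pattern: LayoutTimeoutGuard and runWithTimeout are hypothetical names, and the Supplier stands in for a call such as doc.getPageCount(). Note the limitation in the comments: cancelling a Future only interrupts the thread, so a busy loop that never checks the interrupt flag keeps consuming a core. Running the conversion in a separate process that can be killed is the more robust option.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical caller-side guard (not an Aspose API).
final class LayoutTimeoutGuard {
    /**
     * Runs the task on a daemon thread and returns its result, or Optional.empty()
     * if it does not finish within timeoutMillis. A task that returns null is also
     * reported as empty.
     */
    static <T> Optional<T> runWithTimeout(Supplier<T> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "layout-guard");
            thread.setDaemon(true); // a stuck task must not keep the JVM alive
            return thread;
        });
        Callable<T> callable = task::get;
        Future<T> future = executor.submit(callable);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Only sets the interrupt flag; a loop that ignores it keeps running.
            future.cancel(true);
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("Guarded task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Supplier<Integer> quick = () -> 1;
        // Stand-in for the hanging layout call.
        Supplier<Integer> stuck = () -> {
            while (true) {
                Thread.onSpinWait();
            }
        };
        System.out.println(runWithTimeout(quick, 1000).isPresent()); // true
        System.out.println(runWithTimeout(stuck, 200).isPresent());  // false
    }
}
```

In a conversion service, an empty result could be used to route the document to the importNode() workaround or to flag it for manual review.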

Additional Context

Customer Impact

This issue affects production conversion workflows where customers upload documents with unknown history. Such documents:

  • Have been edited extensively (many revisions)
  • May have been created in older Word versions
  • May have SharePoint metadata

Sanitized Document Details

The attached reproduction document has been sanitized for privacy:

  • :white_check_mark: Customer names, company info redacted
  • :white_check_mark: Custom properties cleared (SharePoint metadata removed)
  • :white_check_mark: Author information anonymized
  • :white_check_mark: Internal structure preserved (still reproduces hang)

System Information

  • Java Version: Java 17
  • Aspose.Words Version: 26.3
  • Original Document: Created in Microsoft Word (version unknown)
  • File Format: DOCX (Office Open XML)

Attachments

  • stuck.docx - Reproduction document
  • Comparison data showing structural differences vs working document

Support Request

Please investigate this issue and provide guidance on:

  1. How to identify such corrupted documents before conversion
  2. Whether a fix can be provided in the layout engine
  3. If the importNode() workaround is the recommended approach

Thank you for your support!

@pblad
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSJAVA-3327

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

The issues you have found earlier (filed as WORDSJAVA-3327) have been fixed in this Aspose.Words for Java 26.4 update.