Questions about document redlining performance

I initially spoke with your sales team regarding licensing and pricing, and they kindly directed me to the forums for more detailed technical and performance-related questions. I’m therefore providing the requested technical details below so you can better advise on performance, scalability, and limitations.


1. Product Version

We are currently evaluating:

  • [Aspose.Words / GroupDocs.Comparison] Cloud API (latest available version)
  • We are also considering the SDK / self-hosted deployment option for higher performance and fewer timeout constraints.

Please let us know if there is a specific recommended version for large-document workloads.


2. Programming Environment & Language

Our application stack is:

  • Frontend: Next.js (TypeScript)
  • Backend: Node.js / TypeScript API layer
  • Document processing is handled server-side.

3. Evaluated Options

  • For cloud usage, we would use your hosted API offering.
  • For self-hosted / SDK usage, we would be open to any deployment approach as long as it can run on Linux.

4. Requirements Summary (Key Points)

Document Comparison (Redlining)

Must support:

  • DOCX, XLSX, PDF comparison
  • Output for DOCX comparison must be a Microsoft Word Track Changes document (not only a rendered PDF diff)

Document Conversion

Must support:

  • DOCX → PDF
  • XLSX → PDF

TOC / Field Code Update (TOC Template)

A critical requirement is:

  • We generate the TOC in DOCX using a TOC template XML / Word field structure.
  • Currently the TOC is not automatically updated correctly in our pipeline.
  • We need the service to recalculate Word field codes / update fields programmatically so that the TOC is populated correctly without requiring the user to manually open Word/LibreOffice and update the TOC.

5. Performance & File Size Questions

During initial testing with larger DOCX documents, we observed that:

  • Cloud comparison requests timed out after approximately 20 minutes

Could you clarify:

  • Maximum supported file sizes (Cloud vs SDK)
  • Whether the Cloud timeout limit is configurable/extendable
  • Recommended best practices for processing large regulatory-style documents
  • Whether async/background job processing is available to avoid request timeouts

The larger the documents we can process reliably, the better.


6. Security & Safety Considerations (SDK)

Since we may process user-uploaded documents, security is an important consideration for us.

Could you clarify:

  • Does the SDK include built-in safeguards against malicious or malformed documents?
  • How are potentially dangerous elements handled (e.g., embedded macros, external references, scripts, malformed XML, zip bombs)?
  • Are there configurable limits for:
    • Memory usage
    • CPU usage
    • Maximum document structure depth
    • Maximum embedded object size
  • Are documents fully parsed in memory, or is streaming supported?
  • Do you provide any sandboxing recommendations for Linux deployments?

For example, if a user uploads a maliciously crafted DOCX file designed to exhaust resources or exploit the parser, what protections are in place within the SDK?
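
To make the zip-bomb concern concrete, here is a minimal sketch of a caller-side pre-check we could run ourselves before handing an upload to any parser. It is not part of any Aspose or GroupDocs API; the names `inspectZipSizes` and `looksLikeZipBomb` and both thresholds are hypothetical. It only reads the sizes declared in the ZIP central directory of a DOCX/XLSX container, which a hostile file can falsify, so it would complement parser-level limits rather than replace them.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper types and names, for illustration only.
interface ZipSizeReport {
  entries: number;
  compressedBytes: number;
  declaredUncompressedBytes: number;
}

const EOCD_SIGNATURE = 0x06054b50; // end-of-central-directory record
const CENTRAL_HEADER_SIGNATURE = 0x02014b50; // central directory file header

// Sums the sizes declared in the central directory without inflating anything.
// ZIP64 archives are not handled in this sketch.
function inspectZipSizes(zip: Buffer): ZipSizeReport {
  // The EOCD record is at least 22 bytes long and sits at the end of the file.
  let eocd = -1;
  for (let i = zip.length - 22; i >= 0; i--) {
    if (zip.readUInt32LE(i) === EOCD_SIGNATURE) {
      eocd = i;
      break;
    }
  }
  if (eocd < 0) {
    throw new Error("Not a ZIP container: end-of-central-directory record not found");
  }

  const entries = zip.readUInt16LE(eocd + 10);
  let offset = zip.readUInt32LE(eocd + 16);
  let compressedBytes = 0;
  let declaredUncompressedBytes = 0;

  for (let n = 0; n < entries; n++) {
    if (offset + 46 > zip.length || zip.readUInt32LE(offset) !== CENTRAL_HEADER_SIGNATURE) {
      throw new Error("Malformed central directory");
    }
    compressedBytes += zip.readUInt32LE(offset + 20);
    declaredUncompressedBytes += zip.readUInt32LE(offset + 24);
    // Skip the fixed 46-byte header plus file name, extra field, and comment.
    offset += 46 + zip.readUInt16LE(offset + 28) + zip.readUInt16LE(offset + 30) + zip.readUInt16LE(offset + 32);
  }

  return { entries, compressedBytes, declaredUncompressedBytes };
}

// Flags a container whose declared expansion exceeds either threshold.
function looksLikeZipBomb(zip: Buffer, maxUncompressedBytes: number, maxRatio: number): boolean {
  const report = inspectZipSizes(zip);
  const ratio = report.compressedBytes === 0 ? 0 : report.declaredUncompressedBytes / report.compressedBytes;
  return report.declaredUncompressedBytes > maxUncompressedBytes || ratio > maxRatio;
}
```

A check like this is cheap because it never decompresses the payload, but we would still want to know which limits the SDK itself enforces.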


Thanks again — we’d be happy to schedule a technical call if that’s the fastest way to confirm these requirements.

Best regards,
Risto-Matti
Senior Software Engineer

@ristomattip,

One of our colleagues from the Aspose.Words team (@alexey.noskov, FYI) will assess your inquiries and provide detailed information regarding your needs. In the meantime, here is a comparison and conversion breakdown for handling XLSX files, considering the performance and redlining specifications you mentioned.

  1. XLSX Comparison (Redlining)

For “Redlining” in spreadsheets, the behavior differs slightly from Word’s Track Changes because Excel does not have an identical native “Track Changes” engine for document-level diffs.

  • GroupDocs.Comparison (Recommended for Redlining): GroupDocs.Comparison is the tool designed specifically for this task. It supports XLSX comparison and generates a result file in which differences (additions, deletions, and style changes) are visually highlighted (see version 25.11.0).
  • Aspose.Cells (Alternative): If you use Aspose.Cells, comparison is typically handled by manually iterating through cells to detect value or formula differences; refer to the linked thread. It is better suited to data-level reconciliation than to visual redlining.

  2. XLSX → PDF Conversion

Both the Cloud API and the self-hosted SDK provide high-fidelity Excel-to-PDF conversion.

  • High Fidelity: The Aspose.Cells SDK maintains original formatting, charts, and layout.
  • Formula Recalculation: A critical step for your workflow: before saving to PDF, you should call Workbook.CalculateFormula(). This ensures that all calculated values are up-to-date in the final PDF output.
  • Advanced Layout Options: You can use PdfSaveOptions to control how the spreadsheet fits on a page, such as forcing all columns onto a single page or converting specific worksheets only.
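
The recalculate-then-save order described above can be sketched as follows. This is a minimal illustration, not a verified integration: the `WorkbookLike` and `CellsModuleLike` types and the `convertXlsxToPdf` name are hypothetical, and the camelCase method names `calculateFormula()` and `save()` are an assumption about how the Node.js binding exposes `Workbook.CalculateFormula()` and saving by file extension. Please check them against the API reference of the package you install.

```typescript
// Hypothetical structural types covering only the calls used in this sketch.
interface WorkbookLike {
  calculateFormula(): void;
  save(fileName: string): unknown;
}

interface CellsModuleLike {
  Workbook: new (fileName: string) => WorkbookLike;
}

// Loads a spreadsheet, refreshes formula results, then saves it.
// The output format is assumed to be inferred from the ".pdf" extension.
function convertXlsxToPdf(cells: CellsModuleLike, inputPath: string, outputPdfPath: string): void {
  const workbook = new cells.Workbook(inputPath);
  // Recalculate first so the PDF shows current values, not cached ones.
  workbook.calculateFormula();
  workbook.save(outputPdfPath);
}

// Usage (assumes the installed Aspose.Cells module object is passed in):
//   convertXlsxToPdf(cellsModule, "report.xlsx", "report.pdf");
```

Passing the module in as a parameter keeps the sketch independent of which Aspose.Cells Node.js package is chosen.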

@ristomattip If I understand your requirements correctly, you would like to use the Aspose Cloud API. If so, you should post your questions in the Aspose Cloud forum.
You can learn how to compare documents using Aspose.Words Cloud here:
https://docs.aspose.cloud/words/compare/
And about document conversion here:
https://docs.aspose.cloud/words/convert/word-to-pdf/

If you would like to achieve the same using the on-premises version of Aspose.Words for Node.js, you can learn how to do so here:
https://docs.aspose.com/words/nodejs-net/compare-documents/
https://docs.aspose.com/words/nodejs-net/convert-a-document-to-pdf/

Aspose.Words supports updating the TOC, so there is no need to open the resulting document in MS Word to update it.
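
As a rough sketch of how comparison and the TOC update could fit together in the on-premises Node.js version: the `WordsDocumentLike` and `WordsModuleLike` types, the `compareAndRefreshToc` name, and the "redline-service" author are hypothetical, and the calls `compare(document, author, date)`, `updateFields()`, and `save(fileName)` are written as I understand them from the documentation linked above, so please verify them against the current API reference.

```typescript
// Hypothetical structural types covering only the calls used in this sketch.
interface WordsDocumentLike {
  compare(other: WordsDocumentLike, author: string, dateTime: Date): void;
  updateFields(): void;
  save(fileName: string): unknown;
}

interface WordsModuleLike {
  Document: new (fileName: string) => WordsDocumentLike;
}

// Compares two documents so the differences are recorded as revisions in the
// first one, refreshes fields (including the TOC), and saves the result.
function compareAndRefreshToc(
  aw: WordsModuleLike,
  originalPath: string,
  revisedPath: string,
  outputDocxPath: string,
): void {
  const original = new aw.Document(originalPath);
  const revised = new aw.Document(revisedPath);
  // Assumption: neither input already contains tracked revisions.
  original.compare(revised, "redline-service", new Date());
  original.updateFields();
  original.save(outputDocxPath);
}

// Usage (assumes the module is loaded as in the linked docs):
//   const aw = require("@aspose/words");
//   compareAndRefreshToc(aw, "v1.docx", "v2.docx", "redline.docx");
```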

It would be better to ask the questions in the Aspose Cloud forum. My colleagues from the Cloud team will help you shortly. This forum is about on-premises versions of Aspose products.