I initially spoke with your sales team regarding licensing and pricing, and they kindly directed me to the forums for more detailed technical and performance-related questions. I’m therefore providing the requested technical details below so you can better advise on performance, scalability, and limitations.
1. Product Version
We are currently evaluating:
- [Aspose.Words / GroupDocs.Comparison] Cloud API (latest available version)
- We are also considering the SDK / self-hosted deployment option for higher performance and fewer timeout constraints.
Please let us know if there is a specific recommended version for large-document workloads.
2. Programming Environment & Language
Our application stack is:
- Frontend: Next.js (TypeScript)
- Backend: Node.js / TypeScript API layer
- Document processing is handled server-side.
3. Evaluated Options
- For cloud usage, we would use your hosted API offering.
- For self-hosted / SDK usage, we would be open to any deployment approach as long as it can run on Linux.
4. Requirements Summary (Key Points)
Document Comparison (Redlining)
Must support:
- DOCX, XLSX, PDF comparison
- Output for DOCX comparison must be a Microsoft Word Track Changes document (not only a rendered PDF diff)
Document Conversion
Must support:
- DOCX → PDF
- XLSX → PDF
TOC / Field Code Update (TOC Template)
A critical requirement is:
- We generate the TOC in DOCX using a TOC template XML / Word field structure.
- Currently the TOC is not automatically updated correctly in our pipeline.
- We need the service to recalculate Word field codes / update fields programmatically so that the TOC is populated correctly without requiring the user to manually open Word/LibreOffice and update the TOC.
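For context, the field structure we mean looks roughly like the fragment below. This is a minimal TypeScript sketch, not our production code: the XML is an illustrative example rather than our actual template, and findTocInstructions is a hypothetical helper we could use in pipeline tests to confirm that a TOC field is present. It assumes the whole instruction sits in a single w:instrText element, which Word does not guarantee.

```typescript
// Illustrative WordprocessingML fragment (NOT our actual template).
// A complex field is delimited by begin / separate / end markers, and the
// cached result sits between "separate" and "end".
const sampleDocumentXml = `
<w:p>
  <w:r><w:fldChar w:fldCharType="begin"/></w:r>
  <w:r><w:instrText xml:space="preserve"> TOC \\o "1-3" \\h \\z \\u </w:instrText></w:r>
  <w:r><w:fldChar w:fldCharType="separate"/></w:r>
  <w:r><w:t>Placeholder text until the field is updated.</w:t></w:r>
  <w:r><w:fldChar w:fldCharType="end"/></w:r>
</w:p>`;

// Hypothetical helper: returns the instruction text of every TOC field
// found in the given document.xml content.
function findTocInstructions(documentXml: string): string[] {
  const pattern = /<w:instrText[^>]*>([^<]*)<\/w:instrText>/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(documentXml)) !== null) {
    const instruction = match[1].trim();
    // Keep only instructions whose field type is TOC.
    if (/^TOC(\s|$)/.test(instruction)) {
      found.push(instruction);
    }
  }
  return found;
}
```

A check like this only confirms that the field exists. The entries between the separate and end markers still have to be produced by a field update, which is the part we need from the service.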
5. Performance & File Size Questions
During initial testing with larger DOCX documents, we observed that:
- Cloud comparison requests timed out after approximately 20 minutes
Could you clarify:
- Maximum supported file sizes (Cloud vs SDK)
- Whether the Cloud timeout limit is configurable/extendable
- Recommended best practices for processing large regulatory-style documents
- Whether async/background job processing is available to avoid request timeouts
The larger the documents we can process reliably, the better.
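To make the async question concrete, this is roughly the client-side pattern we would adopt if a background job model is available. It is a sketch under assumptions: the job states, the resultUrl property, and the fetchStatus callback are hypothetical names of ours, not part of any API we have seen documented.

```typescript
// Hypothetical job states; a real service may use different names.
type JobState = "queued" | "running" | "succeeded" | "failed";

interface JobStatus {
  state: JobState;
  resultUrl?: string; // hypothetical: where the finished output could be fetched
}

// The caller supplies the status lookup, so no specific API is assumed here.
type StatusFetcher = (jobId: string) => Promise<JobStatus>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Polls until the job reaches a terminal state or the deadline would be exceeded.
async function waitForJob(
  jobId: string,
  fetchStatus: StatusFetcher,
  opts: { intervalMs: number; maxIntervalMs: number; deadlineMs: number }
): Promise<JobStatus> {
  const startedAt = Date.now();
  let interval = opts.intervalMs;

  while (true) {
    const status = await fetchStatus(jobId);
    if (status.state === "succeeded" || status.state === "failed") {
      return status;
    }
    // Give up if the next wait would take us past the deadline.
    if (Date.now() - startedAt + interval > opts.deadlineMs) {
      throw new Error(`Job ${jobId} did not finish within ${opts.deadlineMs} ms`);
    }
    await sleep(interval);
    // Exponential backoff, capped at maxIntervalMs.
    interval = Math.min(interval * 2, opts.maxIntervalMs);
  }
}
```

With this shape, a long comparison would no longer depend on a single HTTP request staying open, so the 20-minute timeout we observed would only matter if it also applies to the job itself.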
6. Security & Safety Considerations (SDK)
Since we may process user-uploaded documents, security is an important consideration for us.
Could you clarify:
- Does the SDK include built-in safeguards against malicious or malformed documents?
- How are potentially dangerous elements handled (e.g., embedded macros, external references, scripts, malformed XML, zip bombs)?
- Are there configurable limits for:
  - Memory usage
  - CPU usage
  - Maximum document structure depth
  - Maximum embedded object size
- Are documents fully parsed in memory, or is streaming supported?
- Do you provide any sandboxing recommendations for Linux deployments?
For example, if a user uploads a maliciously crafted DOCX file designed to exhaust resources or exploit the parser, what protections are in place within the SDK?
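As an illustration of the kind of safeguard we mean, below is a pre-check we could run ourselves before handing an upload to the SDK. DOCX and XLSX files are ZIP containers, so the sketch sums the sizes declared in the ZIP central directory and flags suspicious totals. The function names and limits are our own hypothetical choices. It does not handle ZIP64, and declared sizes can be forged, so this is only a first filter and not a replacement for limits enforced inside the parser.

```typescript
const EOCD_SIGNATURE = 0x06054b50; // end-of-central-directory record
const CENTRAL_HEADER_SIGNATURE = 0x02014b50; // central directory file header
const EOCD_MIN_SIZE = 22;
const CENTRAL_HEADER_SIZE = 46;

interface ZipTotals {
  entries: number;
  compressedBytes: number;
  uncompressedBytes: number;
}

// Reads the sizes declared in the central directory without inflating any data.
function readZipTotals(zip: Uint8Array): ZipTotals {
  const view = new DataView(zip.buffer, zip.byteOffset, zip.byteLength);

  // The EOCD record sits at the end of the file, so scan backwards for it.
  let eocd = -1;
  for (let i = zip.length - EOCD_MIN_SIZE; i >= 0; i--) {
    if (view.getUint32(i, true) === EOCD_SIGNATURE) {
      eocd = i;
      break;
    }
  }
  if (eocd < 0) {
    throw new Error("Not a ZIP container: end-of-central-directory record not found");
  }

  const entryCount = view.getUint16(eocd + 10, true);
  let offset = view.getUint32(eocd + 16, true);
  const totals: ZipTotals = { entries: 0, compressedBytes: 0, uncompressedBytes: 0 };

  for (let n = 0; n < entryCount; n++) {
    if (
      offset + CENTRAL_HEADER_SIZE > zip.length ||
      view.getUint32(offset, true) !== CENTRAL_HEADER_SIGNATURE
    ) {
      throw new Error("Malformed ZIP central directory");
    }
    totals.entries += 1;
    totals.compressedBytes += view.getUint32(offset + 20, true);
    totals.uncompressedBytes += view.getUint32(offset + 24, true);
    // Skip the variable-length file name, extra field, and comment.
    const nameLength = view.getUint16(offset + 28, true);
    const extraLength = view.getUint16(offset + 30, true);
    const commentLength = view.getUint16(offset + 32, true);
    offset += CENTRAL_HEADER_SIZE + nameLength + extraLength + commentLength;
  }
  return totals;
}

// Hypothetical policy check: flags archives that declare too much data
// or an implausibly high compression ratio.
function exceedsZipLimits(
  zip: Uint8Array,
  limits: { maxUncompressedBytes: number; maxRatio: number }
): boolean {
  const totals = readZipTotals(zip);
  const ratio = totals.uncompressedBytes / Math.max(totals.compressedBytes, 1);
  return totals.uncompressedBytes > limits.maxUncompressedBytes || ratio > limits.maxRatio;
}
```

We would prefer not to rely on checks like this alone, which is why we are asking what the SDK enforces internally.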
Thanks again — we’d be happy to schedule a technical call if that’s the fastest way to confirm these requirements.
Best regards,
Risto-Matti
Senior Software Engineer