Free Support Forum - aspose.com

Big word documents

Hi support,

Is there any way for Aspose.Words for Java to write really big documents, with tens or even hundreds of thousands of pages, and to do it faster than it does now? In our tests we found an upper limit of around a few tens of thousands of pages; when we try to push beyond it, we run into out-of-memory errors.
The code is something like:

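import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
System.out.println(System.currentTimeMillis()); // start of writing
// Build one paragraph of 500 short words.
String word = "";
for (int x = 0; x < 500; ++x)
{
    word += "a ";
}

// Write the paragraph 100,000 times.
for (int i = 0; i < 100000; ++i)
{
    builder.writeln(word);
}
System.out.println(System.currentTimeMillis()); // end of writing, start of saving
doc.save("C:\\Temp\\docBigDocument.doc");
System.out.println(System.currentTimeMillis()); // end of saving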

The saving process (45 seconds) takes much longer than the writing process (2.2 seconds).

We also noticed that the document size limit is higher when saving docx files: with docx, 200,000 words could be saved in 42 seconds, but when we tried to save the same content as a doc, we got an out-of-memory error.

Regards,
Milan

Hi

Thanks for your request. Memory usage depends on the document's size, format, and content. Usually Aspose.Words needs several times more memory than the document size to build the model of the document in memory.

Also, I would like to say that producing huge MS Word documents is not good practice. MS Word does not handle huge documents well: it usually takes a long time to open them, and sometimes MS Word just hangs. A normal MS Word document is 100 – 200 pages.

In the meantime, the only way to process really big documents is to give more heap space to your Java virtual machine. Aspose.Words will take a lot of memory when loading the document, but once you have finished processing it, all of that memory will be released and garbage collected quickly, so there will be only a short spike of high memory use.
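To check how much heap the JVM actually has before building a large document, a minimal sketch like the following may help. It uses only the standard java.lang.Runtime API; the class name HeapReport and its methods are hypothetical helpers for illustration and are not part of Aspose.Words.

```java
import java.util.Locale;

// Hypothetical helper (not part of Aspose.Words): reports the heap
// limits of the running JVM using the standard Runtime API.
final class HeapReport {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in MiB (raised with -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently occupied (allocated minus free), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    /** One-line summary suitable for logging before and after a save. */
    static String summary() {
        return String.format(Locale.ROOT, "max heap: %d MiB, used: %d MiB",
                maxHeapMiB(), usedHeapMiB());
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

The maximum heap is set when the JVM starts, for example with the standard -Xmx launcher option (java -Xmx4g ..., where 4g is only an illustrative value). Printing the summary before and after doc.save(...) would show whether the limit is being approached.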

Best regards,

Hi Andrey,

What about the difference in performance between saving in doc versus docx format? Is this controlled by the library? Is the docx model smaller than the doc model?

Also, what is the real size limit for both formats (pages, content, etc.)?

Regards,
Milan

Hi Milan,

Thanks for your inquiry. Please see the following link to learn more about the Aspose.Words Document Object Model (DOM) and its relationships:

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/aspose-words-document-object-model-1.html

An MS Word document is a flow document and does not contain any information about its layout into lines and pages. Therefore, technically there is no “Page” concept in a Word document, so I cannot tell you how many pages…

There is no fixed size limit; it all depends on the RAM installed on your PC and the complexity of the operations you perform on the document.

Best regards,