Free Support Forum - aspose.com

Inaccurate/incorrect word count


I am currently evaluating your product as a replacement for the MS Word interop calls in our ASP.NET application. The initial setup and coding were easy, but I have one very serious problem: the word count I get from Aspose.Words differs greatly from the MS Word word count. Unfortunately for us, word counting is critical for our application, as our clients are charged per 'chunk' of words they submit to us. I have attached a sample document which contains a range of text, images, 2 pasted Excel tables and 1 further Excel table pasted as a special object (which I don't expect to be counted).

Word calculates the word count to be 740 words, and if I count the words manually I also arrive at 740. I count the words with Aspose.Words for .NET in two ways:

1) Accessing BuiltInDocumentProperties.Words WITHOUT calling UpdateWordCount() first. That arrives at 664 words.

2) Calling UpdateWordCount() first and then accessing BuiltInDocumentProperties.Words. That arrives at 418 words.

Both differ hugely from the MS Word count and from my count by hand! I wouldn't mind a difference of a word or two, but a difference this large could mean hundreds of dollars of cost difference to us and our clients, and thus makes this tool unusable. I am pasting my code below in case I am doing something wrong. I hope you can advise me (and quickly) as to why the count is so inaccurate and what I can do about it.

MemoryStream wordDocumentStream = new MemoryStream(wordInputData);
Document doc = new Document(wordDocumentStream);

// Tell the document to update its calculation of the word count in the document
// and then extract the word count itself...
doc.UpdateWordCount(); // I comment this out sometimes to see what difference it makes
int numberOfWords = doc.BuiltInDocumentProperties.Words;

Just a thought that occurred to me: is the word count off because I am using a trial and you folks are truncating the document BEFORE I can extract the word count? It still doesn't explain why the word counts with and without the UpdateWordCount() call are so different, but I thought I'd ask.

As an accurate word count is as important to us as the actual word handling, I would need to feel confident in the product before buying a license. If you folks really think the trial truncation is the problem, is there any way you can let me trial the product without this limitation so we can feel confident in the product before buying?

Yup. This was solved once I applied the evaluation license file.
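
For anyone who finds this thread later, here is a rough sketch of what the working setup might look like. It assumes the license is applied through the Aspose.Words License class before the document is loaded. The file name "Aspose.Words.lic", the class name and the CountWords helper are made up for illustration and were not in my original code.

```csharp
using System.IO;
using Aspose.Words;

public static class WordCountExample
{
    // Hypothetical helper: returns the word count of a Word document held in memory.
    public static int CountWords(byte[] wordInputData)
    {
        // Apply the license first. Without one, the component runs in
        // evaluation mode, which is what appeared to skew my counts.
        // "Aspose.Words.lic" is an assumed file name; use your own license path.
        License license = new License();
        license.SetLicense("Aspose.Words.lic");

        using (MemoryStream wordDocumentStream = new MemoryStream(wordInputData))
        {
            Document doc = new Document(wordDocumentStream);

            // Recalculate the statistics, then read the word count
            // from the built-in document properties.
            doc.UpdateWordCount();
            return doc.BuiltInDocumentProperties.Words;
        }
    }
}
```

In a real application you would probably set the license once at startup rather than on every call; it is inlined here only to keep the sketch in one place.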



It is great that you have already found a way to resolve the problem.

Best regards,