Free Support Forum - aspose.com

Word/Character Count - Better in PDF?

I already have a post that discusses this problem a bit, but it seems to have died out (http://www.aspose.com/Forums/ShowPost.aspx?PostID=25392).

Your latest statement was that the word, line, paragraph and similar count properties are not updated by Aspose.Word.
If that is the case, then it should display exactly the same numbers as Word does, no?

As an alternative, I think it’s possible to save the Word document as a PDF with Aspose.Word and then read it with the PDF module, but I couldn’t figure out whether it is possible to use the PDF module to count the words.

Sorry to make such a big fuss about a few words, but I’m going to lose a pretty big contract just because I cannot count words accurately…

Thanks

Remy

Hi Remy,

Thank you for considering Aspose.

Yes, currently Aspose.Word just reads the properties from the document file. The problem is that MS Word seems to recalculate the document statistics when it opens a file, so what it shows differs from what is stored in the file even when no changes have been made…

Right now we are working on improving the document properties handling, and we will add a method like UpdateProperties to recalculate the word/character/paragraph counts. It will be added very soon.
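
As a rough illustration of what such a recalculation involves, here is a minimal Python sketch that recomputes character, word and paragraph counts from plain text. It is not Aspose.Word code: the function name `text_statistics` and the counting rules (whitespace-separated words, non-empty lines as paragraphs, characters counted without whitespace) are assumptions chosen for the example, and MS Word's own rules may differ.

```python
def text_statistics(text: str) -> dict:
    """Recompute simple statistics from plain text (illustrative rules only)."""
    # Assumption: every non-empty line is one paragraph.
    paragraphs = [line for line in text.splitlines() if line.strip()]
    # Assumption: a word is any run of non-whitespace characters.
    words = text.split()
    # Assumption: characters are counted without whitespace.
    characters = sum(len(word) for word in words)
    return {
        "paragraphs": len(paragraphs),
        "words": len(words),
        "characters": characters,
    }


sample = "First paragraph here.\n\nSecond one, with five words.\n"
stats = text_statistics(sample)
print(stats)
```

Different tokenization rules give different totals for the same text, so a counter like this may not match the numbers another application reports.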

OK, I guess I will try to implement that myself then, since you said in another post this might take a few months…
I might come up with some more questions then :)

No, we’ve just finished working on it, so I expect the ability to recalculate the properties will be included very soon. :)

Oh, cool!!!

I misread your last post:
"If you mean an estimated timeframe for implementing the pagination engine, this is a few months."

Didn’t see that you meant the pagination engine…

Hi,

We have released Aspose.Word 3.1.

  • Added Document.UpdateWordCount, which calculates the number of characters, words and paragraphs in a document and stores the values in the corresponding Document.BuiltInDocumentProperties.
  • http://aspose.com/Blogs/Roman.Korchagin/
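
The release note above implies that the stored statistics stay as they were in the file until the update method is called. The Python sketch below illustrates that pattern with a stand-in class. It is not the Aspose.Word API: `SketchDocument`, `built_in_properties` and `update_word_count` are hypothetical names, and the whitespace-based word count is an assumption made for the example.

```python
class SketchDocument:
    """Hypothetical stand-in for a document; not the Aspose.Word API."""

    def __init__(self, text: str, stored_words: int):
        self.text = text
        # Value as saved in the file; it may be stale.
        self.built_in_properties = {"Words": stored_words}

    def update_word_count(self) -> None:
        # Recalculate from the content and overwrite the stored value.
        # Assumption: words are whitespace-separated tokens.
        self.built_in_properties["Words"] = len(self.text.split())


doc = SketchDocument("one two three four", stored_words=3)

before = doc.built_in_properties["Words"]  # stale value read from the "file"
doc.update_word_count()
after = doc.built_in_properties["Words"]   # recalculated value

print(before, after)
```

Reading the property without calling the update first returns whatever number was saved with the file, which is why the call order matters.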

Cooool, will check that out tomorrow.
Thanks for the update.

Hello guys,
First, congratulations on the cool job with the new word count feature.
I’ve tested it a bit and unfortunately it produces slightly different results than MS Word. I’ve attached two examples. The first one (Test4_BulletList_NumberList.doc) has 29 words according to Word, but Aspose counts 28. The second one (Test3.doc) should have 850, but Aspose shows 850. Both documents are not very complex. At first I assumed it could be your watermark, but the exported document with the watermarks has about 890 words.
Any idea why this is?

Thanks

Remy

Hi Remy,

I’ve just tested your document, and the Document.BuiltInDocumentProperties.Words property shows 29 words, as expected. You probably forgot to call Document.UpdateWordCount before reading the properties. In that case the value stored in the document file is returned, as I mentioned above; that value is indeed 28, and it is incorrect.

What about the second document you mentioned? Please attach it if calling UpdateWordCount does not help.