Character Count With Spaces Not Working in Aspose.Words for Java 1.0.2.0


#1

Hi,
I just downloaded your Aspose.Words component for Java to test its ability to perform character counts on Word Documents. It appears that the 1.0.2.0 version is not working properly. For my simple test document (attached) I get the following results:

Document Name = C:\WordCount_Testing.doc
Pages: 10 Actual versus 1 calculated.
Paragraphs: 60 Actual versus Paragraphs = 61 calculated.
Lines: 366 Actual versus Lines = 357 calculated.
Words: 4259 Actual versus Words = 4283 calculated.
Characters: 21563 Actual versus Characters = 21713 calculated.
Characters With Spaces: 25856 Actual versus numCharsWithSpaces = 0 calculated.

As you can see - the values start to diverge once you get to Words, Characters and CharactersWithSpaces (which doesn’t seem to work at all).

Could you look into this and let me know if you can fix it? The test code I’m using to access the document and obtain the counts is at the bottom of this post:

Thanks & Regards,
Richard Conway
reconway@egrok.com

package com.egrok.word;

import com.aspose.words.Document;

/**
 * Tests the Aspose.Words library to see whether it obtains the correct
 * character count from Word documents.
 */
public class TestAsposeWordLibrary {

    // The backslash must be escaped inside a Java string literal.
    private static String wordDocument = "C:\\WordCount_Testing.doc";

    public TestAsposeWordLibrary() {
        super();
    }

    public static void main(String[] args) {
        // Open the test document and print out the word, line and character counts
        Document doc;
        try {
            doc = new Document(wordDocument);

            // Recalculate the statistics before reading them
            doc.updateWordCount();

            int pages = doc.getBuiltInDocumentProperties().getPages();
            int paragraphs = doc.getBuiltInDocumentProperties().getParagraphs();
            int lines = doc.getBuiltInDocumentProperties().getLines();
            int words = doc.getBuiltInDocumentProperties().getWords();
            int numChars = doc.getBuiltInDocumentProperties().getCharacters();
            int numCharsWithSpaces = doc.getBuiltInDocumentProperties().getCharactersWithSpaces();
            int numCount = doc.getBuiltInDocumentProperties().getCount(); // read but not printed below

            // The "Actual" values are the ones reported by MS Word for this document
            System.out.println("Document Name = " + wordDocument);
            System.out.println("Pages: 10 Actual versus " + pages + " calculated.");
            System.out.println("Paragraphs: 60 Actual versus Paragraphs = " + paragraphs + " calculated.");
            System.out.println("Lines: 366 Actual versus Lines = " + lines + " calculated.");
            System.out.println("Words: 4259 Actual versus Words = " + words + " calculated.");
            System.out.println("Characters: 21563 Actual versus Characters = " + numChars + " calculated.");
            System.out.println("Characters With Spaces: 25856 Actual versus numCharsWithSpaces = " + numCharsWithSpaces + " calculated.");
        } catch (Exception e) {
            e.printStackTrace();
        }
    } // end method main
}


#2

I should have included the following info to assist you with your troubleshooting:

Aspose.Words version 1.0.2.0 for JDK 1.5

My system:
Windows XP Pro SP2
JDK 1.5.0_6
Microsoft Word 2000


#3

Thank you for reporting this issue to us. Please attach the problem document. I’ll check what can be done.

Best regards,


#4

Hi,
Here is the test document again - but you should be able to reproduce this error with any Word 2000 document. I just tested it with a simple document containing one line and 31 characters with spaces (according to MS Word), and the code above returned 0. The word count was also incorrect (6 actual vs 27 calculated).

Regards,
Richard
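
For a one-line test like the one described above, the expected values can be cross-checked without any library. The following is a minimal sketch that assumes whitespace-separated words and a single line of plain text; Word's own counting rules may differ in edge cases. The class name TextCounts and the sample sentence are hypothetical (the sentence is not the line from the test document), chosen so that it has 6 words and 31 characters with spaces.

```java
// Hypothetical cross-check, independent of Aspose.Words.
// Assumes a single line of plain text and whitespace-separated words.
final class TextCounts {
    final int words;
    final int characters;           // non-whitespace characters only
    final int charactersWithSpaces; // every character, whitespace included

    private TextCounts(int words, int characters, int charactersWithSpaces) {
        this.words = words;
        this.characters = characters;
        this.charactersWithSpaces = charactersWithSpaces;
    }

    static TextCounts of(String text) {
        // Split on runs of whitespace; an empty or blank string has no words.
        String trimmed = text.trim();
        int words = trimmed.length() == 0 ? 0 : trimmed.split("\\s+").length;

        // Count everything that is not whitespace.
        int characters = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                characters++;
            }
        }
        return new TextCounts(words, characters, text.length());
    }

    public static void main(String[] args) {
        // Illustrative sample only, not the content of the test document.
        TextCounts c = TextCounts.of("The quick brown fox jumps over.");
        System.out.println("Words: " + c.words);
        System.out.println("Characters: " + c.characters);
        System.out.println("Characters With Spaces: " + c.charactersWithSpaces);
    }
}
```

For this sample the sketch reports 6 words, 26 characters and 31 characters with spaces, so a library result of 0 for Characters With Spaces on a comparable line cannot be right.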


#5

Please note that the Pages and Lines counts are not calculated by Aspose.Words but are taken as they are stored in the document. That is because we do not have a document page layout engine implemented yet.

Concerning the other count properties, I have logged this problem in our defect database as issue #1010. We will try to fix it in our next release, which will be out in 3-4 weeks. You will be informed of the result here in this thread.

Best regards,


#6

I understand that the Pages and Lines counts are not currently calculated by Aspose.Words. I’m assuming that the Character, Character with Spaces and Word counts are calculated by Aspose. I look forward to the bug fix. If it works as advertised, you’ll have another customer.

Best Regards,
Richard Conway
eGrok, Inc.


#7

Hi, Richard,

The catch is that you are using an evaluation license: the Aspose.Words engine inserts evaluation watermark text into the document, and the paragraphs, words and characters of that text are added to the counts :)

I checked your file with a valid license and got the correct numbers of Paragraphs, Words and Characters. So Characters With Spaces is the only bug, and we intend to fix it within the next few days.

Best Regards,
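
The watermark explanation can be checked against the numbers in post #1: if injected evaluation text is the only source of the discrepancy, each calculated count should exceed Word's count by a small positive amount. Below is a minimal sketch of that arithmetic. The class and method names are illustrative and not part of Aspose.Words; the values are copied from post #1.

```java
// Illustrative arithmetic only; not part of the Aspose.Words API.
final class CountDeltas {
    // Surplus of the calculated count over the count reported by MS Word.
    static int delta(int actual, int calculated) {
        return calculated - actual;
    }

    public static void main(String[] args) {
        // Values taken from the output in post #1.
        String[] names = {"Paragraphs", "Words", "Characters"};
        int[] actual = {60, 4259, 21563};
        int[] calculated = {61, 4283, 21713};

        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + ": +" + delta(actual[i], calculated[i]));
        }
    }
}
```

The surpluses come out as 1 paragraph, 24 words and 150 characters, which is what a short block of added text would produce. The thread does not state the exact watermark text, so this is only a consistency check.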


#8

Thanks for clarifying that - it makes sense. Could you email me when the new version is available for Java at reconway@egrok.com please?

Thanks & Regards,
Richard Conway


#9

OK, I will e-mail you when the new version is available.

Best regards,


#10

Aspose.Words for Java 1.0.3 is out. The issue was fixed.