word count

#1, server_crash, Dec 25th, 2004
I have a method that counts the number of words in a JTextArea. It works pretty well, except that it counts runs of non-letter characters as words (for example, "!@#$" would count as a word).

Here is the code I have so far (no errors, it compiles and runs fine; it just needs to be more specific about what it treats as a word):
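  import java.util.Scanner;
  import java.util.regex.Pattern;
  import javax.swing.JOptionPane;

  public void processWordCount()
  {
      String data = textArea2.getText();
      Scanner s = new Scanner(data);
      // Note: this pattern is never passed to the Scanner (there is no
      // useDelimiter call), so the Scanner keeps its default delimiter,
      // which is any whitespace.
      Pattern p = Pattern.compile(" ");
      String words = null;
      int count = 0;
      // Every whitespace-delimited token counts as one word.
      while (s.hasNext())
      {
          words = s.next();
          count += 1;
      }
      JOptionPane.showMessageDialog(null, "Word Count: " + count);
  }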
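For anyone who wants to reproduce the behavior without a GUI, here is a minimal sketch of the same counting loop with the Swing parts removed. The class name WordCountRepro and the method name countTokens are made up for illustration and are not part of the posted code.

```java
import java.util.Scanner;

class WordCountRepro {
    // Same counting loop as the posted method, minus the Swing parts:
    // every whitespace-delimited token is counted, whatever it contains.
    static int countTokens(String data) {
        Scanner s = new Scanner(data);
        int count = 0;
        while (s.hasNext()) {
            s.next();
            count += 1;
        }
        s.close();
        return count;
    }

    public static void main(String[] args) {
        // "!@#$" is a token like any other, so it is counted as a word.
        System.out.println(countTokens("Hello !@#$ world")); // prints 3
    }
}
```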

#2, jwenting, Dec 26th, 2004
Well, it works according to the standard definition of a word, which is anything delimited by whitespace.
Of course, if you only look for the space character, as the Pattern in your code does, it's not complete: you would fail to detect line breaks and tabs as word boundaries.

#3, server_crash, Dec 27th, 2004
What do you mean by detecting line breaks and tabs? Is this necessary?

#4, jwenting, Dec 27th, 2004
If you have a text that
has words on more than one line with
no space between them, looking only
for spaces as word boundaries
will mean you see
a lot
fewer
words
than
you
should.
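The effect described above can be sketched as follows, assuming the text is split with String.split. The class name SpaceVsWhitespace and both method names are hypothetical; this is not the code from the first post, only an illustration of space-only splitting versus whitespace splitting.

```java
class SpaceVsWhitespace {
    // Splits on runs of the space character only; line breaks and tabs
    // are NOT treated as word boundaries.
    static int countSplittingOnSpace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return 0;
        return trimmed.split(" +").length;
    }

    // Splits on any run of whitespace (spaces, tabs, line breaks).
    static int countSplittingOnWhitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return 0;
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String text = "will mean you see\na lot\nless\nwords";
        System.out.println(countSplittingOnSpace(text));      // prints 5
        System.out.println(countSplittingOnWhitespace(text)); // prints 8
    }
}
```

With space-only splitting, "see\na" and "lot\nless\nwords" each come out as a single token, which is why the count drops from 8 to 5.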

#5, server_crash, Dec 27th, 2004
I see what you mean, but actually the code I posted covers that. I tried this:

One
Two
Three
Four

Each on a separate line without any spaces, and it showed up as four words. I thought it would have the effect you were suggesting, but it didn't.

So do you personally think this would be ok, or would you make it more specific in what it defines as a word?

#6, jwenting, Dec 28th, 2004
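Yes, in your case it works for line breaks: the Pattern in your code is never handed to the Scanner, so the Scanner falls back to its default delimiter, which is any whitespace.
That covers tabs as well. It would not work for either if you actually set the delimiter to a single space.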
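A quick way to check how a Scanner really treats line breaks and tabs is to count tokens once with its default delimiter and once with the delimiter explicitly set to a single space. This is only a sketch; the class and method names are invented for the example, and the explicit useDelimiter(" ") call is an assumption about what a space-only version of the posted code would look like.

```java
import java.util.Scanner;

class ScannerDelimiterCheck {
    // Counts tokens using the Scanner's default delimiter, which matches whitespace.
    static int countWithDefaultDelimiter(String text) {
        Scanner s = new Scanner(text);
        int count = 0;
        while (s.hasNext()) {
            s.next();
            count += 1;
        }
        s.close();
        return count;
    }

    // Counts tokens when the delimiter is explicitly set to one space character.
    static int countWithSpaceDelimiter(String text) {
        Scanner s = new Scanner(text).useDelimiter(" ");
        int count = 0;
        while (s.hasNext()) {
            s.next();
            count += 1;
        }
        s.close();
        return count;
    }

    public static void main(String[] args) {
        String text = "One\nTwo\tThree Four";
        System.out.println(countWithDefaultDelimiter(text)); // prints 4
        System.out.println(countWithSpaceDelimiter(text));   // prints 2
    }
}
```

In this sketch the default delimiter splits on the line break and the tab as well as the space, while the space-only delimiter leaves "One\nTwo\tThree" as one token.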

#7, paradox814, Dec 28th, 2004
You could use:
java.util.StringTokenizer
java.util.regex.Pattern
But when you use Pattern, make sure that each string token contains at least one character matching something like [a-zA-z0-9]; if it does, then
count += 1;
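The StringTokenizer route might look roughly like the sketch below. It swaps the regular-expression check for Character.isLetterOrDigit, which is an assumption on my part and is slightly broader than an ASCII-only character class because it also accepts non-ASCII letters and digits. The class and method names are hypothetical.

```java
import java.util.StringTokenizer;

class TokenizerWordCount {
    // True if the token contains at least one letter or digit.
    static boolean hasLetterOrDigit(String token) {
        for (int i = 0; i < token.length(); i++) {
            if (Character.isLetterOrDigit(token.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Counts only the tokens that contain at least one letter or digit.
    static int countWords(String text) {
        // Default delimiters are space, tab, newline, carriage return and form feed.
        StringTokenizer st = new StringTokenizer(text);
        int count = 0;
        while (st.hasMoreTokens()) {
            if (hasLetterOrDigit(st.nextToken())) {
                count += 1;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("Hello !@#$ world")); // prints 2
    }
}
```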

#8, server_crash, Dec 28th, 2004
Thanks man, that helped a bunch.

#9, paradox814, Dec 29th, 2004
Whoops, I noticed an error in my regular expression; it should have been a capital Z:
[a-zA-Z0-9]
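Here is a minimal sketch of the Pattern approach with the corrected character class, assuming the tokens come from a Scanner as in the first post. The class and method names are made up for the example. The second method shows why the capital Z matters: the range A-z also covers the six ASCII characters between 'Z' and 'a', namely [ \ ] ^ _ and the backtick.

```java
import java.util.Scanner;
import java.util.regex.Pattern;

class RegexWordCount {
    // Matches any single ASCII letter or digit.
    private static final Pattern HAS_ALNUM = Pattern.compile("[a-zA-Z0-9]");

    // The character class with the lowercase-z typo, kept only for comparison.
    private static final Pattern TYPO_CLASS = Pattern.compile("[a-zA-z0-9]");

    // Counts whitespace-delimited tokens that contain at least one letter or digit.
    static int countWords(String text) {
        Scanner s = new Scanner(text);
        int count = 0;
        while (s.hasNext()) {
            if (HAS_ALNUM.matcher(s.next()).find()) {
                count += 1;
            }
        }
        s.close();
        return count;
    }

    // True if the token would be accepted by the class containing the typo.
    static boolean matchesTypoClass(String token) {
        return TYPO_CLASS.matcher(token).find();
    }

    public static void main(String[] args) {
        System.out.println(countWords("Hello !@#$ world")); // prints 2
        System.out.println(countWords("^_^"));              // prints 0
        System.out.println(matchesTypoClass("^_^"));        // prints true
    }
}
```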

#10, server_crash, Dec 29th, 2004
Thanks for correcting that. I'm getting ready to test it.

©2003 - 2009 DaniWeb® LLC