word count
I have a method that counts the number of words in a JTextArea. It works pretty well, except that it counts runs of non-letter characters as words (for example, "!@#$" would be counted as a word).
Here is the code I have so far (no errors; it compiles and runs fine, it just needs to be more specific about what it counts as a word):
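import java.util.Scanner;
import java.util.regex.Pattern;
import javax.swing.JOptionPane;

public void processWordCount()
{
    String data = textArea2.getText();
    Scanner s = new Scanner(data);
    // Never passed to s.useDelimiter(), so the Scanner keeps its default whitespace delimiter.
    Pattern p = Pattern.compile(" ");
    String words = null;
    int count = 0;
    // Every whitespace-delimited token is counted as a word, including "!@#$".
    while (s.hasNext())
    {
        words = s.next();
        count += 1;
    }
    JOptionPane.showMessageDialog(null, "Word Count: " + count);
}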
server_crash
Postaholic
2,111 posts since Jun 2004
Reputation Points: 113
Solved Threads: 20
Well, it works according to the standard definition of what a word is, which is anything delimited by whitespace.
Of course, if you only split on the " " pattern it's not complete, as you fail to detect line breaks and tabs as word boundaries.
jwenting
duckman
8,392 posts since Nov 2004
Reputation Points: 1,662
Solved Threads: 337
What do you mean by detecting line breaks and tabs? Is this necessary?
server_crash
If you have a text that
has words on more than one line with
no space between them, looking only
for spaces as word boundaries
will mean you see
a lot
fewer
words
than
you
should.
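To make the point above concrete, here is a minimal sketch that counts the same text twice: once with the Scanner's default delimiter (any whitespace) and once with a single space as the only word boundary. The class name DelimiterSketch and its helper methods are made up for this illustration and are not part of the code posted in the thread.

```java
import java.util.Scanner;

public class DelimiterSketch {
    // Counts the tokens the given Scanner produces, then closes it.
    static int countTokens(Scanner s) {
        int count = 0;
        while (s.hasNext()) {
            s.next();
            count++;
        }
        s.close();
        return count;
    }

    // Default Scanner delimiter: spaces, tabs and line breaks all separate words.
    static int countDefault(String text) {
        return countTokens(new Scanner(text));
    }

    // Only a single space is treated as a word boundary.
    static int countSpaceOnly(String text) {
        return countTokens(new Scanner(text).useDelimiter(" "));
    }

    public static void main(String[] args) {
        String text = "One\nTwo\tThree Four";
        System.out.println("default delimiter: " + countDefault(text));   // 4
        System.out.println("space only:        " + countSpaceOnly(text)); // 2
    }
}
```

With the space-only delimiter, "One", "Two" and "Three" are glued into a single token because the line break and the tab are not seen as boundaries.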
jwenting
I see what you mean, but actually the code I posted covers that. I tried this:
One
Two
Three
Four
On separate lines without any spaces, and it showed up as four words. I thought it would have the effect you were suggesting.
So do you personally think this would be ok, or would you make it more specific in what it defines as a word?
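One possible way to make the definition more specific is to count runs of letters instead of whitespace-delimited tokens. The sketch below is only an illustration under that assumption: the class name LetterWordCountSketch, the method countLetterWords and the choice to allow apostrophes inside a word are all made up here, not something settled in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LetterWordCountSketch {
    // Assumed definition of a word: one or more letters, optionally followed by
    // groups of an apostrophe plus more letters (so "don't" is a single word).
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:'\\p{L}+)*");

    // Counts every non-overlapping match of WORD in the text.
    static int countLetterWords(String text) {
        Matcher m = WORD.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLetterWords("Hello, world! !@#$ don't")); // 3
        System.out.println(countLetterWords("!@#$"));                     // 0
    }
}
```

In processWordCount() the string passed in would be textArea2.getText(). Note that under this definition digits split words, so a token such as "abc123def" counts as two words; whether that is acceptable depends on what you want a word to be.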
server_crash
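Yes, in your case it works for line breaks because the Pattern you compile is never passed to the Scanner, so the Scanner uses its default delimiter, which is any whitespace.
That default covers tabs as well; only an explicit single-space delimiter would miss them.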
jwenting
Thanks man, that helped a bunch.
server_crash
Thanks for correcting that, I'm getting ready to test it.
server_crash