I am working on a project to write a program that finds the 10 most used words in a text, but I got stuck and don't know what to do next. Can someone help me, please?

This is as far as I have come:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Scanner;

public class Lab4 {

    public static void main(String[] args) throws FileNotFoundException {

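        // Split the input on every run of non-letter characters.
        Scanner file = new Scanner(new File("text.txt")).useDelimiter("[^a-zA-Z]+");

        List<String> words = new ArrayList<String>();

        while (file.hasNext()) {
            // Lowercase each word so that "The" and "the" count as the same word.
            String tx = file.next().toLowerCase();
            words.add(tx);
        }
        file.close();

        // Sorting puts identical words next to each other.
        Collections.sort(words);
        // System.out.println(words);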

    }

}

You could build a HashMap<String, Integer> with the word as the key and a count as the value. For every word in the file: if it's already in the Map, add one to its count; if it's not in the Map yet, add it with a count of one. When you've finished, you can loop through the Map to find the entries with the highest counts.
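
A minimal sketch of that approach, assuming the words have already been read into a list as in the original program. The class name WordCountSketch and the methods countWords and topWords are made up for this illustration and are not part of the original code. Since the goal is the 10 most used words rather than a single maximum, the sketch sorts the Map entries by count instead of looping for one highest value; that is only one of several possible ways to do it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration only.
class WordCountSketch {

    // Count how often each word occurs: word as key, count as value.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String word : words) {
            Integer count = counts.get(word);
            if (count == null) {
                counts.put(word, 1);          // first time we see this word
            } else {
                counts.put(word, count + 1);  // seen before: add one to its count
            }
        }
        return counts;
    }

    // Return up to 'limit' entries, highest count first; ties are ordered alphabetically.
    static List<Map.Entry<String, Integer>> topWords(Map<String, Integer> counts, int limit) {
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                int byCount = b.getValue().compareTo(a.getValue()); // descending by count
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            }
        });
        return entries.subList(0, Math.min(limit, entries.size()));
    }

    public static void main(String[] args) {
        // Sample data standing in for the words read from text.txt.
        List<String> words = Arrays.asList("the", "cat", "and", "the", "dog", "and", "the", "bird");
        for (Map.Entry<String, Integer> entry : topWords(countWords(words), 10)) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

In the original program, the list passed to countWords would be the words list filled from the Scanner, and the sort of that list would no longer be needed.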

The HashMap is a good suggestion, although I want to add something to it.
Personally, I wouldn't recommend putting an Integer in it: Integer is immutable, so you'll have to issue a put() every time you want to update the count.
I suggest you create a Counter class that wraps an int field and provides an increment() method. Something like this:

public class Counter
{
    private int count;

    public Counter()
    {
        this(0);
    }

    public Counter(int seed)
    {
        this.count = seed;
    }

    public void reset()
    {
        this.count = 0;
    }

    public void increment()
    {
        this.count++;
    }

    public int getCount()
    {
        return this.count;
    }
}

Every time you fetch a word, do the following:

// Declared once, before reading the file:
// Map<String, Counter> countsByWords = new HashMap<String, Counter>();
Counter c = countsByWords.get(word);
if (c == null)
{
    // First time we encounter this word, create a counter for it
    // and put it in the Map.
    countsByWords.put(word, new Counter(1));
}
else
{
    c.increment(); // increment the count for this word
}

EDIT #0: Can anyone please link me to the post formatting guide? I've been away for quite some time, and a lot of things seem to have changed. I'd like proper code tag formatting for the language of my code example, but the editor seems to assume otherwise.

EDIT #1: I just refreshed the page, and code highlighting is in place... Is code highlighting handled automatically now?

I'm not sure why you see put as a problem, but if you want a mutable integer value there's no need to write a class. You could use an AtomicInteger, or just use a HashMap<String, int[]> with a 1-element int array as the value and keep incrementing its zeroth element.
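
To make the two alternatives concrete, here is a rough sketch of both. The class name MutableCountSketch and its method names are invented for this example; AtomicInteger comes from java.util.concurrent.atomic and is used here only as a ready-made mutable int holder, not for thread safety.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class for illustration only.
class MutableCountSketch {

    // Variant 1: AtomicInteger as the mutable count.
    static Map<String, AtomicInteger> countWithAtomicInteger(String[] words) {
        Map<String, AtomicInteger> counts = new HashMap<String, AtomicInteger>();
        for (String word : words) {
            AtomicInteger count = counts.get(word);
            if (count == null) {
                counts.put(word, new AtomicInteger(1)); // put() only once per distinct word
            } else {
                count.incrementAndGet();                // update in place, no put()
            }
        }
        return counts;
    }

    // Variant 2: a 1-element int array as the mutable count.
    static Map<String, int[]> countWithIntArray(String[] words) {
        Map<String, int[]> counts = new HashMap<String, int[]>();
        for (String word : words) {
            int[] count = counts.get(word);
            if (count == null) {
                counts.put(word, new int[] { 1 });      // put() only once per distinct word
            } else {
                count[0]++;                             // increment the zeroth element
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = { "to", "be", "or", "not", "to", "be" };
        System.out.println("to: " + countWithAtomicInteger(words).get("to").get());
        System.out.println("be: " + countWithIntArray(words).get("be")[0]);
    }
}
```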

I pointed that out because it is possible to avoid the overhead of put() each time a count needs to be updated (except for the first time a word is counted).
Apart from that, I agree with the rest of your post.

Scanner file = new Scanner(new File("text.txt")).useDelimiter("[^a-zA-Z]+");

I just want to ask about this line of your program. Do you intend to count only words that consist entirely of letters? If a word has a hyphen in the middle, do you count it as two words or one? (e.g. "Is your job part-time or full-time?") If you count each of them as two words, the result may not be correct. Also, what about words containing a number? (e.g. "My constructor1 and constructor2 implementations are totally different.") Just my 2 cents...
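
A small sketch of what the delimiter does with such input. The class name DelimiterSketch, the split helper, and the second pattern "[^a-zA-Z0-9-]+" are assumptions made up for this example; the second pattern is just one possible way to keep hyphens and digits inside a word, and it would also keep a stray hyphen that stands on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical class for illustration only.
class DelimiterSketch {

    // Split the text into tokens using the given delimiter regex, like the original Scanner does.
    static List<String> split(String text, String delimiterRegex) {
        List<String> tokens = new ArrayList<String>();
        Scanner scanner = new Scanner(text).useDelimiter(delimiterRegex);
        while (scanner.hasNext()) {
            tokens.add(scanner.next());
        }
        scanner.close();
        return tokens;
    }

    public static void main(String[] args) {
        String text = "Is your job part-time? See constructor1.";

        // Delimiter from the original program: everything except letters separates words,
        // so "part-time" becomes two words and the digit in "constructor1" is dropped.
        System.out.println(split(text, "[^a-zA-Z]+"));

        // Assumed alternative: also treat digits and hyphens as part of a word.
        System.out.println(split(text, "[^a-zA-Z0-9-]+"));
    }
}
```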

I used .useDelimiter("[^a-zA-Z]+") so that it only reads words made of the letters a to z and nothing else.
