How to discard stopwords stored in a class and do a word count?
package umadas.examples;

import java.io.*;
import java.util.*;

// Mutable counter stored as the map value; starts at 1 for the first occurrence.
class Counters {
    int iCount = 1;
}

public class Wordfreq {
    public static Reader r;
    private static String str = "";

    public static void main(String args[]) {
        try {
            r = new BufferedReader(new FileReader(args[0]));
            fnWordFrequency();
            r.close();
        } catch (IOException e) {
            System.err.println(e);
        }
    }

    public static void fnWordFrequency() {
        HashMap<String, Counters> map = new HashMap<String, Counters>();
        try {
            StreamTokenizer st = new StreamTokenizer(r);
            int iToken = st.nextToken();
            while (iToken != StreamTokenizer.TT_EOF) {
                if (iToken == StreamTokenizer.TT_WORD) {
                    if (map.containsKey(st.sval)) {
                        map.get(st.sval).iCount++;
                    } else {
                        map.put(st.sval, new Counters());
                    }
                }
                iToken = st.nextToken();
            }
        } catch (IOException e) {
            System.err.println(e);
            return;
        }

        // Print each word with its count and collect the same text for the output file.
        for (Map.Entry<String, Counters> ent : map.entrySet()) {
            String sWord = ent.getKey();
            int iCounter = ent.getValue().iCount;
            str += sWord + "\t" + iCounter + "\t";
            System.out.println(sWord + "\t" + iCounter);
        }

        // try-with-resources closes the output file even if the write fails.
        try (OutputStream f1 = new FileOutputStream("C:/trial/umadas.txt")) {
            f1.write(str.getBytes());
            System.out.println("\n");
        } catch (IOException io) {
            System.out.println(io.getMessage());
        }
    }
}

This program tokenizes the words read from a file named on the command line and prints the frequency of each word, e.g. "PROGRAM 25".

Now I have a class containing stopwords (such as "a" and "an") stored in a Hashtable. I want to read this stopwords file and discard those words, so that the output above is printed without the stopwords. Kindly help.
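One possible approach, sketched below, is to check each token against the stopword set before counting it. The class name `StopwordCount`, the helpers `words`, `loadStopwords` and `countWords`, and the layout of the stopword file (whitespace-separated words) are all assumptions made for illustration; they are not part of the program above. Lowercasing the tokens is also an assumption, so that "A" and "a" are treated as the same word. If the stopwords already live in a `java.util.Hashtable`, its `keySet()` (or `containsKey`) could likely be used in place of the loaded set.

```java
import java.io.*;
import java.util.*;

class StopwordCount {

    // Hypothetical helper: returns every word token from the reader, lowercased.
    public static List<String> words(Reader in) {
        List<String> out = new ArrayList<>();
        StreamTokenizer st = new StreamTokenizer(in);
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval.toLowerCase(Locale.ROOT));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    // Assumes the stopword source is plain text with whitespace-separated words.
    public static Set<String> loadStopwords(Reader in) {
        return new HashSet<>(words(in));
    }

    // Counts every word that is not in the stopword set; TreeMap keeps the output sorted.
    public static Map<String, Integer> countWords(Reader in, Set<String> stop) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : words(in)) {
            if (!stop.contains(w)) {
                counts.merge(w, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Assumed usage: java StopwordCount <text file> <stopword file>
    public static void main(String[] args) throws IOException {
        try (Reader text = new BufferedReader(new FileReader(args[0]));
             Reader stopFile = new BufferedReader(new FileReader(args[1]))) {
            Set<String> stop = loadStopwords(stopFile);
            for (Map.Entry<String, Integer> e : countWords(text, stop).entrySet()) {
                System.out.println(e.getKey() + "\t" + e.getValue());
            }
        }
    }
}
```

Note that with its default settings `StreamTokenizer` keeps a trailing "." as part of a word, so "cat." and "cat" would be counted separately unless the tokenizer is configured otherwise or the punctuation is stripped first.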