I have many text files, about 4 GB in total, in Russian, Greek and English.

Is there a way (a program or other software) to find the most frequent words in these files?

I want it to produce a list of words ordered from most used to least used.

I know only C and Matlab. Thanks in advance.

All 2 Replies

To keep the code short, I'd suggest using C++ and std::map from <map>; it's much easier. You just need a helper function, named perhaps split, that takes a const string reference as its argument and returns a vector of strings.
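In case the linked sample is unavailable, here is a minimal sketch of that idea, not the original sample code. The names split, count_words and by_frequency are just illustrative. It assumes the files are UTF-8, so splitting on ASCII whitespace leaves Russian and Greek words intact. It does no case folding or punctuation stripping, and it needs C++14 for the generic lambda.

```cpp
#include <algorithm>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Illustrative helper: split a line into whitespace-separated words.
// UTF-8 multi-byte characters are never mistaken for whitespace here,
// because all of their bytes are >= 0x80.
std::vector<std::string> split(const std::string& line) {
    std::vector<std::string> words;
    std::istringstream in(line);
    std::string word;
    while (in >> word) {
        words.push_back(word);
    }
    return words;
}

// Add every word of `line` to the running tally.
void count_words(const std::string& line,
                 std::map<std::string, long long>& counts) {
    for (const std::string& word : split(line)) {
        ++counts[word];  // operator[] creates missing entries with count 0
    }
}

// Return (word, count) pairs ordered from most to least frequent.
// stable_sort keeps the map's byte-wise key order among equal counts.
std::vector<std::pair<std::string, long long>>
by_frequency(const std::map<std::string, long long>& counts) {
    std::vector<std::pair<std::string, long long>> sorted(counts.begin(),
                                                          counts.end());
    std::stable_sort(sorted.begin(), sorted.end(),
                     [](const auto& a, const auto& b) {
                         return a.second > b.second;
                     });
    return sorted;
}
```

Call count_words once per line of input, then by_frequency on the finished map; printing each pair gives the list from most to least used. If "Word" and "word" should count as the same word, the words would have to be normalized before counting, which for Russian and Greek needs a Unicode-aware library rather than std::tolower.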

get the sample code here

You just have to add file handling, so that instead of reading its input from the keyboard, the program reads it from the specified file.

******* HAVE PHUN C0DiNG *******

commented: Good advice +14

Hi, I am not sure, but a program called "Crawler" might solve your problem.
Crawlers like this are also used by many websites, such as Google.
Maybe it would be of some help to you as well.

commented: nice attempt to solve the question +0