I have developed a program for a word game that reads from a text file containing a list of all legal English words, then stores those words in an array used by other parts of the program. The program works, but I am not very happy with the word list that I have used as a text file, since it seems to contain words that don't exist like "abd", acronyms, multi-word phrases, proper names, chemical element symbols like "Cm", and worst of all, swear words. I have found many a word list, but they all seem to either be specialized word lists, contain only commonly used words, contain arcane words that haven't been used since the 15th century, contain words that appear not to exist, contain acronyms or proper names, or most bizarrely, truncate words after eight letters.

I am looking for a regular old text file containing what would be found in a large modern dictionary like Merriam-Webster, which has a good online dictionary but (not surprisingly) doesn't furnish a text file of its word list. I don't need the definitions, only the words themselves. Any ideas? Thanks.

Thanks for the tip, vijayan121. It's better than the one I had for sure. It still has a bunch of "words" with numbers in them, proper names, acronyms, and words like "abd", which I'm pretty sure is not a word. However, like I said, it's better than what I have now and if need be, I can write a little program that strips out words with digits, capital letters, etc., and get a cleaner version. That'll get most of the dubious words out, but I am worried about the words like "abd". Before I do that, though, does anyone else have any leads on a cleaner word list?
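
For anyone wanting to try the clean-up program described above, here is a minimal sketch in Python. It assumes the word list has one entry per line; the names `is_plain_word` and `clean_word_list` are made up for illustration and are not from any library. It keeps only entries made entirely of lowercase letters a-z, which drops anything with digits, capitals, spaces, or periods.

```python
import string

# Assumption: a "plain" word uses only the 26 lowercase ASCII letters.
LOWERCASE = set(string.ascii_lowercase)


def is_plain_word(entry: str) -> bool:
    """Return True if entry is non-empty and contains only letters a-z."""
    return bool(entry) and all(ch in LOWERCASE for ch in entry)


def clean_word_list(lines):
    """Strip each line, keep the plain words, and return them sorted and de-duplicated."""
    kept = set()
    for line in lines:
        entry = line.strip()
        if is_plain_word(entry):
            kept.add(entry)
    return sorted(kept)


if __name__ == "__main__":
    # Stand-in data; a real run would pass an open file object instead.
    sample = ["apple\n", "Cm\n", "abd\n", "a priori\n", "etc.\n", "3d\n", "zebra\n", "apple\n"]
    print(clean_word_list(sample))  # prints ['abd', 'apple', 'zebra']
```

Note that "abd" survives, since a character filter cannot tell a dubious word from a real one. It also throws away hyphenated and accented entries, which may or may not be what you want.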

Thanks. Definitely better off than I was before.

I doubt you will ever find a perfect word list -- after all, even Webster's dictionary contains foreign words, acronyms, abbreviations, etc. Then there's the issue of whose version of English you want to use: Oxford English, the Queen's English, American English, etc. And what will you do with all the foreign words?

O.K. I guess since it's my game I get to make the call as to what is a word and what is not. With my vocabulary, I'm not sure I'm up to the job. ;) I think I'll just weed out the entries with capital letters, digits, spaces, periods, etc., and maybe put the resulting document through spell-check. Hopefully there'll be few enough "misspellings" to handle them on a case-by-case basis. Thanks guys.

SOWPODS, the Scrabble dictionary, might be just the thing. If it's 'legal' in Scrabble, should be legal for your game as well. Just search for "SOWPODS".
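
If you already have a list you mostly like, one way to use a reference list such as SOWPODS is to cross-check against it and review whatever it doesn't contain. A rough sketch in Python, assuming both lists are plain text with one word per line and that the reference copy may be in uppercase (check your copy); `load_words` and `split_by_reference` are hypothetical helper names, and the sample data is a stand-in, not actual file contents.

```python
def load_words(lines):
    """Return a set of lowercased, stripped entries, skipping blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def split_by_reference(candidates, reference):
    """Split candidates into (accepted, rejected) by membership in reference."""
    accepted = sorted(candidates & reference)
    rejected = sorted(candidates - reference)
    return accepted, rejected


if __name__ == "__main__":
    # Stand-in data; a real run would pass two open file objects to load_words.
    candidates = load_words(["abd\n", "apple\n", "zebra\n"])
    reference = load_words(["APPLE\n", "ZEBRA\n", "QUIZ\n"])
    accepted, rejected = split_by_reference(candidates, reference)
    print(accepted)  # prints ['apple', 'zebra']
    print(rejected)  # prints ['abd']
```

The rejected list is the short pile to go through by hand, which is where entries like "abd" should end up.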

I doubt that Vernon is still developing this game, four years later. But the Scrabble dictionary is a good suggestion that others might benefit from. For that reason, I won't vote down your post, but be careful in the future not to resurrect threads from years past.

This thread is old!

It can still help someone :)

This has helped me :) Cheers all

@cbielich thanks bub u made my day.......
