I am very confused about how to find out whether a file contains English words. Say the file contains garbage letters and numbers, but somewhere in that garbage there is something meaningful. Say "hello there" is in that file: how do I write the code so that it knows those words are English?


All 5 Replies

Well ... just check all the words in the file against an English dictionary.

How can I check against an English dictionary? Is there an import that I can use for that?

You can read through a text file and compare each word in it against a word list of your (or your user's) choice. If there is a match, the word is English, at least as far as that list is concerned.

The game "Scrabble" needs a list of valid English words. If you search the web you will find downloadable files with suitable word lists.
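To make the word-list idea concrete, here is a minimal Python sketch. The thread never says which language is in use, so Python is an assumption, and the names `load_words` and `find_english_words`, the `min_len` cutoff, and the tiny stand-in dictionary are all made up for illustration. A real run would load one of the downloadable word lists instead.

```python
import re

def load_words(path):
    """Read a one-word-per-line word list into a lowercase set (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def find_english_words(text, dictionary, min_len=3):
    """Return the runs of letters in `text` that appear in `dictionary`.

    min_len is an assumed cutoff: very short runs such as "a" or "at"
    match by accident too often in random garbage.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    return [t for t in tokens if len(t) >= min_len and t.lower() in dictionary]

# Tiny stand-in dictionary; in practice use load_words("your_word_list.txt").
sample_dictionary = {"hello", "there", "world"}
print(find_english_words("x9#q hello 42zz there !!kq", sample_dictionary))
```

Note that this only finds words that stand apart from the surrounding letters. A word buried inside a longer run of letters (say "xxhelloxx") would need a substring search instead.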

If you are talking about the file's content, then you could try reading it in as binary and counting the non-printable ASCII characters (anything below 32 or above 126, though you would probably want to allow tab, line feed, and carriage return, since those appear in ordinary text). Compare the number of non-printable characters with the total number of characters read, for example as a ratio, and decide from there. The threshold could be adjusted depending on how many characters you read in.

The method is not perfect, but it should save you a lot of time when the file is huge. The dictionary-check approach is not a perfect solution either. You need to choose the solution that fits your purpose.
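A rough sketch of the ratio idea in Python (the language is again an assumption). The function names, the 4096-byte sample size, and the 0.30 threshold are arbitrary illustrative choices, not recommended values. This version treats bytes 32 to 126 as printable and also lets tab, line feed, and carriage return through.

```python
def non_printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are neither printable ASCII (32..126) nor common whitespace."""
    if not data:
        return 0.0
    allowed_whitespace = {9, 10, 13}  # tab, line feed, carriage return
    bad = sum(1 for b in data if not (32 <= b <= 126 or b in allowed_whitespace))
    return bad / len(data)

def looks_like_text(path, sample_size=4096, threshold=0.30):
    """Read only the first sample_size bytes so huge files stay cheap to check.

    Both defaults are assumptions; tune them for your own files.
    """
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    return non_printable_ratio(sample) <= threshold

print(non_printable_ratio(b"hello there\n"))  # every byte is allowed
print(non_printable_ratio(b"ab\x00\x01"))     # two of the four bytes are non-printable
```

A low ratio only tells you the file is probably plain text. It says nothing about whether that text is English, so you may still want the word-list check afterwards.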
