0

I am very confused about how to find out whether a file contains English words. Say the file contains garbage letters and numbers, but somewhere in that garbage there is something meaningful. For example, if "hello there" appears in the file, how do I write the code so that it knows those words are English?

Last Post by Taywin
0

How can I check against an English dictionary? Is there an import that I can use for that?

1

You can read through a text file and compare the words in it to a word of your (or your user's) choice. If there is a match, the word is English; if nothing matches, treat it as not English.

Edited by Slavi

1

The game "Scrabble" needs a list of valid English words. If you search the web you will find downloadable files with suitable word lists.
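As a rough illustration of the word-list idea, here is a minimal Python sketch. It assumes you have downloaded a plain-text word list with one word per line; the `load_words` helper, the `min_len` parameter, and the tiny sample set are all made up for the example. It pulls runs of letters out of the garbage and keeps the ones found in the list.

```python
import re

def load_words(path):
    """Read a word list (assumed: one word per line) into a lowercase set."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def english_words_in(text, dictionary, min_len=2):
    """Return the runs of letters in text that appear in the dictionary."""
    # Split the text into candidate words: anything that is not A-Z or a-z
    # (digits, punctuation, whitespace) acts as a separator.
    candidates = re.findall(r"[A-Za-z]+", text)
    return [w for w in candidates
            if len(w) >= min_len and w.lower() in dictionary]

# A tiny stand-in for a real downloaded word list (hypothetical data).
sample_dictionary = {"hello", "there", "world"}
print(english_words_in("x9$hello 77 there!!qzv", sample_dictionary))
# prints ['hello', 'there']
```

This only finds words that are separated from the garbage by non-letters. A word glued to other letters (for example "qzhelloqz") would need a substring search against the list, which is slower.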

0

If you are talking about file content, then you may try to read it in as binary. You could count the non-printable ASCII characters (anything less than 32 or greater than 126), then compare that count with the total number of characters read, for example as a ratio, and decide from there. The ratio threshold could be adjusted depending on the number of characters read in.

The method is not perfect, but it should save you a lot of time when a file is huge. The dictionary-check approach is not a perfect solution either. You need to choose the solution that matches your purpose.
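The ratio idea could be sketched in Python roughly like this. The 0.05 threshold, the 4096-byte sample size, and the function names are arbitrary choices for illustration. Treating tab, newline, and carriage return as acceptable is also an assumption added here, since ordinary text files contain them.

```python
def non_printable_ratio(data):
    """Fraction of bytes outside printable ASCII (32..126)."""
    if not data:
        return 0.0
    # Assumption: tab (9), line feed (10) and carriage return (13)
    # are normal in text files, so they are not counted as garbage.
    allowed_whitespace = {9, 10, 13}
    bad = sum(1 for b in data
              if not (32 <= b <= 126 or b in allowed_whitespace))
    return bad / len(data)

def looks_like_text(data, threshold=0.05):
    """Guess that data is text if few bytes are non-printable."""
    return non_printable_ratio(data) <= threshold

def file_looks_like_text(path, sample_size=4096, threshold=0.05):
    """Check only the first sample_size bytes, which helps with huge files."""
    with open(path, "rb") as f:
        return looks_like_text(f.read(sample_size), threshold)

print(non_printable_ratio(b"ab\x00\xff"))  # prints 0.5
print(looks_like_text(b"hello there\n"))   # prints True
```

Note that this only tells you whether the bytes look like plain text at all. A file of random letters and digits would still pass, so it works better as a quick first filter before a dictionary check than as a replacement for one.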

Edited by Taywin
