Hi,

I am developing an application in Python, and I want a feature that detects whether a given name is a male or a female name. I did a Google search on this, but could not find any algorithm/code by which I can do it. I finally found a website that does this pretty well:

http://www.i-gender.com/

So I was thinking to use their API in my application.

But before that, I wanted to know how they detect the gender from a name. Is it really possible to do it algorithmically? If yes, please suggest some docs/links.

Here is what I am trying with the http://www.i-gender.com/ API:

>>> import urllib2
>>> import json
>>> req = urllib2.Request("http://www.i-gender.com/ai", "name=jhony")
>>> resp = urllib2.urlopen(req).read()
>>> result = json.loads(resp)
>>> print result['gender']
male
>>> print result['confidence']
100
>>> 

Thanks in Advance,

I did a Google search on this, but could not find any algorithm/code by which I can do it.

I imagine the best approach would be a lookup from a database of names and gender probabilities. At its simplest, gender-neutral names would be 50% confident and strongly gendered names 100%. It could be improved by including birth year, with probabilities based on that name's popularity for each gender in that year. Another approach would be to collect the name and gender of as many people as possible from census collections and base your probability on each gender's share of the total count for a name. So if you have 100,000 Jamie(M) and 76,000 Jamie(F), then the confidence of Jamie being male is 100,000 / 176,000, or about 57%.
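As a rough illustration of the count-based lookup, here is a minimal Python 3 sketch. The table name `NAME_COUNTS` and the function `guess_gender` are made up for this example, and the only data in it is the Jamie figure from above; a real table would have to come from census or similar records.

```python
# Hypothetical lookup table: name -> count of people per gender.
# Only the Jamie figures come from the example above.
NAME_COUNTS = {
    "jamie": {"male": 100_000, "female": 76_000},
}


def guess_gender(name, counts=NAME_COUNTS):
    """Return (gender, confidence) for a name, or (None, 0.0) if unknown.

    Confidence is the winning gender's share of all recorded people
    with that name, so it ranges from 0.5 (neutral) to 1.0.
    """
    entry = counts.get(name.strip().lower())
    if not entry:
        return None, 0.0
    total = entry["male"] + entry["female"]
    if total == 0:
        return None, 0.0
    gender = max(entry, key=entry.get)  # gender with the larger count
    return gender, entry[gender] / total


gender, confidence = guess_gender("Jamie")
print(gender, round(confidence * 100))  # male 57
```

Adding birth year would only mean keying the table on (name, year) instead of name alone.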

Ultimately it's quite difficult, if not impossible, to heuristically guess gender with any confidence based on a name without a great deal of empirical data behind the selection.

This is an interesting question.

Think about male/female names. I might be wrong, but more female names end in a vowel (not always). Male names end in a consonant more often than in a vowel. Think Alex (male)/Alexandra (female); Cassandra, Krista, Alicia, Alissa, etc. are all female. Of course there are hundreds of female names that end in consonants, but they seem to be the exception rather than the rule.

You could incorporate such a heuristic to increase the confidence level.
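Something like this Python 3 sketch is what I have in mind. The names `ends_in_vowel` and `heuristic_guess` are just made up for illustration, and the rule itself (vowel ending means female, consonant ending means male) is only my guess, not a tested method.

```python
VOWELS = set("aeiou")


def ends_in_vowel(name):
    """True if the last letter of the name is a, e, i, o or u."""
    name = name.strip().lower()
    return bool(name) and name[-1] in VOWELS


def heuristic_guess(name):
    # Assumed rule: vowel ending -> female, anything else -> male.
    return "female" if ends_in_vowel(name) else "male"


for name in ["Alex", "Alexandra", "Cassandra", "Krista"]:
    print(name, heuristic_guess(name))
```

It obviously gets any male name that ends in a vowel wrong, so at best it would be a small adjustment on top of a lookup, not a replacement for one.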

I might be wrong, but more female names end in a vowel (not always). Male names end in a consonant more often than in a vowel.

Not to such a statistically significant extent that using it as a heuristic would do more than muddy the waters. You also need to take frequency into account. If a very common male name ends in a vowel, it greatly skews the results: Joe, George, Mike, Luke, Charlie, Danny, Kyle, Dale, Lee, Lawrence, Joshua, Jamie, Dane, Shane, Blake, Leo, Jay...hmm. ;)

Think Alex (male)/Alexandra (female)

Yet Alex is a shortening of both Alexander and Alexandra. Males and females go by Alex, so you're not much better off unless they give you the full name. And it gets worse. Alex can be a full name in and of itself for both genders.

Ahhh very true. Perhaps we can exclude e's, y's and o's......

I guess there's no way to code something like this. I suppose the way we humans know names is from how we've heard them used in the past, hence a database with frequencies is the only way to go.
