I am a final-year computer science student. My project supervisor gave me a data mining project called "GROUP SELECTION BASED ON TEMPORAL DATA". For example, the project is to find the 4 (or any given number of) players out of 11 whose performance will be the best in a given time period. I searched Google for research papers on the topic but didn't find any. Can anyone give me advice on how to do this project? What languages do I need? Or can you link me to where I can find research papers on it? It would be a great help.
You're in your final year of CS and you're still wondering what languages you should use? I'm sure you have enough experience to figure that out for yourself.
It sounds like this has more to do with statistics than CS, tbh.
In any case, why not start by using your school library's resources? You should be able to access a lot of papers from there that aren't even publicly available on the internet.
Also, maybe pick up a book on statistics, possibly one specialising in sports statistics?
Thank you for your suggestions. I am now looking at sports statistics books. Sorry for asking which language I should use. FYI, I have to build the project using data mining, and to be honest I haven't worked in the field of data mining before, so I'm kind of freaking out. What I am asking here is: which steps should I follow? I am thinking of these steps: 1) Put the statistics in a database. 2) Analyze them using a data mining algorithm. There are also other factors that can affect a player's performance, like weather. So if you know about data mining, let me know whether there is a data mining algorithm for analyzing sports statistics data, or suggest the steps I could follow.
Data mining is one method you could use if you wanted. Though, if you just need a "strength" function for a team given time, location, etc., then I would think traditional statistics would also work well.
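To make the "strength" function idea concrete, here is a minimal Python sketch using nothing but a conditional average. The record layout, the sample values, and the name `strength` are all made up for illustration; a real project would use whatever fields your data actually has.

```python
from datetime import date
from statistics import mean

# Hypothetical match records: (team, match_date, location, points_scored).
RECORDS = [
    ("A", date(2023, 1, 10), "home", 3),
    ("A", date(2023, 2, 14), "away", 1),
    ("A", date(2023, 3, 5), "home", 2),
    ("A", date(2023, 6, 1), "home", 0),
]


def strength(records, team, location, start, end):
    """Mean points for `team` at `location` between `start` and `end` (inclusive)."""
    scores = [
        pts
        for t, d, loc, pts in records
        if t == team and loc == location and start <= d <= end
    ]
    # None signals "no data" instead of pretending the strength is zero.
    return mean(scores) if scores else None


q1_home = strength(RECORDS, "A", "home", date(2023, 1, 1), date(2023, 3, 31))
```

This is plain descriptive statistics, not data mining: it only averages past results under matching conditions, which may already be enough for a first version.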
If you want to use data mining, you are correct: you need to put the data in some kind of consistent database (preferably a database suited for scientific use cough HDF5 cough).
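A rough sketch of the "consistent database" step, in Python. Note that this uses SQLite from the standard library as a stand-in, not HDF5, simply so it runs without extra packages. The table name, columns, and sample rows are assumptions for illustration only.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory database holding one row per player per match."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE performance ("
        " player TEXT NOT NULL,"
        " match_date TEXT NOT NULL,"  # ISO 8601, so string order equals date order
        " weather TEXT,"
        " score REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO performance VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def average_scores(conn, start, end):
    """Average score per player for matches dated between start and end."""
    cur = conn.execute(
        "SELECT player, AVG(score) FROM performance"
        " WHERE match_date BETWEEN ? AND ?"
        " GROUP BY player ORDER BY player",
        (start, end),
    )
    return cur.fetchall()


# Hypothetical sample data.
ROWS = [
    ("ann", "2023-01-05", "rain", 10.0),
    ("ann", "2023-02-05", "sun", 20.0),
    ("bob", "2023-01-07", "sun", 12.0),
    ("bob", "2023-05-01", "sun", 40.0),
]
conn = build_db(ROWS)
q1 = average_scores(conn, "2023-01-01", "2023-03-31")
```

Keeping every observation in one uniform table (player, date, conditions, score) is what makes the later time-window queries straightforward, whichever storage format you end up choosing.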
Correct, using this route you need to create an algorithm that will find correlations within large sets of data.
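At the smallest scale, "finding a correlation" can be as simple as a Pearson coefficient between one factor and one outcome. The sketch below is only that, written in Python; the temperature and score figures are invented, and a real mining algorithm would search over many such pairs.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(vx * vy)


# Hypothetical data: temperature on match day vs. a player's score.
temperature = [12, 18, 25, 30]
score = [40, 35, 22, 15]
r = pearson(temperature, score)  # negative here: hotter days, lower scores
```

A value near +1 or -1 suggests a strong linear relationship, while a value near 0 suggests none; it says nothing about cause.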
Now the question is, how do you choose which correlations are relevant? Furthermore, how do you visualise them in a way that gets the point across?
And of course you'll need criteria to determine what is "best"...
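To show why the criterion matters, here is a small Python sketch that picks the top k players under two different definitions of "best". Both criteria, the function names, and the score histories are assumptions made up for the example, not a recommended method.

```python
from statistics import mean, pstdev


def mean_score(scores):
    """'Best' means the highest average score."""
    return mean(scores)


def consistent_score(scores):
    """'Best' means high average with low volatility (mean minus std. dev.)."""
    return mean(scores) - pstdev(scores)


def select_group(history, k, criterion):
    """Return the k players whose score history ranks highest under `criterion`."""
    ranked = sorted(history, key=lambda p: criterion(history[p]), reverse=True)
    return ranked[:k]


# Hypothetical scores per player within the chosen time period.
HISTORY = {
    "ann": [10, 10, 10],
    "bob": [0, 35, 0],
    "cat": [9, 9, 9],
    "dan": [5, 5, 5],
}
by_mean = select_group(HISTORY, 2, mean_score)
by_consistency = select_group(HISTORY, 2, consistent_score)
```

With this data the two criteria disagree: the plain average favours the erratic "bob", while the consistency criterion drops him in favour of "cat". That is exactly the kind of decision you have to make and justify before writing any mining code.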
By now you should know that you are not to just dive in and start hacking some code together, but that 90% of your time is going to be spent thinking and researching.
And that does not mean dumping vague, extremely broad, questions on some forum and hoping someone will present you with a fully polished research paper or piece of software on a silver platter a few hours later.