I have to design a database where one requirement says "For each project, for each user, for each skill, store the marks gained".

Initially, i had a table where each skill was a column(proj_id, user_id, skill1, skill2, ......., skill_N)

I felt this was a bad design because the number of skills is not known(few may be added in future) and also if there were 100 skills but a user had only 5 skills, the rest 95 columns will be NULL.

Now, i have modified the design as (proj_id, user_id, skill_id, marks). The problem with this is there will be many records. Suppose there are 1000 projects, 100 users per project and average of 5 skills per user in a project, then number of records = 1000 x 100 x 5 = 500,000

There is a similar table where such data needs to be stored monthly where the number of records may keep adding(500,000 then 1,000,000 then 1,500,000.....so on)

My team members suggested i use separate table for each project but then i felt this also bad design because the number of tables will be more and also if a table structure needs to be modified, all those tables need to be modified(because they all will have same structure)

can someone suggest how can i proceed??

Recommended Answers

All 3 Replies

Ok with a relational database you create a table to hold a specific piece of information. The link the information in the tables together with the record IDs for the information in the other tables. I would create 4 tables projects, users, skills, user_rating

Projects 
ID int 11 primary key
Project_Name varchar 55
Project_description varchar 255

Users
ID int 11 primary key
User_Name varchar 55
User_Address varchar 255

skills
ID int 11 primary key
Skill_Name varchar 55


User_Rating
ID int 11 primary key
Project_ID int 11
User_ID int 11
Skill_ID int 11
Rating int 11

Then you create a record for a skill for a user for a project. Does that make sense?

Hey rch1231,

The user_rating table which u suggested even i had. Read my question properly. I said the table(user_rating in ur case) will be very bulky as for each month, around 500,000 records will be inserted. Performance will be an issue.

I read your question and I still stand by the structure I suggested. This uses the least amount of space and ties all of the data together where it can be queried quickly. It also allows for the number of skills, projects and users to be unlimited. With your method if you add a 101st skill you have to add a new field to the table.

Your statement "Suppose there are 1000 projects, 100 users per project and average of 5 skills per user in a project, then number of records = 1000 x 100 x 5 = 500,000" implies that every user in every project would add a new record for their modified skill rating. Or would you simply be updating some of their skill ratings and adding records for others.

Building the correct indexes to support the tables is the trick since then you only access the records you need. I manage several large databases for the company I work for and the average table has over 350,000 records with the largest holding over 7 million. I frequently pull joined tables containing a million rows of data from the databases is less than a minute.

The really big tables are in Oracle.

What you would need to look at next is what data needs to be accessed on a regular basis and what for? What data can be archived because it is no longer current or the user is no longer part of the system.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.