Hi folks,

I'm building a database to store student exam results so that transcripts can automatically generated. This is a final year project for university so the database will never be used. I am a complete amateur when it comes to designing databases.


I have attached the relational schema for the database. I'd be grateful if someone who knows about proper design could tell me if its ok (from the point of view referential integrity and normalization etc).

I'd really appreciate helpful comments.

Regards,

sabatier

Just to explain a few things:
1) In the OverallMarks table, MarkType refers to whether the overall mark (i.e. 1st honours, 2nd honours) is for JF (Junior Freshman), SF(Senior Freshman) and so on.
2) In the ExamResults table, YearOfStudy refers to JF, SF, JS or SS. Year refers to the actual year the exam was sat e.g. 2007.
3) The data in the database will be coming from Excel spreadsheets, so my database will not be doing any calculations on the data. That's why I'm storing the overall mark instead of calculating it from the ExamResults table.

First off, the only two tables that are in any sort of normal form are Students and Subjects.
The others do not have a primary key. Secondly, you are repeating information from the StudentSubjects to ExamResults table. It seems to me that the StudentSubjects table is entirely redundent, further denormalizing the design. While there may be practical reasons for denormalization, (mostly reducing the need for large and expensive joins to increase query speed) i have the feeling that in an academic setting, this would be frowned upon. unless they actually wanted to *use* the database of course ;-)


So...

1) Eleminate Student subjects
2) Stick a primary key on everytable, even if you don't think you need one. For the tables that don't already have one an identity column should do.
3) I would make sure that every row in the OverallMarks table is for a year in which the student actually took a class. In sql, you could do this via a trigger or by denying insert/update access to the users and forcing them to use a stored procedure instead. you can't use checks because you're referencing another table, so it will be expensive one way or the other. perhaps it would be better done in the interface if you are not worried about bandwidth. The data is small so bandwidth shouldn't be to big a deal. If you take this route i would most surely slap an index on the StudentID column of of the StudentClasses table to speed it up.

Further on the StudentSubjects table...

You have a one-to-many relationship between StudentSubject and ExamResults. Is that a true representation of life, in other words can one student, for one subject, have more than one exam result (mid-term, finals, etc.)?

If yes (which I suspect), the ExamResults table needs one more bit of info about which exam this is. You could 1) add an ExamNo field in it; 2) add a table Exam with an ExamCode PK, and then an ExamCode FK field in ExamResults; 3) add an ExamDate field in ExamResults. Personally I would go for option 2 and list all possible exams in that school with fields like: ExamCode (PK), SubjectCode (FK), ExamDate, Building, Classroom, etc.

In any case, I agree with Aaron's suggestions above: remove StudentSubject table and replace with the modified ExamResults, make sure every table has a PK, and reinforce your constraints.

Further on the StudentSubjects table...

You have a one-to-many relationship between StudentSubject and ExamResults. Is that a true representation of life, in other words can one student, for one subject, have more than one exam result (mid-term, finals, etc.)?

If yes (which I suspect), the ExamResults table needs one more bit of info about which exam this is. You could 1) add an ExamNo field in it; 2) add a table Exam with an ExamCode PK, and then an ExamCode FK field in ExamResults; 3) add an ExamDate field in ExamResults. Personally I would go for option 2 and list all possible exams in that school with fields like: ExamCode (PK), SubjectCode (FK), ExamDate, Building, Classroom, etc.

In any case, I agree with Aaron's suggestions above: remove StudentSubject table and replace with the modified ExamResults, make sure every table has a PK, and reinforce your constraints.

Thank you both very much for your replies. I did as you suggested; I got rid of the StudentSubjects table, and put on an ExamNo primary key on the ExamResults table. LeBurt, I went with your option 1 because its a very simple application and the only information I have about an exam is whether it was done in June or in September (no specific dates). I think (hope) I'll get away with that for this project :-)

Cheers,

sabatier

Good luck!

One last thing. If there is enough information to calculate the overall mark from the database then you might want to go ahead and do it as it is part of the notion of a normalized relational database to not store information that can be calculated

Sabatier, could you please post a new pic of your diagram (that is if you updated it with all the suggestions). I'm working on something similar in my class. Thanks bud.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.