Hello everyone,

I am creating a webapp of employees. Now each employees has fields such as loginId, firstName, lastName, email, last review date, last reviewer, last review type and I can go on and on.

My question is each employee has ONLY last review. So what is the best way to do this? Include all fields in user table, or create separate review table?


You have column last review type which implies there are diversified last reviews. So they also may differ in their fields/parameters. Obviously this is kind of ISA-relationship (generalization/specialization), therefore I would put the data of last reviews in separate tables.

Addition: If your last reviews only differ slightly, gathering all fields together in general table seems possible (however, this would violate some normalization requirements).

-- tesu

I have to keep track of just last review for an employee, so when a review is done for an employee, it would replace last reviews details. So, if I create a review table and create 1 to 1 relationship with employee, would that be waste of storage space?

So, schema would be:
employee (id, firstname, lastname, email, gender)
last_review(id,type,date,reviewer, employee_id)


employee (id, firstname, lastname, email, gender,review_type,review_date,review_reviewer)

Which one is a better design?

What is the meaning of review_type in third schema ?

Now that you have also shown schema last_review which has just 5 attributes, it is obviously that the 3rd schema is more suitable, especially if there would be only an 1 to 1 relationship of employee with last_review.

I thought column review_type were to identify various sorts of last-review information. But this is obviously wrong.

-- tesu

review_type is more like review_description, mid summer/end of year.

Yes, I agree. 3rd schema is the best option. Since I dont have that much experience, I just wanted to know if I was on the write track.

It is sometimes better to make things a little bit more complex than they need to be right now if you can foresee the possibility of needing that complexity. In particular, moving toward a normal form is often worth doing to avoid having to stop everything while you rebuild a database to include a few more fields in some table or other. For most apps, most of the time, it doesn't matter to shut the database down for a few minutes or even hours, but sometimes it does, and managers seldom understand the need. I think of it as "preventive programming" also known as "out-think the requirements". You should not spend more than a little extra effort though: You want to be efficient now, and look wise in the future when the boss says "how hard to do blah blah blah?" and you get so say: "it is already set up: Just give me the data."

Additionally, What would be a best design to add schools and majors for an employee.
An employee can go to many schools, earn degree in many majors at a school. I also wanna keep track of which state a school is in.
I was thinking of following design.

employee(employee_id, .....)
college(college_id, college_name, state_name) [many to one relationship from college to state]

I am not sure how would I relate a major from employee to school.


Because of many-to-many relationship you need a linking table:

create table employee_college(employee_id, college_id, ... here you can add further data
e.g. date of graduation etc);

where pk is (employee_id, college_id), both are also foreign keys.

Hint: I myself name foreign keys of child tables and primary keys of father tables always equally.
Advantage: joins are very simple to construct without any join constraint in where-clause or ON conditions.

As for the 1 to many relationship between college and state: Foreign key state_name in table college already defines this 1-to-many. You should define this foreign key explicitly:

create table college (....state_name char(3)... foreign key (state_name) references states_table on delete RESTRICT); -- restrict is default, therefore on-delete clause can be omitted.

Don't use this "on delete CASCADE" that much (otherwise your children would be killed automatically if one of their fathers dies).

-- tesu

Thanks for helping. I appreciate it very much. That works.

One last question, I am using MySQL as DB. How should I store a phone number? BIGINT / CHAR(10)? I am not sure if searching string is any better than searching for integers.

Just think of alpha-numeric phone numbers such as 867-5309/Jenny. Therefore always char-type without any restriction. I suggest varchar(12). There is no minus when searching for alpha-numeric values.

-- tesu

There is a plus side to storing phone numbers as alpha-numeric: Partial match is feasible. (Look for all the phone numbers in a particular exchange, or with a particular set of area codes or ...)

Thanks a lot guys for helping out. One thing skipped my mind, multiple majors. An employee can have multiple majors and degrees at a university.
This is how I think should be done. Am I correct? Suggestions?

employee (employee_id, first_name, ,last_name, ,email,....)
college(college_id, mnemonic, name, state)
education(employee_id, college_id)
employee_major(major_description, employee_id, college_id)

Underline - PK
Bold - FK

That works.

I would rename `education` to `employee_college` which is just a habit of mine for naming join tables, and not a real requirement.

It would be possible to have
college_major(college_major_id, major_name, major_description, college_id)) and also
Assuming most employees go to school 'nearby', then you would likely have multiple employees with the same major from the same college, a minor savings in space, but also more normal.

You may want to think about:

  • How to deal with an employee who is registered, enrolled, done with a degree
  • What if the employee does not get a degree at all?
  • What if the employee has not declared a major?

It won't be becoming that complicated. So, I thnink I can pass on that, griswolf. Right now, for employee id, I am just using auto-increment, but employees will be deleted. So, that would make those ints unusable. How can I better create primary keys in MySQL? GUID? Does it have any performance issues?

The number of potential auto incremented integers is sufficient to handle all but ridiculously large numbers of employees, even with significant turnover (32 bits: 4,294,967,295 distinct ids: http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html). I would not worry about any deleted employees. You don't want to re-use the ids, since that could cause confusion with off-line documents on past employees.

GUID (UUID for MySql: http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_uuid) is somewhat larger (36 char versus 4 bytes) and obviously a little less efficient to generate, but is otherwise just as good, and has a hugely much larger range (340,282,366,920,938,463,463,374,607,431,768,211,456 distinct ids, but not all of them will be available on a given server (so what?)). You don't need to worry about the small extra time to generate a UUID/GUID: It only happens once per employee.

In at least one circumstance, auto incremented id is better: You can discover what the most recent one was without doing a potentially costly SELECT to discover it (beware multiple threads though): http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_last-insert-id. That is not the case for UUID unless you generate it outside the database and keep track of it there.

Thanks griswolf, thats a relief. Auto inc will suffice. One more requirement, is that an employee should be archived after 5 years, as this system is only for new hires (<5 years). I have not done dealt with any archiving with DBs before. To determine that, employee has hire date attribute.
What is the best way of handling this? I was thinking run a job or something every month or so?
Thanks once again, for helping me out. Learning so much in no time.

There are two basic choices for termination of a record:

  1. Remove the record
  2. Mark the record as no longer active

Historians/archivists will strongly prefer option 2; people concerned about possible breaches will strongly prefer option 1. I'm more of an archivist, myself, so I'd tend to have yet another record, boolean, named something like 'current' or 'active' defaulting to 1/True. The alternative (removal) is probably best done by backing up the entire database, then DELETEing records that meet the various criteria. Because of the join tables, you need to do this thoughtfully.

There are probably other criteria for termination of this stuff: No longer employed, out of age, promoted away, ... so you may want a 'status' field too, regardless of whether you mark or delete.

As you suggest, a cron job that sweeps through the table periodically is one of the right ways to do termination. For a table this size you could quite reasonably do it daily, but if monthly is enough, go for that option: Less work usually better... except when you forget to monitor and something goes wrong. Seldom done things just get less attention.