It is sometimes better to make things a little bit more complex than they need to be right now if you can foresee the possibility of needing that complexity. In particular, moving toward a normal form is often worth doing to avoid having to stop everything while you rebuild a database to include a few more fields in some table or other. For most apps, most of the time, it doesn't matter to shut the database down for a few minutes or even hours, but sometimes it does, and managers seldom understand the need. I think of it as "preventive programming" also known as "out-think the requirements". You should not spend more than a little extra effort though: You want to be efficient now, and look wise in the future when the boss says "how hard to do blah blah blah?" and you get so say: "it is already set up: Just give me the data."
griswolf
Veteran Poster
1,165 posts since Apr 2010
Reputation Points: 344
Solved Threads: 256
There is a plus side to storing phone numbers as alpha-numeric: Partial match is feasible. (Look for all the phone numbers in a particular exchange, or with a particular set of area codes or ...)
griswolf
Veteran Poster
1,165 posts since Apr 2010
Reputation Points: 344
Solved Threads: 256
That works.
I would rename `education` to `employee_college` which is just a habit of mine for naming join tables, and not a real requirement.
It would be possible to have
college_major(college_major_id, major_name, major_description, college_id)) and also
employee_major(employee_id,college_major_id)
Assuming most employees go to school 'nearby', then you would likely have multiple employees with the same major from the same college, a minor savings in space, but also more normal.
You may want to think about:How to deal with an employee who is registered, enrolled, done with a degree
What if the employee does not get a degree at all?
What if the employee has not declared a major?
griswolf
Veteran Poster
1,165 posts since Apr 2010
Reputation Points: 344
Solved Threads: 256
The number of potential auto incremented integers is sufficient to handle all but ridiculously large numbers of employees, even with significant turnover (32 bits: 4,294,967,295 distinct ids: http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html ). I would not worry about any deleted employees. You don't want to re-use the ids, since that could cause confusion with off-line documents on past employees.
GUID (UUID for MySql: http://dev.mysql.com/doc/refman/5.0/en/miscellaneous-functions.html#function_uuid ) is somewhat larger (36 char versus 4 bytes) and obviously a little less efficient to generate, but is otherwise just as good, and has a hugely much larger range (340,282,366,920,938,463,463,374,607,431,768,211,456 distinct ids, but not all of them will be available on a given server (so what?)). You don't need to worry about the small extra time to generate a UUID/GUID: It only happens once per employee.
In at least one circumstance, auto incremented id is better: You can discover what the most recent one was without doing a potentially costly SELECT to discover it (beware multiple threads though): http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_last-insert-id . That is not the case for UUID unless you generate it outside the database and keep track of it there.
griswolf
Veteran Poster
1,165 posts since Apr 2010
Reputation Points: 344
Solved Threads: 256
There are two basic choices for termination of a record:Remove the record
Mark the record as no longer active
Historians/archivists will strongly prefer option 2; people concerned about possible breaches will strongly prefer option 1. I'm more of an archivist, myself, so I'd tend to have yet another record, boolean, named something like 'current' or 'active' defaulting to 1/True. The alternative (removal) is probably best done by backing up the entire database, then DELETEing records that meet the various criteria. Because of the join tables, you need to do this thoughtfully.
There are probably other criteria for termination of this stuff: No longer employed, out of age, promoted away, ... so you may want a 'status' field too, regardless of whether you mark or delete.
As you suggest, a cron job that sweeps through the table periodically is one of the right ways to do termination. For a table this size you could quite reasonably do it daily, but if monthly is enough, go for that option: Less work usually better... except when you forget to monitor and something goes wrong. Seldom done things just get less attention.
griswolf
Veteran Poster
1,165 posts since Apr 2010
Reputation Points: 344
Solved Threads: 256