Hi everyone

Hopefully I can describe this issue properly. For ease I'm going to use a hypothetical situation. Lets say I'm working on a program that tracks the number of times 2 employees have worked together.

Here's a week example:
Mon - Tom, Bob, Al
Tue - Tom, Al, Cindy
Wed - Al, Cindy
Thur - Tom, Bob
Fri - Bob, Al

I have a table called Employees:

id (pk)

| id | name  |
| 1  | Tom   |
| 2  | Bob   |
| 3  | Al    |
| 4  | Cindy |

My question is what is the best design to create a many to many relationship on this one table? Here are my thoughts:

Create a reference table which includes a column that increments each time those employees work together:

Employee_Reference (could probably use a better name)
Emp_id_1 (fk)
Emp_id_2 (fk)

Emp_id_1 and Emp_id_2 would both reference the employees id. The problem I see with this is that the table represents the exact same data, thus being redundant:

| Emp_id_1 | Emp_id_2 | Count |
| 1        | 2        | 2     |
| 2        | 1        | 2     |

My other thought was to use the exact same reference table layout, but place some restrictions in my business logic where emp_id_1 will always be the smaller of the 2 id fields. That would reduce the redundancy.

I realize that i could have another table where i actually have the work shifts, and I could write a query to group and count records etc, but I really have no need for the specific work shift data, and by having the count in it's own table, will increase performance for me as I'll be querying this table quite often.

You should have table called tbl_work (ID_EMP1 #, ID_EMP2 #) that's if you need to restrict 2 employees work together more than 4 times. Actually I don't understand you well but I'll give solution to what I got.
If you need system tracks whom\when employees work together and remove redundancy you can use this table tbl_work(ID_EMP1 #, ID_EMP2 #, WorkDate #).
Let's discuss that to understand you well

Hey, thanks for the reply.

I have thought about setting up a table like this:

| Emp_id_1 | Emp_id_2 | date   |
| 1        | 2        | 1/1/08 |
| 1        | 3        | 1/1/08 |
| 2        | 3        | 1/1/08 |
| 1        | 3        | 1/2/08 |
| 1        | 4        | 1/2/08 |
| 3        | 4        | 1/2/08 |
| 3        | 4        | 1/3/08 |
| 1        | 2        | 1/4/08 |
| 2        | 3        | 1/5/08 |

If I wanted to know how many times Tom and Bob worked together I could run a query like this:

Select count(*)
From Employee_Work
Where (emp_id_1 = 1 And Emp_id_2 = 2)
Or (emp_id_1 = 2 And Emp_id_2 = 1)

My concerns are:
1) I really don't care about the dates they worked together, I just want to know how many times they worked together. So I really don't need the Date column and I could run the same query.

2) Wouldn't the query perform much faster If I were just returning a value stored in the database (the count of the number of times 2 people have worked together) instead of running the count function in the SQL? Maybe I'm being too concerned, and the performance difference is nil.

I really don't care about the dates they worked together, I just want to know how many times they worked together. So I really don't need the Date column and I could run the same query.

You really really really need it, tell me why because if you don't attach it and make the three columns primary key you enforced just 2 employees can work 2 or 4 times with each other? how come and watch out
Emp1# Emp2# Date
1 2 1/1/08 (√)
1 2 1/8/08 (X)
2 1 1/5/08 (√)
2 1 1/5/08 (X)

But if you do it like that Emp1#, Emp2# and Date#
Emp1# Emp2# Date#
1 2 1/1/08 (√)
1 2 1/8/08 (√)
2 1 1/5/08 (√)
2 3 1/5/08 (√)

DATABASE DESIGN PRINCIPLE DON'T STORE ANY CALCULATED VALUE, Use SP to calculate it, faster, more performance, etc....

I have a number of points here.
1) If each row in the EMPLOYEE_WORK table has a ROWID IDENTITY column (or similar) then you don't need the date column. INT columns are smaller that DATETIMES as well.

2) Using your defined EMPLOYEE_WORK table is fine for two or three employees but will very quickly become unuseable for many more.

3) Remember you have to populate the EMPLOYEE_WORK table. Every time you update any work records for the employees themselves.

4) I don't understand why you don't want to just select from the source table holding the employees work records. That would cope with as many employees as you have and remove the need to keep populating the Employee_work table.

No Ramy, just my comments on the DB design in this thread.

Sorry Ramy but you have some way to go before you become a fully skilled DBA. From reading your posts you have learned the fundamentals very well. You now need to get into a company position with an older, more experienced, DBA to mentor you to refine those skills.

I must say that I write this comment with trepidation. You have helped a large number of poeple with your posts on this site, far more that I have. I do not wish to discurage you assistance here.

commented: Thanks +6

I am very thankful to your words, really you prove for me, there still long way I should cut to become really skilled DBA\BI\Software Engineer.
Your words is more valuable than any reply I got in my entire life really I don't overdo, it really is.
And it doesn't mean I've 1000+ posts and you've 50+; I am better than you. Absolutely NOT
I'll open a thread asking how to be skilled DBA, and I am deeply in need for your help.

Thanks, pclfw