I've got 4 tables:
divisions (ID, Division)
teams (ID, Team)
scores(ID, RelationID, Score)
relatives (ID, RelationID, FieldID, ValID)

The relatives table is to 'link' divisions and teams by RelationID
The FieldID defines the table (1=divisions; 2=teams)
The ValID = the ID of the given table.
So, my fk is a combination of FieldID & ValID

The output should look something like this:

      Southwest Division | Tomcats  | 17
      Southwest Division | Bears    | 12
      Northeast Division | Tomcats  | 20
      Northeast Division | Lions    | 8
      Northeast Division | Cheetahs | 13
      Southeast Division | Cheetahs | 19
      Southeast Division | Lions    | 12
      Southeast Division | Zebras   | 6

Any given team can be in any division.

My Sql is:

    SELECT DISTINCT divisions.Division, teams.Team, AVG(scores.Score) AS Score 
    FROM (scores 
            INNER JOIN ((relatives INNER JOIN divisions ON relatives.FieldID = 1 AND relatives.ValID = divisions.ID) 
            INNER JOIN teams ON relatives.FieldID = 2 AND relatives.ValID = teams.ID) 
          ON scores.RelationID = relatives.RelationID) 
    GROUP BY Division, Team
    ORDER BY Division, Team

This sql returns no records. Even if I remove the answers, GROUP and ORDER BY I get no records.
Yes, the records are in the database. They're loaded for testing

How can I make the tables return what I need? I'd redesign the whole thing if needed.

Recommended Answers

All 24 Replies

The Teams table should contain all information unique to a team so why not include the Division as well. By adding the DivisionID to the Teams table you have created the relation without having to create a linking table. It is possible to take normalization too far. I also suggest that you avoid using a generic field name like ID in favour of an explicit name like DivisionID.

Divisions
    DivisionID
    Division    (division name)

Teams
    TeamID
    Team        (team name)
    DivisionID  (from Divisions)

Can you please provide more detail on the Scores and Relatives tables? Saying that The relatives table is to 'link' divisions and teams by RelationID doesn't really give any useful information. Your sample output has three columns. The first two are obvious (division and name) but the third is not. It's a number with no context. It doesn't appear to be a score as a score would have a value for each of the two teams. Perhaps if you said what you want the query to do. If it turns out to be complicated then perhaps you could post the exported tables as CSV files so I can import them locally and play with them.

Again, any team can be affiliated with any division. A DivisionID in the teams table is useless here. That's my problem.

I generally use RecordID for primary keys. Fk is generally something like DivisionID, but that's irrelavent here.

Scores are not the results of games, they are score based answers.
I've got a non-disclosure here to deal with so please bear with me.

If you looked at my Sql you'd see that the number is an average of the scores given by the entity(s) (team, division, project, company, etc.).
The lay out must be interchangable, (ie. Divisions per team).
The result should look like:

Southwest Division | Pneumatics    | 17
Southwest Division | Refrigeration | 12
Northeast Division | Pneumatics    | 20
Northeast Division | Underground   | 8
Northeast Division | Networking    | 13
Southeast Division | Networking    | 19
Southeast Division | Underground   | 12
Southeast Division | Efg           | 6 

I've got a dump file but it won't let me attach it here.

SET FOREIGN_KEY_CHECKS=0;
-- ----------------------------
-- Table structure for divisions
-- ----------------------------
DROP TABLE IF EXISTS `divisions`;
CREATE TABLE `divisions` (
  `RecordID` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `Division` varchar(50) DEFAULT NULL,
  `boolIsActive` tinyint(4) unsigned DEFAULT '1',
  PRIMARY KEY (`RecordID`),
  UNIQUE KEY `Division` (`Division`)
) ENGINE=MyISAM AUTO_INCREMENT=6 DEFAULT CHARSET=latin1;

-- ----------------------------
-- Table structure for relations
-- ----------------------------
DROP TABLE IF EXISTS `relations`;
CREATE TABLE `relations` (
  `RecordID` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `RelationID` int(11) unsigned DEFAULT NULL,
  `FieldID` int(11) unsigned DEFAULT NULL,
  `ValID` int(11) unsigned DEFAULT NULL,
  PRIMARY KEY (`RecordID`)
) ENGINE=MyISAM AUTO_INCREMENT=17 DEFAULT CHARSET=utf8;

-- ----------------------------
-- Table structure for scores
-- ----------------------------
DROP TABLE IF EXISTS `scores`;
CREATE TABLE `scores` (
  `RecordID` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `RelationID` int(11) DEFAULT NULL,
  `Score` int(11) DEFAULT NULL,
  PRIMARY KEY (`RecordID`)
) ENGINE=MyISAM AUTO_INCREMENT=13 DEFAULT CHARSET=utf8;

-- ----------------------------
-- Table structure for teams
-- ----------------------------
DROP TABLE IF EXISTS `teams`;
CREATE TABLE `teams` (
  `RecordID` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `OrgID` int(11) DEFAULT NULL,
  `Team` varchar(50) DEFAULT NULL,
  `boolIsActive` tinyint(4) unsigned DEFAULT '1',
  PRIMARY KEY (`RecordID`),
  UNIQUE KEY `Team` (`Team`,`OrgID`)
) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;

-- ----------------------------
-- Records 
-- ----------------------------
INSERT INTO `divisions` VALUES ('1', 'Div One', '1');
INSERT INTO `divisions` VALUES ('2', 'Div Two', '1');
INSERT INTO `divisions` VALUES ('3', 'Div Three', '1');
INSERT INTO `divisions` VALUES ('4', 'Div Four', '1');
INSERT INTO `divisions` VALUES ('5', 'Div Five', '1');
INSERT INTO `relations` VALUES ('1', '1', '1', '1');
INSERT INTO `relations` VALUES ('2', '1', '1', '2');
INSERT INTO `relations` VALUES ('3', '1', '1', '2');
INSERT INTO `relations` VALUES ('4', '1', '2', '1');
INSERT INTO `relations` VALUES ('5', '1', '2', '2');
INSERT INTO `relations` VALUES ('6', '2', '1', '2');
INSERT INTO `relations` VALUES ('7', '2', '1', '3');
INSERT INTO `relations` VALUES ('8', '2', '1', '4');
INSERT INTO `relations` VALUES ('9', '2', '2', '2');
INSERT INTO `relations` VALUES ('10', '2', '2', '3');
INSERT INTO `relations` VALUES ('11', '3', '1', '1');
INSERT INTO `relations` VALUES ('12', '3', '1', '3');
INSERT INTO `relations` VALUES ('13', '3', '1', '4');
INSERT INTO `relations` VALUES ('14', '3', '2', '1');
INSERT INTO `relations` VALUES ('15', '3', '2', '3');
INSERT INTO `relations` VALUES ('16', '3', '2', '5');
INSERT INTO `scores` VALUES ('1', '1', '6');
INSERT INTO `scores` VALUES ('2', '1', '5');
INSERT INTO `scores` VALUES ('3', '1', '7');
INSERT INTO `scores` VALUES ('4', '1', '6');
INSERT INTO `scores` VALUES ('5', '2', '4');
INSERT INTO `scores` VALUES ('6', '2', '5');
INSERT INTO `scores` VALUES ('7', '2', '7');
INSERT INTO `scores` VALUES ('8', '2', '5');
INSERT INTO `scores` VALUES ('9', '3', '6');
INSERT INTO `scores` VALUES ('10', '3', '5');
INSERT INTO `scores` VALUES ('11', '3', '7');
INSERT INTO `scores` VALUES ('12', '3', '7');
INSERT INTO `teams` VALUES ('1', null, 'My Team One', '1');
INSERT INTO `teams` VALUES ('2', null, 'My Team Two', '1');
INSERT INTO `teams` VALUES ('3', null, 'My Team Three', '1');
INSERT INTO `teams` VALUES ('4', null, 'My Team Four', '1');
INSERT INTO `teams` VALUES ('5', null, 'My Team Five', '1');
INSERT INTO `teams` VALUES ('6', null, 'My Team Six', '1');

any team can be affiliated with any division

is not the same thing as any team may be affiliated with more than one division.

I generally use RecordID for primary keys.

Your OP specified ID, not RecordID. Your initial query also uses ID as a field name. Your query uses the table name, relatives but your actual table name is relations.

If you looked at my Sql you'd see that the number is an average of the scores given by the entity(s) (team, division, project, company, etc.).

I should not have to infer that by looking at the SQL. You should explain it. This is the first time you have mentioned project or company. I think the first thing you have to do is decide on consistent names for everything. Can you please explain what you are trying to do because so far none of this is making sense?

If you can't understand the problem, I'm sure you can't help me.
I'm really not looking for an argument. Thanks:)

I wasn't looking for an argument. I was looking for more information so I could offer help. It is possible I still won't be able to help but unless you take the time to explain the problem clearly we'll never know. And if I can't help perhaps someone else can.

Sorry, I thought the dump file would suffice for testing. That's what we're using. If I can make this work I can make the whole app work.
I'm under a non-disclosure and don't want to get sued.
If you want to change the PK and/or add FKs to the tables in the dump, feel free.

I neglected to point out that more than 1 division and/or more than 1 team can be affiliated with 1 score.
The result should show DISTINCT records though. Each division should be unique and each team under a given division should be unique and the 'score' should be the average of all answers for the given divsion/team.

Yes, there are more potential tables but for purposes of testing, the ones in the dump should suffice. If I can make that work, I think I'm home free.

I understand that you can share only so much because of non-disclosure but unfortunately what you are able to share is not enough for me to figure out what you are trying to do. Sorry I can't offer any suggestions.

If you load the test tables from the dump you can use that to try to produce what I need. Conceptually, it's not that complex. In sql it's a nightmare for me so far.

I need to be able to retrieve a table structure as shown above from the records.
Any team can be 'affiliated' with more than one division in a given relation and visa-versa.
More than one division and/or team can be 'affiliated' with 1 relation.
A relation can be used for more than one score.

I don't really care what the specific tables are named or what PK -> FK combinations might be used. I'd completely redesign the tables if I need to.

I need to be able to produce the results from the table structure.

Member Avatar for iamthwee
INSERT INTO `divisions` VALUES ('1', 'Div One', '1');
INSERT INTO `divisions` VALUES ('2', 'Div Two', '1');
INSERT INTO `divisions` VALUES ('3', 'Div Three', '1');
INSERT INTO `divisions` VALUES ('4', 'Div Four', '1');
INSERT INTO `divisions` VALUES ('5', 'Div Five', '1');
INSERT INTO `relations` VALUES ('1', '1', '1', '1');
INSERT INTO `relations` VALUES ('2', '1', '1', '2');
INSERT INTO `relations` VALUES ('3', '1', '1', '2');
INSERT INTO `relations` VALUES ('4', '1', '2', '1');
INSERT INTO `relations` VALUES ('5', '1', '2', '2');
INSERT INTO `relations` VALUES ('6', '2', '1', '2');
INSERT INTO `relations` VALUES ('7', '2', '1', '3');
INSERT INTO `relations` VALUES ('8', '2', '1', '4');
INSERT INTO `relations` VALUES ('9', '2', '2', '2');
INSERT INTO `relations` VALUES ('10', '2', '2', '3');
INSERT INTO `relations` VALUES ('11', '3', '1', '1');
INSERT INTO `relations` VALUES ('12', '3', '1', '3');
INSERT INTO `relations` VALUES ('13', '3', '1', '4');
INSERT INTO `relations` VALUES ('14', '3', '2', '1');
INSERT INTO `relations` VALUES ('15', '3', '2', '3');
INSERT INTO `relations` VALUES ('16', '3', '2', '5');
INSERT INTO `scores` VALUES ('1', '1', '6');
INSERT INTO `scores` VALUES ('2', '1', '5');
INSERT INTO `scores` VALUES ('3', '1', '7');
INSERT INTO `scores` VALUES ('4', '1', '6');
INSERT INTO `scores` VALUES ('5', '2', '4');
INSERT INTO `scores` VALUES ('6', '2', '5');
INSERT INTO `scores` VALUES ('7', '2', '7');
INSERT INTO `scores` VALUES ('8', '2', '5');
INSERT INTO `scores` VALUES ('9', '3', '6');
INSERT INTO `scores` VALUES ('10', '3', '5');
INSERT INTO `scores` VALUES ('11', '3', '7');
INSERT INTO `scores` VALUES ('12', '3', '7');
INSERT INTO `teams` VALUES ('1', null, 'My Team One', '1');
INSERT INTO `teams` VALUES ('2', null, 'My Team Two', '1');
INSERT INTO `teams` VALUES ('3', null, 'My Team Three', '1');
INSERT INTO `teams` VALUES ('4', null, 'My Team Four', '1');
INSERT INTO `teams` VALUES ('5', null, 'My Team Five', '1');
INSERT INTO `teams` VALUES ('6', null, 'My Team Six', '1');

So using this dataset show us what your expected results should be?

The teams table amounts to a table of values. The divisions table amounts to a table of values.
There is no implicit connection between teams and divisions.
A person enters the score and attributes the score to a combination of divisions and/or teams.

Multiple divisions may be 'attributed to' a given score. Multiple teams may also be 'attributed to' a given score.
So a given score could be 'attributed to' 2 divisions and 3 teams (ie a 'relationship').
Further, more than 1 score may be attributed to the same 2 divisions and 3 teams (ie the same relationship).

I need to be able to produce reports that show for each division, for each team what the average score was.
I need to be able to show it as teams per division OR divisions per team.

Member Avatar for iamthwee

Yeah I know but given that exact dataset above can you post the exact results you want, that would help.

The teams table amounts to a table of values. The divisions table amounts to a table of values.

That statement is meaningless. Every table in every database is a table of values. One of the problems is that I have no idea what the values mean. For example:

relatives (ID, RelationID, FieldID, ValID)    
sample record is ('4', '1', '2', '1')

You have said that The FieldID defines the table (1=divisions; 2=teams) and valID is the primary key of the table defined by FieldID but I still have no idea what information is represented by this record.

scores(ID, RelationID, Score)
sample record is ('5', '2', '4')

Same here. I have no idea what the above record indcates.

Normalization can be taken to excess. Every increase in normalization results in an increase in the complexity (number of joins) required to do even simple queries. See here for an example. In the example of a generic social network database, a query to retrieve the info for one user looks like

select * from Users u
inner join UserPhoneNumbers upn
on u.user_id = upn.user_id
inner join UserScreenNames usn
on u.user_id = usn.user_id
inner join UserAffiliations ua
on u.user_id = ua.user_id
inner join Affiliations a
on a.affiliation_id = ua.affiliation_id
inner join UserWorkHistory uwh
on u.user_id = uwh.user_id
inner join Affiliations wa
on uwh.affiliation_id = wa.affiliation_id

That's pretty butt ugly. Imagine what a complex query might look like. I'm in favour of normalization but not to the point where it makes the queries unmaintainable, or even possibly unwritable. I've had developers (who inherited code) come to me with queries that they want simplified. Unfortunately, in several cases, nobody knew what the query was supposed to do and looking at the query didn't help.

The client fills the divisions and teams tables with whatever values they want. We have no idea what they will be. The tables represent entities.

The PK in the relations table & scores table is not used.
I just always use a PK for all tables. Maybe I should remove them?

The values in the relations table could be saved into the scores table but a given relationship is often used for more than one score.
A 'relationship' is just a combination of specific entities (ie teams, divisions).
So a given 'relation' can have more than one record (ie 2 divisions & 3 teams).

So, relations('4', '1', '2', '1') represents this:
4: PK, not used for anything
1: The first relation that has been saved.
2: The table being represented here is the teams table
1: The team is the first team saved. (ie. teams.RecordID)

answers('5', '2', '4') represents this:
5: the PK, not being used for anything
2: the second relation saved
4: the score given by the person who answered the question was a 4

I'm trying real hard to keep this from turning into a nasty join. I'm willing to do something like create temporary tables or whatever I need. The data has to be live though (overnight saves won't help).

The client fills the divisions and teams tables with whatever values they want. We have no idea what they will be

It's obvious what information goes into the divisions and teams tables. These tables map division and team names to IDs. These two tables are not the problem.

The tables represent entities.

Again, this is another meaningless statement. That definition can apply to any table in any database. If I ask you what 5 is you can say that it is a value. If I ask for more information you can say that it is a numeric value. That doesn't clarify anything. What is required is that you say what the number represents (please don't say it represents the number 5). Saying it represents a relation is also not adding any information. You have to say what that relation is. Let me give an example. If you provide me with "25:100" I have no idea what that means. If you then say it is a ratio I still have no idea what it means. If you continue with "it's a ratio of the number on the left to the number on the right" I am still in the dark. If, however, you say that it represents the proportion (percentage) of American who believe in evolution then I finally have a useful number. Disturbing, but useful.

So, relations('4', '1', '2', '1') represents this:
4: PK, not used for anything
1: The first relation that has been saved.
2: The table being represented here is the teams table
1: The team is the first team saved. (ie. teams.RecordID)

But what does that mean? You have a record that identifies the teams table and a specific team within that table. You already have that information in the teams table.

answers('5', '2', '4') represents this:
5: the PK, not being used for anything
2: the second relation saved
4: the score given by the person who answered the question was a 4

Now you have introduced the concept of person out of the blue. This table, therefore, has nothing to do with the other tables you mentioned. And even if it did I am still in the dark as to what it means.

I think part of the problem is that divisions and teams don't mean any more to me than to you. Conceptually you'd think that a division represents a geographical area and a team is a group of people all working together.
I can't say that. They mean whatever they mean to our client. They're just 'things' from my perspective. The sad truth is, there are currently 15 'entity' tables representing 'things' that could be involved with a piece of work and the client can create as many more as they like. They are just 'things' all involved with a piece of work. I can't describe them any more specifically than that.

The person is NOT in any table here, they are simply people who are asked to answer questions on a piece of 'work'. The 'work' involves all the 'entities' that the client has selected for that work. The person answers the question with a score.
Quote from above:
"A person enters the score and attributes the score to a combination of divisions and/or teams."

Originally I thought I'd be able to describe just my specific problem (which I understand very clearly) and seek a solution. I never expected to have to produce an entire conceptual specification of the application.

It might help if you could tell me what the original query is supposed to do in English. Note that providing sample output is not the same thing as describing what you want.

I haven't uploaded your query dump yet, but one thing that causes an issue is that you are inner joining first on (FieldId = 1) and later on on (FieldId = 2). That second condition is no longer valid, as the first inner join only selects (FieldId = 1). Hence no results.

I think if you split the relatives table into two, one for divisions and one for teams, it'll be easier to get what you need. Of course the scores table needs two FK's then, instead of the single ValId.

pritaeas thanks. That's the closest thing to a solution I've seen so far.
I never thought of a 'relations' table for each 'entity'. I should be able to maintain the RelationID across the tables which should permit multiples of each per relation. That way I don't need an FK in scores for each entity. I'll try it both ways (with FK & with RelationID only in scores) and see what happens.

The queries might be cumbersome but maybe I can dump partial results into a temp table to speed things up.

I've gotta do corporate taxes now but I'm looking forward to giving this a try.
I'll be sure to post what I end up with.

The only way this can work with the table you have, is to use your relatives table as the start for your query and then use left joins, so the different fieldID's will not be removed.

SELECT *
FROM Relatives
    LEFT JOIN Teams ON Teams.Id = Relatives.ValId AND Relatives.FieldId = 2
    LEFT JOIN Divisions ON Divisions.Id = Relatives.Id AND Relatives.FieldId = 1

Hmm, LEFT JOIN is what I really need anyway. For example, I need to show AVG(score) per division, per team, even if the DivisionID is null, as if to say 'no division'.

After seeing that the INNER JOIN didn't work I was afraid to try this because I feared it would not pull the records accurately.

I'll try that and see what happens.
Is there any documentation I can read that will clearly explain how MySql actually reads multiple joins? I've read the MySql docs but they concentrate more on what you can and cannot do rather than how the database reads the records.

Not sure about any good docs, but the EXPLAIN functionality can help.

EXPLAIN helps. thanks.
As it turns out, I can't set ID fields in the scores table for each entity because I have to be able to accomodate multiple selections of any entity.

I also can't use LEFT JOIN because that will pull up every relation in the relations table.

It looks like the only feasable way to do this is to create a seperate relations table for each entity: ie rel_divisions_rl with a RelationID and an EntityID (ie RelationID, DivisionID) in each table.

This gives me exactly what I need. I tested it and compared it to the actual scores & relationships. It's right. It even handles situations where there was no division and/or no team and/or no organization.
All the averages are correct.

I added the table organizations to be able to make 3 tiers.
I've split the relations table up into multiple tables.
rel_divisions_rl(RelationID, DivisionID)
rel_teams_rl(RelationID, TeamID)
rel_organizations_rl(RelationID, OrgID)

If anybody knows how to make this faster without altering the results I sure would appreciate learning about it.

SELECT divisions.RecordID AS DivisionID, teams.RecordID AS TeamID, (organizations.RecordID) AS OrgID,  MAX(divisions.Division), MAX(teams.Team), MAX(organizations.OrgName), AVG(scores.Score) AS Score
FROM scores INNER JOIN (((relations
    LEFT JOIN (rel_divisions_rl LEFT JOIN divisions ON rel_divisions_rl.DivisionID = divisions.RecordID) ON rel_divisions_rl.RelationID = relations.RecordID)
    LEFT JOIN (rel_teams_rl LEFT JOIN teams ON rel_teams_rl.TeamID = teams.RecordID) ON rel_teams_rl.RelationID = relations.RecordID)
    LEFT JOIN (rel_organizations_rl LEFT JOIN organizations ON rel_organizations_rl.OrgID = organizations.RecordID) ON rel_organizations_rl.RelationID = relations.RecordID)
    ON scores.RelationID = relations.RecordID
GROUP BY DivisionID, TeamID, OrgID
ORDER BY DivisionID, TeamID, OrgID
Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.