Hi guys,

Well, it's World Cup time and it seems only fitting that I submit something related to that. I was intrigued by the whole "Group of Death" concept. Intuitively (for me), a group of death should be a very unlikely event - yet they are always found in the World Cup. On talking to a mathematician, I was fascinated to find that they are supposedly unsurprising and quite predictable. Unconvinced, I decided to have a stab at simulating it. Here it is:

/* 
 * File:   main.cpp
 * Author: daniel
 * Group of Death Simulation
 *
 * Created on June 15, 2010, 3:34 PM
 */

#include <stdlib.h>
#include <vector>
#include <algorithm>
#include <iostream>
#include <ctime>
#include <cstdlib>

using namespace std;

int main(int argc, char** argv) {

    // variable to record total groups of death for each run
    int groupTotal = 0;

    // holds total for number of death groups of overall simulation
    int deathGroup = 0;

    // random seed - how good is this I wonder?

    srand(unsigned(time(NULL)));

    // array to populate the vector
    
    int box[32] = {0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3};

    // vector to store and randomize values

    vector< int > group(box, box + 32);

    // three nested loops - outermost is number of simulations, inner one is groups, inner most is teams

    for(int i = 0; i < 10000; i++)
    {
        // randomize team rankings which are between 0 and 3

        random_shuffle(group.begin(), group.end());
        
        for(int j = 0; j < 8; j++)  // number of groups
        {
            groupTotal = 0;

            for(int k = 0; k < 4; k++)  // number of teams
            {
                groupTotal += group.back();
                group.pop_back();
            }
            if(groupTotal > 8)  // threshold for group of death
            {
                deathGroup++;
            }
        }
        
        group.assign(box, box + 32);
    }

    cout << "Number of groups of death: " << deathGroup << endl;

    return (EXIT_SUCCESS);
}

Now, after a fair bit of mucking around (with simulations at 100, 10000 up to 100000000 times) I get a figure around 98 - 99%!?!? Is something going drastically wrong somewhere. Surely it can't be that likely. I should add that my definition of a group of death is a total greater than 8. Thanks in advance.

Daniel

PS: Maths was never my forte, so please be gentle!

I have a feeling that most of us geeks here are going to need a better explanation of what these rankings (0-3) mean, and a bit of an English explanation of what is going on before we can look at your code simulation.

er... i dont know too much abt srand() but how about this...

you interchange say the 30th and 31st member in the array... guess what the groups are still identical because the last four still form the same group (though in different order)... there'd probably be more such logical errors i think..i dunno...

And you should probably rank them 1,2,3,4 as 0th rank team has no value according o your summation...(cumon the teams cant be non-existent in a group!!)

but er in reality... the teams are first grouped into different baskets according to their ranks and one team from each basket is chosen for each group so that "group of death" scenarios dont occur too many times... so the probability is MUCH MUCH lesser!!

Hmm...interesting. I will look into the selection process more closely and respond to this Crak. My point is that it is seemingly almost impossible to use different baskets to get that very effect. More thought required on my part, I think. Thanks for the interest in this thread.

Daniel

This has tow issues : (a) the code (b) your expectations of the output.

First the code:

Since you are re-shuffling the code each time, you don't need to re-assign the groups,
just re-shuffle.
It is much quicker to do it this way:

for(int i = 0; i < 10000; i++)
    {
      // randomize team rankings which are between 0 and 3
      random_shuffle(group.begin(), group.end());
      int index(0);
      for(int j = 0; j < 8; j++)  // number of groups
        {
	  groupTotal = 0;
	  for(int k = 0; k < 4;index++, k++)  // number of teams
            {
	      groupTotal+=group[index];
            }
	  if(groupTotal > 8)  // threshold for group of death
            {
	      deathGroup++;
            }
        }
    }

(b) Now there is a problem, you are adding an extra group of deathGroup for EACH group of death that is found, ie. you test 10000 * 8 groups and find about 10000 groups of death. THAT does not test the hypothesis. You want to test if there are any groups of death within a particular set of eight. It doesn't matter if there are two/three etc in a particular world cup grouping.

So we add a break; after the deathGroup++ line. Then the number is
near the correct answer of about 75%. i.e. there is a group of death in 3 out of 4 world cups.

Edited 6 Years Ago by StuXYZ: n/a

Just looked up how the world cup seeding works: It is actually that the top 7 nations + the hosts are seeded to avoid each other. So you have a situation were the sides with difficulty 3 are places into there own groups and then the others are seeded into the groups.

In that case the probability dependent on the host strength:
Host ==3 : 21%.
Host ==2 : 37%
Host ==1 : 44%
host ==0 : 46%

Interesting problem !

So given that SA are not exactly good, e.g. maybe 1, then we had a 44% chance of a group of death.

In the case that the host is a

StuXYZ, thanks for weighing on the issue! So what you are saying is that the probability of a Group of Death is dependent on the strength of the host nation?

Group of death is directly dependent on the host nation, it would not be if the host
nation didn't get put as the first team in the A group.

If the host nation is a cat 3 team, no cat 3 team can play each other, but if the
host nation is say a 1, then two things happen, a single 3 is going to either met the
host nation or one of the other 3s.

In re-thinking about the problem, I think that the deathGroup conditional is wrong.

I define a deathGroup in which more than 2 teams of quality >=2 exist in the group
e.g. 2220 is a death group but 3311 is not, since both 3's will advance but in the
first group a top 16 team will fail, (without playing below par).

So how to code that: simple add up the number of the higher teams:

for(int i = 0; i < 1000000; i++)
    {
      // randomize team rankings which are between 0 and 3
      random_shuffle(group.begin(), group.begin()+24);
      int index(0);
      for(int j = 0; j < 8; j++)  // number of groups
        {
	  int groupTotal = (group[24+j]>1);
	  for(int k = 0; k < 3;index++, k++)  // number of teams
	    groupTotal+=(group[index]>1);
	  
	  if(groupTotal > 2)  // threshold for group of death
            {
	      deathGroup++;
	      break;
            }
        }
    }

Note that care has to be taken with the inital setting of box to have the correct
value for the host.

In this case, a group of death is a near certainty, e.g. if the host is 3, then
it is still a 99% chance, and it gets slightly more favourable if the host is less.

Further inquiry asked the question what if a group of death is that 2 teams with a rating over 2 have to go home. Then we have 27.8% [host is a 0/1] and 21.5% [host is 2/3].

Very interesting simulations can be contemplated if the concept is to apply probabilities to teams and so on... But that is getting a bit off topic.

Edited 6 Years Ago by StuXYZ: n/a

Just to comment on the question in your comment comment

// random seed - how good is this I wonder?

When I did Monte Carlo we preferred to use a definite seed for 2 reasons:

1. You can reproduce your results.

2. If you want to improve the accuracy of your result you know which random numbers to avoid (because you have already calculated the result).

We could work out the probability analytically if we wanted to, could we not?

I have to say that the mathematics is now a bit beyond me. However, I appreciate all of the input and have realized that my algorithm is, of course, wrong - I am working out how many groups of death are generated, rather than whether or not a group of death is present in any particular world cup (set of eight groups). That makes sense to me. StuXYZ, you hinted at something:

Very interesting simulations can be contemplated if the concept is to apply probabilities to teams and so on... But that is getting a bit off topic.

What's an example of the sort of thing you had in mind?

This question has already been answered. Start a new discussion instead.