"Write a program to predict, on average, how many tracks will be played before any of the previously played tracks is repeated" sorry thats the actual wording of the Q.

i understand the 1/1000 chance then theres another 1/1000 chance of getting the song again next time, meaning 1/1000 X 1/1000 chance of getting the same song twice in a row using the shuffle.

not sure about the " how many tracks" part though!!

**i understand the 1/1000 chance then theres another 1/1000 chance of getting the song again next time, meaning 1/1000 X 1/1000 chance of getting the same song twice in a row using the shuffle.**

That's an analysis of not getting a SPECIFIC song twice in a row in two sequential playings. It's like saying "What are the odds of getting song number 17 twice in row straight away when I press play?". You don't care about a SPECIFIC song, and you don't care about WHEN they are repeated.

Probability is commonly counter-intuitive. Here's a take on the problem (and probability IS tricky - I stand by to be corrected...):

Given a first song X, what is the probability that it is not the same as any previous song? 1

Given that first song, what is the probability that the next song will not be the same? 999/1000.

What is the probability that the next song will not be either of the first two? 998/1000.

What is the probability that the next song will not be any of the first three? 997/1000.

So the probability that none of the first four are the same is:

1 * 999/1000 * 998/1000 * 997/1000 = 0.994

The question becomes how far do you have to go before the product of these probabilities is equal to or less than 0.5?

Just as an interesting value, the chance of playing 1000 songs and not getting ANY repeats is about 4 x 10^(-433), which is pretty small :) And of course, if you go for 1001 songs with no repeats, the final value to be multiplied by is 0/1000, so the probability of playing 1001 songs with no repeats is exactly zero, which is what we intuitively expect.

An alternative way of doing this would be to actually run the situation a million billion times. Get a decent random number generator, and start churning out value between 1 and 1000. Each time you get a number that you already had, make a note of how many numbers you had to generate and start again. Once you've done it a million billion times, you'll have a pretty accurate average answer.

As an aside, I understand that in the early days Apple's random play algorithm on its music players were truly random. People complained about them not being random, and Apple changed the algorithm to a more deterministic one that people felt was more random.

is there no simple (probability never is!!) calculation out there?

Yes and no. The equation you're looking for is this, solved for y: http://nuclear.warhead.org.uk/rosy/CodeCogsEqn.gif

This is a problem really well suited to computation, which is why it was chosen, I expect.

thinking of using the random number, and giving each song a number, and then recording how many songs it takes until repeat, since that'll take into account the 'average'.

You certainly could do it that way. Be sure to repeat a million billion times to get a good value :) As histrungalot says, make sure it's a uniform distribution (i.e. equal chance of every value coming up each time, not clustered around some central point).

The big symbol on the left indicates a product. You can read it as:

The product, from x=1000 to x=y, of (x/1000), equals 0.5

You can expand it as:

1000/1000 * 999/1000 * 998/1000 * ..... * y/1000 = 0.5

Note that with an integer value of y you're unlikely to get exactly 0.5, so you have to find the value of y for which the value comes out closest and interpret that yourself.

The code to solve this is quite simple. Something along the lines of:

```
double product = 1;
double x = 1000;
while (product > 0.5)
{
product = product * ( x / 1000);
x--;
}
number_of_songs_until_probability_is_50_percent = 1000 - x;
```

I've not checked this, so be careful that it stops at the right place (isn't off by one or some such) and so forth, but it's this sort of thing.

It might be quite neat to use both methods; you can use the multiple runs method to check the calculated answer, and produce neat little graphs and so forth showing how many runs it takes to get better and better answers.

well got sort of a briefing into this, and the answer is less than 50 songs! i need 2 prove this just!

was thinking of picking say song 9, then have a random number generator in loop and when song 9 comes out again, stop the loop and do another loop til the same happens, does that sound about right?

was thinking of picking say song 9, then have a random number generator in loop and when song 9 comes out again, stop the loop and do another loop til the same happens, does that sound about right?

No, that sounds wrong. That will be testing the chance of song 9 coming up twice, given that the first song is song 9. That's not the question. You're supposed to be checking the probability of ANY song coming up twice, and you don't care where it comes out first in the sequence (could be the first song out, could be the 10th song out, could be the 50th song out, you don't care, you'll just stop as soon as the most recent song matches ANY previous song).

If you used your suggestion, and the songs that came up were song 17, 17, 17, 17, 17, 17 (OMG that's the same song six times in a row!), 9, you would say that it took 7 loops to get the same song twice. Clearly not true.

I thought I explained this all pretty clearly. You can calculate the exact value using the equation I gave you, which is perfectly suited to computation. It will require a single loop construct.

To do it experimentally, STOP thinking about code first. STOP. Imagine you had a big bag of tiles, and on each tile is a number, from 1 to 1000. You are to pick one out, and read the number, and then put it back in, and then pick one out, and read the number, and then put it back in, over and over. You stop when you have seen the same number twice. On average, how many times will you pull a tile out of the bag?

THAT is what you are modelling. THIS is the actual difficulty in programming - understanding the problem and being able to think about it in a way that lends itself to computation; THEN comes translating that problem into computation using the syntax available to you.

Now that I've gone back to look at my equation, I realise that this is just the "Birthday Paradox" with a year of 1000 days, so to speak :)

Wikipedia has a good explanation and a number of equations that come out with the answer I got earlier, which is nice.

**So, the question I have is "average number of songs played before there is a repeat" is the 50% mark? **

Yes, I think so. There is a number of songs to play at which the probability of hearing a repeat is 50% (well, just over); the question becomes what is that number?

I've done calculation and experiment now, and they come out with the same value most of the time (I did use a better random number generator than the one I found in cstdlib, and even then I had to let it run a lot of times); every so often, the experiment is off by a couple. That's experimentation for you! :)

I'm late on this but here is my $0.02 anyway. Moschops is right that this is the same as the Birthday Problem.

However, this is NOT a C++ problem, or even a computer programming problem. It is a probability problem. The probability solution involves dividing the 1000 songs into 2 groups. In the first group is the song to be repeated, and the second group has all the other songs. Every time the song player selects a song it can either choose from the first group with probability 1/1000 or the second group with probability 999/1000. The specific problem seems to be to find the number of songs the player selects before it chooses from the first group twice.

This defines the binomial probability. The exact solution to this type of problem is Prob(first group selections equal to 2) = [n!/2!(n-2)!] times (1/1000) raised to power 2 times (999/1000) raised to power (n-2). Find the n that makes the left hand side equal to 1.0.

I think the prof should have assigned how to program the solution rather than find the solution.

In the first group is the song to be repeated, and the second group has all the other songs. Every time the song player selects a song it can either choose from the first group with probability 1/1000 or the second group with probability 999/1000. The specific problem seems to be to find the number of songs the player selects before it chooses from the first group twice.

This is an incorrect assessment of the problem. You have assessed the problem in which a *specific *song is repeated. This is the same as PJH's suggestion above of picked song 9 to be the repeated song. The question does not concern a *specific *song to be repeated. The question is about repeating *any *song.