I always wondered how a programmer could efficiently generate a random number from a set* of numbers once and only once.

I quickly realized the inefficient solution. Simply generate the number and if it equals one you have all ready generated then regenerate the number.

Regenerating the number is a poor method. Especially in a small set or if you are going to exhaust the elements in the set.

Today I realized a smarter solution. Instead of using random to generate a number that is in the set, use random to generate the index of a element in the set and then return that value and remove the element from the set.

Now I'm just wondering what the best way to implement this is.

It would be relatively simple to implement it in a function, but part of the procedure is modifying the list in question. Python documentation labels modifying lists within a function as a bad practice. I just don't understand why. I also can't understand how to get around this either at the moment.

* By set, I don't necessarily mean Python's built in set object. This would probably be implemented using a mutable sequence.

Recommended Answers

All 13 Replies

Actually, it would probably be better to create a new class that inherits another sequence. This might allow this new object to use all of the old methods of the inherited object and have a new method for returning and removing a element of the sequence.

Why don't you just use shuffle ?

from random import shuffle

def random_unique(sequence):
    L = list(sequence)
    shuffle(L)
    return iter(L)

for x in random_unique(range(10)):
    print(x)

Why don't you just use shuffle ?

from random import shuffle

def random_unique(sequence):
    L = list(sequence)
    shuffle(L)
    return iter(L)

for x in random_unique(range(10)):
    print(x)

There are more permutations for a sequence than there are indexes in a sequence. The total number of permutations for even a small sequence can exceed the period of most random number generators. While this probably has little impact for a person merely seeking to rearrange a sequence in place, when you want to obtain only one value each value in the sequence would not have an equally probable chance of being at the point which you sample the resulting permutation.

I don't understand the argument. If you succeed in writing your random generator differently, it will be an alternative to shuffle. Can you explain more precisely the probabilistic argument ? I think that with shuffle, each item has the same probability to be at a given position in the generated sequence.

I don't understand the argument. If you succeed in writing your random generator differently, it will be an alternative to shuffle. Can you explain more precisely the probabilistic argument ? I think that with shuffle, each item has the same probability to be at a given position in the generated sequence.

Because the number of permutations exceeds the period of the random number generator, some sequences will never be returned by the function.

Because the number of permutations exceeds the period of the random number generator, some sequences will never be returned by the function.

I still don't agree. It's very easy to write a shuffle function

from random import randrange

def my_shuffle(the_list):
    n = len(the_list)
    while n > 1:
        i = randrange(n)
        the_list[i], the_list[n-1] = the_list[n-1], the_list[i]
        n -= 1

L = range(10)
my_shuffle(L)
print(L)

Shuffle generates only one permutation. Now since each permutation of n objects is the product of at most n transpositions, you only need to generate at most n random transpositions in a set of order O(n^2) possible transpositions.

I still don't agree. It's very easy to write a shuffle function

from random import randrange

def my_shuffle(the_list):
    n = len(the_list)
    while n > 1:
        i = randrange(n)
        the_list[i], the_list[n-1] = the_list[n-1], the_list[i]
        n -= 1

L = range(10)
my_shuffle(L)
print(L)

Shuffle generates only one permutation. Now since each permutation of n objects is the product of at most n transpositions, you only need to generate at most n random transpositions in a set of order O(n^2) possible transpositions.

So what you're saying is that as the number of permutations exceeds the period of Python's random number generator, each index has an equally probable chance of being relocated to the beginning? Can you prove this?

So what you're saying is that as the number of permutations exceeds the period of Python's random number generator, each index has an equally probable chance of being relocated to the beginning? Can you prove this?

I think I understand your point now. There must be a limitation somewhere when n is large. However, note that the builtin shuffle function takes an optional argument 'random' which allows you to pass your own random generator. So I still believe that the best method is to use shuffle, but for large n, write your own random generator with a larger period. An interesting point would be to have an order of magnitude of the n for which this problem arises.

It confuses me. I've never had to understand random number generation and pseudo random number generation before. I might be wrong and you might be right. There might be nothing mathematically wrong with shuffle.

However, I do think it is a fair idea that given two options for accomplishing something, it is probably better to use the option that is conceptually simpler if you have trouble understanding the more complex one or if you have trouble understanding both.

Of course, after reviewing the list object, I think the point might be moot. If your goal is to obtain a random element from a list and then remove it, then it would be simplest to use the list.pop(index) method where index is a randomly generated number.

*Smacks forehead.*

Here's what I came up with.

#!/usr/bin/env python

import random

class pool(list):
    
    def rand_index(self):
        length = len(self)
        return random.randint(0, length - 1)

    def unique_rand(self):
        return self.pop(self.rand_index())

def main():
    print("""Prints each element in a sequence at random once and only once.""")
    my_pool = pool("AaBbCcDd")
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    print(my_pool.unique_rand())
    try:
        print(my_pool.unique_rand())
    except:
        print("The ninth attempt failed because it attempted to remove a ninth element from a sequence of eight elements that had eight previous removals.")
    
    
if __name__ == "__main__":
    main()

Actually, dictionaries have a popitem method that returns an arbitrary key and value.

I don't know how much a difference there is to using my method or a dictionary though.

Actually, dictionaries have a popitem method that returns an arbitrary key and value.

I don't know how much a difference there is to using my method or a dictionary though.

Popitem is not meant to be a random generator.

Popitem is not meant to be a random generator.

What does it do then? How is it used?

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.