The code shows a fast prime number generator using a sieve algorithm. It looks at only odd numbers. Technically 1 and 2 are prime numbers too, since they are only divisible by unity and themselves, exceptions are made for 1 and 2. Zero and negative numbers return an empty list.

Note:
The official definition for a prime number is any natural number [B]greater than 1[/B] that has the two divisors 1 and itself. So, we are leaving the 1 out!

This is not a generator object in the Python sense. It is a function that generates a list of prime numbers.

Note: see modified/corrected code at:
[url]http://www.daniweb.com/forums/post1254825.html#post1254825[/url]

For much improved speed see ...
https://www.daniweb.com/software-development/python/code/434136/fast-prime-list-functions

Edited 2 Years Ago by vegaseat

# fast prime number list generator using a sieve algorithm

def primes(n):
  """ returns a list of prime numbers from 2 to < n """
  if n < 2:  return []
  if n == 2: return [2]
  # do only odd numbers starting at 3
  s = range(3, n, 2)
  # n**0.5 may be slightly faster than math.sqrt(n)
  mroot = n ** 0.5
  half = len(s)
  i = 0
  m = 3
  while m <= mroot:
    if s[i]:
      j = (m * m - 3)//2
      s[j] = 0
      while j < half:
        s[j] = 0
        j += m
    i = i + 1
    m = 2 * i + 3
  # make exception for 2
  return [2]+[x for x in s if x]

print '-' * 50  # print 50 dashes, cosmetic
num = 100
primeList = primes(num)
print "List of prime numbers from 2 to < %d:" % num
print primeList

I get an error when trying to run this on my Ubuntu Linux 6.10 Machine, as follows:

./getprimes.py: line 3: syntax error near unexpected token `('
./getprimes.py: line 3: `def primes(n):'

I clicked 'Toggle Plain Text' before highlighting the code and pasting it into my editor (IDLE), then it runs just great.

Hi vegaseat,

since i'm just learning python, I found your code quite inspiring. However I couldn't find out how its supposed to work: apparently you generate a list of possible numbers s and mark the ones which are prime?

Beside that its not working properly:
primes(0, 1, 2) are ok, but arguments 3,4,5,6,7 give no output :-(. 9 even gives an error:

>>> primes.primes(9)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "primes.py", line 33, in primes
    s[j] = 0            # setze s[3] = 0
IndexError: list assignment index out of range
>>> primes.primes(9)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "primes.py", line 33, in primes
    s[j] = 0            # setze s[3] = 0
IndexError: list assignment index out of range

I understand, that the function is supposed to give prime up to and including n, but this is not the case for n>2:

>>> primes.primes(17)
[2, 3, 5, 7, 11, 13, 15]

it should give ..., 11, 13, 17] .
Also all multiples of 5 are counted as primes.
So how to repair? any hint?

Edited 6 Years Ago by thomec: n/a

Hi vegaseat,

since i'm just learning python, I found your code quite inspiring. However I couldn't find out how its supposed to work: apparently you generate a list of possible numbers s and mark the ones which are prime?

Beside that its not working properly:
primes(0, 1, 2) are ok, but arguments 3,4,5,6,7 give no output :-(. 9 even gives an error:

>>> primes.primes(9)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "primes.py", line 33, in primes
    s[j] = 0            # setze s[3] = 0
IndexError: list assignment index out of range
>>> primes.primes(9)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "primes.py", line 33, in primes
    s[j] = 0            # setze s[3] = 0
IndexError: list assignment index out of range

I understand, that the function is supposed to give prime up to and including n, but this is not the case for n>2:

>>> primes.primes(17)
[2, 3, 5, 7, 11, 13, 15]

it should give ..., 11, 13, 17] .
Also all multiples of 5 are counted as primes.
So how to repair? any hint?

Thanks for pointing out the flaw. I worked on it for a while and also modified it to work with Python3. Here is the new working code, quite a bit faster too ...

# a fast prime number list 'generator' using a sieve algorithm

def primes(n):
    """
    returns a list of prime numbers from 2 to n
    """
    if n < 2:  return []
    if n == 2: return [2]
    # create a list of odd numbers from 3 to n
    nums = list(range(3, n+1, 2))
    nums_len = (n // 2) - 1 + (n % 2)
    idx = 0
    idx_sqrtn = (int(n**0.5) - 3) // 2
    while idx <= idx_sqrtn:
        nums_idx = (idx << 1) + 3
        for j in range(idx*(nums_idx+3)+3, nums_len, nums_idx):
            # if not a prime replace with zero
            nums[j] = 0
        idx += 1
        while idx <= idx_sqrtn:
            if nums[idx] != 0:
                break
            idx += 1
    # remove all the zero entries
    return [2] + [x for x in nums if x != 0]


print('-' * 50)  # print 50 dashes, cosmetic
num = 100
primeList = primes(num)
print("List of prime numbers from 2 to %d:" % num)
print(primeList)

Funny, I just started playing with Python and primes was the first thing I tried. Here's what I came up with - appreciate any feedback, especially the useful kind.

What I was really wanting to learn, of course, was file io, hence the trick of storing a batch of primes in a file and retrieving them to check the next batch. It's also handy because if you have the primes up to n, you don't take all day regenerating them to get past n. But it's also a working sieve, I think.

#!/usr/bin/python2.6
import time
import  pickle
def build_primes_list(min, n, primes):
	i=min
	while i<n:
		j=i/2
		prime = 1   # flag, could have used a boolean value
 		for x in primes: 
			if x>j: break		
			if i%x ==0:
			 	prime = 0
		if prime ==1:
			primes+=[i]
		i+=1
	return primes	

infile = file('primes_list2', 'r')  #comment out these lines first time you run it
primes = pickle.load(infile)        #
min =primes[0]                      #
limit = min+1000                    #
primes = primes[1:]                 #
#min = 10  		#uncomment these lines first time you run this
#limit = 1000
#primes = [2,3,5,7]

starttime = time.clock()
p = build_primes_list(min, limit, primes)
stoptime = time.clock()
elapsed = stoptime-starttime
print "Took %.3f seconds to generate primes from %d to %d"%(elapsed, min, limit)
primes.insert(0,limit)
outfile = file('primes_list2', 'w')
pickle.dump(p, outfile)

Thanks for pointing out the flaw. I worked on it for a while and also modified it to work with Python3. Here is the new working code, quite a bit faster too ...

This new version does not seem as fast as this part of example codes from shedskin code examples (version 0.3):

def sieveOfEratostenes(n):
    """sieveOfEratostenes(n): return the list of the primes < n."""
    # Code from: <dickinsm@gmail.com>, Nov 30 2006
    # http://groups.google.com/group/comp.lang.python/msg/f1f10ced88c68c2d
    if n <= 2:
        return []
    sieve = range(3, n, 2)
    top = len(sieve)
    for si in sieve:
        if si:
            bottom = (si*si - 3) // 2
            if bottom >= top:
                break
            sieve[bottom::si] = [0] * -((bottom - top) // si)
    return [2] + [el for el in sieve if el]

The other part of the code is still faster, the sieve of Atkin, but complicated:

def sieveOfAtkin(end):
    """sieveOfAtkin(end): return a list of all the prime numbers <end
    using the Sieve of Atkin."""
    # Code by Steve Krenzel, <Sgk284@gmail.com>, improved
    # Code: http://krenzel.info/?p=83
    # Info: http://en.wikipedia.org/wiki/Sieve_of_Atkin
    assert end > 0, "end must be >0"
    lng = ((end // 2) - 1 + end % 2)
    sieve = [False] * (lng + 1)

    x_max, x2, xd = int(sqrt((end-1)/4.0)), 0, 4
    for xd in xrange(4, 8*x_max + 2, 8):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not (n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            m = n % 12
            if m == 1 or m == 5:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, x2, xd = int(sqrt((end-1) / 3.0)), 0, 3
    for xd in xrange(3, 6 * x_max + 2, 6):
        x2 += xd
        y_max = int(sqrt(end-x2))
        n, n_diff = x2 + y_max*y_max, (y_max << 1) - 1
        if not(n & 1):
            n -= n_diff
            n_diff -= 2
        for d in xrange((n_diff - 1) << 1, -1, -8):
            if n % 12 == 7:
                m = n >> 1
                sieve[m] = not sieve[m]
            n -= d

    x_max, y_min, x2, xd = int((2 + sqrt(4-8*(1-end)))/4), -1, 0, 3
    for x in xrange(1, x_max + 1):
        x2 += xd
        xd += 6
        if x2 >= end: y_min = (((int(ceil(sqrt(x2 - end))) - 1) << 1) - 2) << 1
        n, n_diff = ((x*x + x) << 1) - 1, (((x-1) << 1) - 2) << 1
        for d in xrange(n_diff, y_min, -8):
            if n % 12 == 11:
                m = n >> 1
                sieve[m] = not sieve[m]
            n += d

    primes = [2, 3]
    if end <= 3:
        return primes[:max(0,end-2)]

    for n in xrange(5 >> 1, (int(sqrt(end))+1) >> 1):
        if sieve[n]:
            primes.append((n << 1) + 1)
            aux = (n << 1) + 1
            aux *= aux
            for k in xrange(aux, end, 2 * aux):
                sieve[k >> 1] = False

    s  = int(sqrt(end)) + 1
    if s  % 2 == 0:
        s += 1
    primes.extend([i for i in xrange(s, end, 2) if sieve[i >> 1]])

    return primes
no psyco n: 1000000
Sieve of Atkin
78498 147.993390221 ms
Sieve of Eratostenes
78498 378.386079795 ms
vegaseat sieve
78498 941.241694126 ms
Ready

With psyco module:

psyco
n: 1000000
Sieve of Atkin
78498 98.5812442643 ms
Sieve of Eratostenes
78498 218.693996025 ms
vegaseat sieve
78498 409.932547293 ms
Ready

Shedskin:

D:\Tony\shedskin-examples-0.3>shedskin sieve.py
*** SHED SKIN Python-to-C++ Compiler 0.3 ***
Copyright 2005-2009 Mark Dufour; License GNU GPL version 3 (See LICENSE)

[iterative type analysis..]
**
iterations: 2 templates: 244
[generating c++ code..]

D:\Tony\shedskin-examples-0.3>make
g++ -O2 -pipe -Wno-deprecated  -I. -ID:/Tony/shedskin-0.3/shedskin/shedskin/lib D:/Tony/
shedskin-0.3/shedskin/shedskin/lib/sys.cpp D:/Tony/shedskin-0.3/shedskin/shedskin/lib/bu
iltin.cpp sieve.cpp D:/Tony/shedskin-0.3/shedskin/shedskin/lib/math.cpp D:/Tony/shedskin
-0.3/shedskin/shedskin/lib/time.cpp D:/Tony/shedskin-0.3/shedskin/shedskin/lib/re.cpp -l
gc -lpcre  -o sieve

D:\Tony\shedskin-examples-0.3>sieve
n: 1000000
Sieve of Atkin
78498 63.0 ms
Sieve of Eratostenes
78498 78.0 ms
vegaseat sieve
78498 78.0 ms

Edited 6 Years Ago by pyTony: n/a

in python 3 using comprehension sets

sorted(set(range(2,n+1))-{notAPrime for i in range(2, int(n**0.5)) for notAPrime in range(i * 2, n+1, i)})

in python 3 using comprehension sets

n=10000
sorted(set(range(2,n+1))-{notPrime for i in range(2, int(n**0.5)) for notPrime in range(i * 2, n+1, i)})
Comments
Good and efficient code, but explanations for the beginner OP would be helpful

in python 3 using comprehension sets

n=10000
sorted(set(range(2,n+1))-{notPrime for i in range(2, int(n**0.5)) for notPrime in range(i * 2, n+1, i)})

When I added reputation to that, I said "beginner OP", I didn't mean the OP, I meant jon.kiparsky. Sorry.

Here's a un-onelined translation of what's going on, to help.

#PYTHON 3 CODE
a = set(range(2,n+1))
b = set()
for i in range(2,int(n**.5)):
    for notPrime in range(i*2, n+1, i):
        b.add(notPrime)

sorted(a-b)

Unfortunately timing is not for this implementation (which does not optimize away the even numbers, for example):

n: 1000000
Sieve of Atkin
78498 784.414499608 ms
vegaseat sieve
78498 408.931302721 ms
jrbustosm set comprehension
78498 3176.0572922 ms
Ready

Interestingly vegaseat's code seem to win in this case over sieve of Atkin. Quite different from performance compared to 2.6.5 without psyco:

n: 1000000
Sieve of Atkin
78498 579.703006946 ms
Sieve of Eratostenes
78498 254.331486264 ms
vegaseat sieve
78498 551.307219214 ms
Ready

Here's a Euler's sieve implementation:
Python 2.7

n = int(raw_input("Num: "))

s = range(0, n+1)

i = 1
l = 2

while l < len(s):
	j = l + (s[l]*i)
	while (j < len(s)) and (s[l] != 0) :
		s[j] = 0
		i += 1
		j = l + (s[l]*i)
	i = 1
	l += 1

s = list(set(s))
s.sort()
s.remove(0)
s.remove(1)
print s

It runs rather well for large numbers as well.

Edited 5 Years Ago by Thisisnotanid: n/a

Here's a Euler's sieve implementation:
Python 2.7

n = int(raw_input("Num: "))

s = range(0, n+1)

i = 1
l = 2

while l < len(s):
	j = l + (s[l]*i)
	while (j < len(s)) and (s[l] != 0) :
		s[j] = 0
		i += 1
		j = l + (s[l]*i)
	i = 1
	l += 1

s = list(set(s))
s.sort()
s.remove(0)
s.remove(1)
print s

It runs rather well for large numbers as well.

On first blush I would say that this algorithm will be slow since it uses the function len() so many times. Functions calls have a lot of overhead and slow things down. For instance Python module profile shows that n=1000000 makes 4775209 calls to function len()

Edited 5 Years Ago by vegaseat: profile

On first blush I would say that this algorithm will be slow since it uses the function len() so many times. Functions calls have a lot of overhead and slow things down. For instance Python module profile shows that n=1000000 makes 4775209 calls to function len()

I'm not exactly any expert at these things; I'm relatively new to programming and actually just finished my first course in Python (Which happens to be the first language I've learned so that I can start writing code my own.) Can you explain a little more, or point me to the right direction? Also, what would be a good workaround in your opinion? Thanks.

On first blush I would say that this algorithm will be slow since it uses the function len() so many times. Functions calls have a lot of overhead and slow things down. For instance Python module profile shows that n=1000000 makes 4775209 calls to function len()

I looked at my code again and, taking a bit from what you did, came up with the following. It works significantly faster for n = 1000000.

n = int(raw_input("Num: "))

s = range(3, n+1, 2)

i = 1
l = 0

a = len(s) - 1

while l < a:
	if s[l] != 0:
		j = (l + (s[l]*i))
		while (j <= a):
			s[j] = 0
			i += 1
			j = (l + (s[l]*i))
	else:
		if l < a:
			while s[l] == 0 and l < a:
				l += 1
		if l == a:
			break
	i = 1
	l += 1

s = list(set(s))
s.sort()
s.remove(0)
s = [2] + s
print s

What do you think?

Yes, a major improvement!

A word of coding advice, using the single letter 'l' as a variable name is rather poor, since it looks so much like the number '1'. I don't even like 'i' or 'j'.

The Python profiler is your friend. To run a profile you have to put all the time critical code into one function. Check out these two codes and thou shall see ...

import profile

def primenumbers1(n):
    s = range(0, n+1)

    x = 1
    y = 2

    while y < len(s):
        z = y + (s[y]*x)
        while (z < len(s)) and (s[y] != 0) :
            s[z] = 0
            x += 1
            z = y + (s[y]*x)
        x = 1
        y += 1
    s = list(set(s))
    s.sort()
    s.remove(0)
    s.remove(1)        
    return s

#n = int(raw_input("Num: "))
n = 1000000

profile.run('primenumbers1(n)')


# test
s = primenumbers1(n)
# show the first 10 primes
print(s[:10])
# show the last 10 primes
print(s[-10:])

"""result Python26 -->
         4775217 function calls in 11.296 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  4775209    4.504    0.000    4.504    0.000 :0(len)
        1    0.025    0.025    0.025    0.025 :0(range)
        2    0.000    0.000    0.000    0.000 :0(remove)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.046    0.046    0.046    0.046 :0(sort)
        1    0.004    0.004   11.294   11.294 <string>:1(<module>)
        1    6.715    6.715   11.291   11.291 prime_profile1.py:3(primenumbers1)
        1    0.000    0.000   11.296   11.296 profile:0(primenumbers1(n))
        0    0.000             0.000          profile:0(profiler)


[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
[999863, 999883, 999907, 999917, 999931, 999953, 999959, 999961, 999979, 999983]
"""

Also notice that one prime is missing at the end.

import profile

def primenumbers2(n):
    s = range(3, n+1, 2)

    x = 1
    y = 0

    a = len(s) - 1
    while y < a:
        if s[y] != 0:
            z = (y + (s[y]*x))
            while (z <= a):
                s[z] = 0
                x += 1
                z = (y + (s[y]*x))
        else:
            if y < a:
                while s[y] == 0 and y < a:
                    y += 1
            if y == a:
                break
        x = 1
        y += 1
    s = list(set(s))
    s.sort()
    s.remove(0)
    s = [2] + s        
    return s

#n = int(raw_input("Num: "))
n = 1000000

profile.run('primenumbers2(n)')


# test
s = primenumbers2(n)
# show the first 10 primes
print(s[:10])
# show the last 10 primes
print(s[-10:])

"""result Python26 -->
         8 function calls in 0.361 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(len)
        1    0.009    0.009    0.009    0.009 :0(range)
        1    0.000    0.000    0.000    0.000 :0(remove)
        1    0.001    0.001    0.001    0.001 :0(setprofile)
        1    0.063    0.063    0.063    0.063 :0(sort)
        1    0.005    0.005    0.359    0.359 <string>:1(<module>)
        1    0.282    0.282    0.355    0.355 prime_profile2.py:3(primenumbers2)
        1    0.000    0.000    0.361    0.361 profile:0(primenumbers2(n))
        0    0.000             0.000          profile:0(profiler)


[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
[999931, 999937, 999941, 999953, 999959, 999961, 999977, 999979, 999983, 999991]
"""
Comments
Very helpful

Thanks! I didn't know about the profiler :) Now I can time my programs. Also, I tried to make my program more general by introducing the ability to select a range to check for primes, instead of checking from 3 on up. Looking at the logic, it seems the algorithm should work well, but it's not able to detect if 3 is in the range. And it just behaves weirdly. On one iteration, it even took out some primes higher than the lower bound! Although, I've fixed the problem and am using an entirely different (and more time efficient) approach now, I'm still curious as to why that didn't work as it was supposed to. Maybe a newbie error? What do you think:

'''Euler's Sieve implementation.'''

n = int(raw_input("Num: "))

lwb = int(raw_input("Lower bound: "))

if lwb < 2:
	lwb = 2

s = range(3, n+1, 2)

x = 1 
y = 0 

a = len(s) - 1

while y < a: 
	if s[y] != 0: 
		z = (y + (s[y]*x)) 
		while (z <= a): 
			s[z] = 0 
			x += 1 
			z = (y + (s[y]*x))
	else: 
		if y < a: 
			while s[y] == 0 and y < a: 
				y += 1
		if y == a:
			break
	x = 1
	y += 1

s = list(set(s))
s.sort()
s.remove(0)
s = [2] + s

for m in s:
	if m < lwb:
		s.remove(m)

print s

Edited 5 Years Ago by Thisisnotanid: Removed unneeded comments.

Line 29 I think needs to go between line 21 and 22.
can you check if that helps?
I say that because y += 1 is already handled in second if loop.
so it doesn't make sense to increment it again at line 29

Comments
Thanks for wanting to help, but you should first look the date of the post!
The article starter has earned a lot of community kudos, and such articles offer a bounty for quality replies.