Hey, guys.

Find the sum of all the primes below 2000000.

My code:

main.cpp

``````#include <iostream>
#include <vector>
#include <cmath>

#include "Sieve.h"

using namespace std;

typedef unsigned long long int ulli;

ulli Sum = 0;

ulli Sum_of_Primes(vector<bool>, ulli&);

ulli Sum_of_Primes(vector<bool> Primes, ulli& Sum)
{
for(ulli a = 2; a < Upper_Limit; a++)
{
if (Primes[a])
{
Sum += a;
}

}
return Sum;
}

int main()
{
Find_Primes_until_n(Primes);
cout << Sum_of_Primes(Primes,Sum) << endl;
return 0;
}
``````

Sieve.h

``````#ifndef SIEVE_H_INCLUDED
#define SIEVE_H_INCLUDED

using namespace std;

#define Upper_Limit 2000000

vector<bool> Primes;

void Find_Primes_until_n (vector<bool>&);

void Find_Primes_until_n(vector<bool>& Primes)
{
Primes.resize(Upper_Limit);
fill(Primes.begin(),Primes.begin() + Upper_Limit , true);
for (int i = 2; i < Upper_Limit; i++)
{
if(Primes[i])
{
for (int j = 2*i; j <Upper_Limit; j+= i)
{
Primes[j] = false;
}
}
}
}

#endif // SIEVE_H_INCLUDED
``````

It takes about 0.3s. How can I improve it?

Thank you.

Petcheco

Building the seive than getting the sum is taking a lot of extra time. You can build the sum when you're building the seive.

Since this is for a specific purpose and problem, you cheat a bit by using a bitset. This requires a constant value to initialize, but it …

## All 4 Replies

Building the seive than getting the sum is taking a lot of extra time. You can build the sum when you're building the seive.

Since this is for a specific purpose and problem, you cheat a bit by using a bitset. This requires a constant value to initialize, but it sets all bits to false, without needing an extra loop to do it. This does require reversing the logic. False is prime.

The outer loop only needs to run until it reaches the square root of the upper limit. Then to finish building the sum you only need to loop from there to the upper limit.

Since your purpose is to get a sum you don't need to completely build the seive. Start the sum at 2 then only loop through the odd numbers.

The inner loop can ignore all the multiples of I upto I squared. Also since I is odd the square will be odd so the step can step 2 times I and only set the odd multiples.

Put all this together and you can increase the speed by a factor of about 4:

``````#include <bitset>

typedef unsigned long long ull;
const size_t PRIME_LIMIT = 2000000;

ull Find_Primes_Sum()
{
bitset<PRIME_LIMIT> Primes(3);
ull retval = 2;
int limit = sqrt(PRIME_LIMIT);
int i = 3;
for (; i < limit; i += 2)
{
if (!Primes[i])
{
retval += i;
int step = i * 2;
for (int j = i * i; j <PRIME_LIMIT; j += step)
{
Primes[j] = true;
}
}
}
for (; i < PRIME_LIMIT; i += 2)
{
if (!Primes[i])
{
retval += i;
}
}
return retval;
}
``````
commented: Nice! +15

Hey, mate.

Could you clear up a couple of things for me?

I initialized my bitset variable by doing `Primes.set();`, thus making all the elements of it true. Why shouldn't I do it this way?

Why is using `fill` with a vector slower than using a bitset?

I didn't understand why we only have to loop until the sqrt(PRIME_LIMIT) the first time and why the conditions for your second loop were `(int j = i * i; j <PRIME_LIMIT; j += step`.

Thank you very much.

Petcheco

I initialized my bitset variable by doing Primes.set();, thus making all the elements of it true. Why shouldn't I do it this way?

It's not necessarily wrong, it is extra time that doesn't gain anything. The logic isn't any faster.

Why is using fill with a vector slower than using a bitset?

Why is using fill with a vector slower than using a bitset?

vector<bool> is optimized for memory usage not speed. Also `fill` is basically just a loop that iterates through every element and sets them to the same value. This takes more time.

I didn't understand why we only have to loop until the sqrt(PRIME_LIMIT) the first time and why the conditions for your second loop were (int j = i * i; j <PRIME_LIMIT; j += step.

The first loop finds the primes and the second loop sets the multiples of the primes. Any prime bigger than the square root of the target number, it's multiples will be bigger than the target number, so it's not necessary to set their multiples. Any multiple of a prime that is less than the prime squared is already set so it makes sense to start the inner loop at `i * i`. This start is also odd, since any odd number times an odd number is odd. When we step from this value we can eliminate some iterations by only setting the odd multiples, thus `(i * i) + (2 * i)` will by an odd multiple of i.

Thank you for your explanation and patience, mate.

I understood everything.