Why is processing a sorted array faster than an unsorted array?

Question

iConqueror 27 Junior Poster

10 Years Ago

Here is a piece of C++ code that seems very peculiar. For some strange reason, sorting the data miraculously makes the code almost six times faster:

#include <algorithm>
#include <cstdlib>   // std::rand
#include <ctime>
#include <iostream>

int main()
{
    // Generate data
    const unsigned arraySize = 32768;
    int data[arraySize];

    for (unsigned c = 0; c < arraySize; ++c)
        data[c] = std::rand() % 256;

    // !!! With this, the next loop runs faster
    std::sort(data, data + arraySize);

    // Test
    clock_t start = clock();
    long long sum = 0;

    for (unsigned i = 0; i < 100000; ++i)
    {
        // Primary loop
        for (unsigned c = 0; c < arraySize; ++c)
        {
            if (data[c] >= 128)
                sum += data[c];
        }
    }

    double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;

    std::cout << elapsedTime << std::endl;
    std::cout << "sum = " << sum << std::endl;
}

Without std::sort(data, data + arraySize);, the code runs in 11.54 seconds.
With the sorted data, the code runs in 1.93 seconds.

Initially, I thought this might be just a language or compiler anomaly. So I tried it in Java:

import java.util.Arrays;
import java.util.Random;

public class Main
{
    public static void main(String[] args)
    {
        // Generate data
        int arraySize = 32768;
        int data[] = new int[arraySize];

        Random rnd = new Random(0);
        for (int c = 0; c < arraySize; ++c)
            data[c] = rnd.nextInt() % 256;

        // !!! With this, the next loop runs faster
        Arrays.sort(data);

        // Test
        long start = System.nanoTime();
        long sum = 0;

        for (int i = 0; i < 100000; ++i)
        {
            // Primary loop
            for (int c = 0; c < arraySize; ++c)
            {
                if (data[c] >= 128)
                    sum += data[c];
            }
        }

        System.out.println((System.nanoTime() - start) / 1000000000.0);
        System.out.println("sum = " + sum);
    }
}

The result was similar, though less extreme.

My first thought was that sorting brings the data into the cache, but my next thought was how silly that is, because the array was just generated.

What is going on?
Why is a sorted array faster than an unsorted array?
The code is summing up some independent terms, and the order should not matter.

algorithm java


helloWorld22 commented: Good question mate! +0

All 3 Replies

JamesCherrill 4,733 Most Valuable Poster

10 Years Ago

I can't see anything in the bytecode that would explain a speed difference, and caching at the RAM level isn't going to be relevant, so maybe this is a low-level hardware thing - pipelining and branch prediction at the CPU level maybe???

If you refactor that into a runnable program that produces a simple output automatically then I'm sure a few people here will run it for you and you can consolidate the results.
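One way such a runnable comparison could look is sketched below. The helper names makeData and timedSum are made up for illustration and are not from the original post; the loop body is the one from the question, and the data uses values in [0, 255] as in the C++ version.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: fills a vector with pseudo-random values in [0, 255].
std::vector<int> makeData(std::size_t n, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<int> data(n);
    for (int& v : data)
        v = dist(gen);
    return data;
}

// Hypothetical helper: runs the loop from the question `passes` times
// and returns {elapsed seconds, sum}.
std::pair<double, long long> timedSum(const std::vector<int>& data, unsigned passes)
{
    auto start = std::chrono::steady_clock::now();
    long long sum = 0;

    for (unsigned i = 0; i < passes; ++i)
    {
        for (int v : data)
        {
            if (v >= 128)
                sum += v;
        }
    }

    std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return {elapsed.count(), sum};
}
```

To use it, call makeData once, copy the result, std::sort the copy, then call timedSum on both and print the two times next to the two sums. The sums should match, which confirms that only the order of the data differs. How large the timing gap is will depend on the compiler, the optimization flags and the CPU.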


sepp2k 378 Practically a Master Poster · Answer 1 · 2014-07-01T08:44:36+00:00

pipelining and branch prediction at the CPU level maybe???

Branch prediction seems like a pretty good candidate. Note that the condition data[c] >= 128 switches unpredictably between true and false when the array is unsorted, but when the array is sorted it is consistently false until the first element >= 128 appears, and consistently true after that. So the branch behaves much more predictably for a sorted array.
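If the branch is the culprit, taking the data-dependent branch out of the loop should make the unsorted case behave more like the sorted one. Below is a minimal sketch of that idea; the function names sumBranchy and sumBranchless are invented for this example and are not from the original post. Note that an optimizing compiler may already turn the if version into branch-free code by itself, so treat this as an experiment rather than a guaranteed speedup.

```cpp
#include <vector>

// The loop from the question: one conditional branch per element.
long long sumBranchy(const std::vector<int>& data)
{
    long long sum = 0;
    for (int v : data)
    {
        if (v >= 128)
            sum += v;
    }
    return sum;
}

// Same result without a data-dependent branch in the source:
// mask is -1 (all bits set) when v >= 128 and 0 otherwise,
// so (v & mask) is either v or 0.
long long sumBranchless(const std::vector<int>& data)
{
    long long sum = 0;
    for (int v : data)
    {
        int mask = -static_cast<int>(v >= 128);
        sum += v & mask;
    }
    return sum;
}
```

Both functions return the same sum for any input. If timing sumBranchless on sorted and unsorted data gives roughly equal results while sumBranchy does not, that supports the branch prediction explanation.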

~s.o.s~ 2,560 Failure as a human Team Colleague Featured Poster · Answer 2 · 2014-07-01T10:46:40+00:00


It's branch prediction indeed; read this.

Why is processing a sorted array faster than an unsorted array?
