I wrote an application using pthreads which had 5 threads. 4 of the threads filled up 4 buffers, and once they had been filled the main thread would combine them. This was achieved using a barrier, so the main thread would wait until the 4 worker threads had reached it.

I am looking at rewriting this using C++11 threads, but I see there isn't a barrier as such available, and I wanted to know the best/safest way to achieve the same result as the pthread-based program.

Thanks in advance.


Sorry, I didn't specify in the first post that the threads run indefinitely.

You might be able to use a std::mutex for this. You might also want to rethink your data structure: you could have all of the threads writing to a thread-safe queue while the reader processes it at the same time. This might make your execution faster.
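For illustration, a minimal sketch of such a thread-safe queue (the class and method names here are my own, not from any particular library):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Minimal thread-safe queue sketch: a mutex guards the underlying
// std::queue, and a condition variable lets consumers block until
// data is available.
template <typename T>
class concurrent_queue
{
public:
  void push(T value)
  {
    {
      std::lock_guard<std::mutex> lock(m_mutex);
      m_queue.push(std::move(value));
    }
    m_cond.notify_one();  // wake one waiting consumer
  }

  // Blocks until an element is available, then removes and returns it.
  T wait_and_pop()
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [this] { return !m_queue.empty(); });
    T value = std::move(m_queue.front());
    m_queue.pop();
    return value;
  }

private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  std::queue<T> m_queue;
};
```

With this, the four producers call push() as data arrives and the reader blocks in wait_and_pop() until there is something to process.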

I'm not sure about having a single queue, as the four threads receive their data over IP and it can all come in at different times, but the processing has to occur once all four have received enough data for it to be mixed together.

I have got it sort of working using condition variables, where the main thread waits on the four workers and each one is notified once the data has been processed. Is that approach a bad idea?
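For reference, that pattern could be sketched like this (a hypothetical minimal version; sync_state, worker_done, wait_for_all and one_round are names I made up, not code from the actual program):

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Shared state: one mutex / condition_variable pair for all five threads.
struct sync_state
{
  std::mutex mtx;
  std::condition_variable cv;
  int ready = 0;       // how many workers have filled their buffer
  int generation = 0;  // bumped once per completed mixing round
};

// Called by each worker after filling its buffer; blocks until the
// combiner has mixed this round's data.
void worker_done(sync_state& s)
{
  std::unique_lock<std::mutex> lock(s.mtx);
  int gen = s.generation;
  if (++s.ready == 4)
    s.cv.notify_all();  // last worker wakes the combiner
  s.cv.wait(lock, [&] { return gen != s.generation; });
}

// Called by the combiner: wait for all four workers, mix, release them.
void wait_for_all(sync_state& s)
{
  std::unique_lock<std::mutex> lock(s.mtx);
  s.cv.wait(lock, [&] { return s.ready == 4; });
  // ... mixing of the four buffers would happen here, under the lock ...
  s.ready = 0;
  ++s.generation;      // release the workers for the next round
  s.cv.notify_all();
}

// One complete round with four workers and the combiner on this thread.
void one_round(sync_state& s)
{
  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i)
    workers.emplace_back([&] { worker_done(s); });
  wait_for_all(s);
  for (auto& t : workers)
    t.join();
}
```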

There was a proposal to add barrier / latch classes to the C++ threading library (see n3817), but it came in too late for the C++14 standard release. I don't know its current status, but I would imagine it will make it into the C++17 standard. Obviously, though, that doesn't help you much right now.

To make things worse, reader/writer locks (aka std::shared_timed_mutex and std::shared_mutex) didn't make it in until C++14 either, and they would have made for a nice barrier implementation.

For a pure C++11 implementation, you could just use the one from Boost, which has had a barrier class for a long time (in its Boost.Thread library, which is the basis for the standard thread library). The implementation is pretty straightforward:

#include <condition_variable>
#include <mutex>
#include <stdexcept>

class barrier
{
public:
  barrier(const barrier&) = delete;
  barrier& operator=(const barrier&) = delete;

  explicit barrier(unsigned int count) :
    m_count(check_counter(count)), m_generation(0),
    m_count_reset_value(count)
  {
  }

  void count_down_and_wait()
  {
    std::unique_lock< std::mutex > lock(m_mutex);
    unsigned int gen = m_generation;

    if (--m_count == 0)
    {
      // Last thread to arrive: start a new generation, reset the
      // count for the next round and wake everyone up.
      m_generation++;
      m_count = m_count_reset_value;
      m_cond.notify_all();
      return;
    }

    // Wait until the generation changes; re-checking the condition
    // guards against spurious wake-ups.
    while (gen == m_generation)
      m_cond.wait(lock);
  }

private:
  // A barrier for zero threads makes no sense; reject it up front.
  static unsigned int check_counter(unsigned int count)
  {
    if (count == 0)
      throw std::invalid_argument("barrier count cannot be zero");
    return count;
  }

  std::mutex m_mutex;
  std::condition_variable m_cond;
  unsigned int m_count;
  unsigned int m_generation;
  unsigned int m_count_reset_value;
};

The actual Boost implementation is a bit more complicated, but the above is what matters.
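To map this back to the original 4-buffer scheme, here is a usage sketch. The barrier class is repeated in condensed form so the example stands alone, and run_one_round with its trivial buffer-filling is purely illustrative:

```cpp
#include <array>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Condensed copy of the barrier class above.
class barrier
{
public:
  explicit barrier(unsigned int count)
    : m_count(count), m_generation(0), m_count_reset_value(count) {}

  void count_down_and_wait()
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    unsigned int gen = m_generation;
    if (--m_count == 0)
    {
      m_generation++;
      m_count = m_count_reset_value;
      m_cond.notify_all();
      return;
    }
    while (gen == m_generation)
      m_cond.wait(lock);
  }

private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  unsigned int m_count;
  unsigned int m_generation;
  unsigned int m_count_reset_value;
};

// 4 workers plus the combiner all meet at the same barrier (count = 5).
void run_one_round(std::array<int, 4>& buffers, int& combined)
{
  barrier sync_point(5);
  std::vector<std::thread> workers;
  for (int i = 0; i < 4; ++i)
  {
    workers.emplace_back([&, i] {
      buffers[i] = i + 1;           // stand-in for filling the buffer
      sync_point.count_down_and_wait();
    });
  }
  sync_point.count_down_and_wait(); // main waits for the 4 workers
  combined = buffers[0] + buffers[1] + buffers[2] + buffers[3];
  for (auto& t : workers)
    t.join();
}
```

The barrier's internal mutex makes the workers' buffer writes visible to the main thread once count_down_and_wait() returns.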

But, in my experience, condition variables are pretty slow. So, if that becomes a problem for you too, you could do as I did and implement a barrier using atomics instead:

#include <atomic>
#include <stdexcept>
#include <thread>

class spinlock_barrier
{
public:
  spinlock_barrier(const spinlock_barrier&) = delete;
  spinlock_barrier& operator=(const spinlock_barrier&) = delete;

  explicit spinlock_barrier(unsigned int count) :
    m_count(check_counter(count)), m_generation(0),
    m_count_reset_value(count)
  {
  }

  void count_down_and_wait()
  {
    unsigned int gen = m_generation.load();

    if (--m_count == 0)
    {
      // Last thread to arrive: advance the generation and reset the
      // count for the next round.
      if (m_generation.compare_exchange_weak(gen, gen + 1))
      {
        m_count = m_count_reset_value;
      }
      return;
    }

    // Busy-wait until the generation changes, yielding the time-slice.
    while ((gen == m_generation) && (m_count != 0))
      std::this_thread::yield();
  }

private:
  // A barrier for zero threads makes no sense; reject it up front.
  static unsigned int check_counter(unsigned int count)
  {
    if (count == 0)
      throw std::invalid_argument("barrier count cannot be zero");
    return count;
  }

  std::atomic<unsigned int> m_count;
  std::atomic<unsigned int> m_generation;
  unsigned int m_count_reset_value;
};

This has the disadvantage that waiting threads are not put to sleep but instead keep yielding their time-slice; this is a so-called "spin-lock" or "busy-wait". But it has the advantage of being much faster than the mutex / condition-variable version.

A simple implementation of CyclicBarrier in C++ is accessible here.
Also, an implementation of CountDownLatch in C++ is accessible here.

Both implementations work only for C++11 and later. But we may easily change them to work with earlier C++ standards by directly using POSIX threads (or any underlying thread library) from the classes.
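As a sketch of the idea (my own condensed version, not the code behind the links), a C++11 CountDownLatch boils down to:

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal CountDownLatch sketch: the count only ever goes down, and
// once it hits zero all current and future waiters pass through.
class count_down_latch
{
public:
  explicit count_down_latch(unsigned int count) : m_count(count) {}

  // Decrement the count; wakes all waiters when it reaches zero.
  void count_down()
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    if (m_count > 0 && --m_count == 0)
      m_cond.notify_all();
  }

  // Block until the count has reached zero.
  void wait()
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    m_cond.wait(lock, [this] { return m_count == 0; });
  }

private:
  std::mutex m_mutex;
  std::condition_variable m_cond;
  unsigned int m_count;
};
```

Unlike a cyclic barrier, the latch is one-shot: it never resets, which is why the cyclic version above needs the generation counter.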

Here is my simplest barrier. Each thread has a sync point of two lines:

while ( !bWakeTH ) {bRequest_n= true;}
while ( bWakeTH  ) {bRequest_n=false;}

// bRequest_n is a global bool visible to thread n and its boss, written only by the worker
// bWakeTH is a global volatile atomic<bool> visible to all threads, written only by the boss

The boss has one line of code:

while ( bRequest_0 || bRequest_1 || bRequest_2 || ... ) {bWakeTH = true; } 

Once all threads have set their bRequest flags, the control flag bWakeTH becomes true, causing the threads to change their requests to false.
This solution is inspired by hardware, where a flip-flop is constructed from two latches driven by non-overlapping clocks.

I can't find a way to rectify the typo in my post.

The boss line should be:

while ( bRequest_0 && bRequest_1 && bRequest_2 && ... ) {bWakeTH = true; }
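Fleshing that scheme out into something compilable might look like this. The function names and the boss's second phase (which lowers bWakeTH again so the handshake can repeat) are my additions, and I've made the request flags std::atomic<bool> as well, since a plain bool shared between threads is a data race in C++11:

```cpp
#include <array>
#include <atomic>
#include <thread>
#include <vector>

std::atomic<bool> bWakeTH{false};              // written only by the boss
std::array<std::atomic<bool>, 4> bRequest{};   // bRequest[n] written only by worker n

// Each worker's sync point: raise its request until woken, then lower it
// until the boss drops the wake signal (the two non-overlapping phases).
void worker_sync_point(int n)
{
  while (!bWakeTH.load()) bRequest[n].store(true);
  while (bWakeTH.load())  bRequest[n].store(false);
}

// One full boss round: raise bWakeTH once every request is up, then drop
// it once every request is down again.
void boss_sync_point()
{
  auto all = [](bool v) {
    for (auto& r : bRequest)
      if (r.load() != v) return false;
    return true;
  };
  while (!all(true))  std::this_thread::yield();
  bWakeTH.store(true);
  while (!all(false)) std::this_thread::yield();
  bWakeTH.store(false);
}
```

Like the spinlock_barrier above, this busy-waits, so it trades CPU time for latency.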

Be a part of the DaniWeb community