I'm experimenting with multithreading by taking some of my existing apps and converting them over. In doing so, I noticed a strange phenomenon that I've been able to reproduce in a very simple program.

Let's say I create 2 threads. Each thread simply runs a for loop from 1 to 1 billion and returns. On my (multi-CPU) machine the program takes 2 to 3 seconds. However, if I put a single printf or cout in the thread's loop (say I check inside the for loop whether I'm at 500 million and, if so, execute the printf), then the same program takes over 25 seconds. This happens regardless of whether I put a mutex around the printf or not.
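For anyone who wants to reproduce this, here is a minimal sketch of the test as described above. The function names (count_loop, time_two_threads) are made up for illustration, and the real program may differ in details such as the counter type or how the threads are created. Timings will also depend heavily on the compiler and optimization flags, so treat it as a starting point rather than the exact code.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Hypothetical worker: counts from 1 to limit and optionally prints
// once when it reaches the midpoint (500 million for a limit of 1 billion).
long long count_loop(long long limit, bool print_at_midpoint) {
    long long iterations = 0;
    for (long long i = 1; i <= limit; ++i) {
        if (print_at_midpoint && i == limit / 2) {
            std::printf("reached %lld\n", i);
        }
        ++iterations;
    }
    return iterations;
}

// Runs two workers in parallel and returns the elapsed wall-clock seconds.
// The workers' return values are ignored by std::thread.
double time_two_threads(long long limit, bool print_at_midpoint) {
    auto start = std::chrono::steady_clock::now();
    std::thread a(count_loop, limit, print_at_midpoint);
    std::thread b(count_loop, limit, print_at_midpoint);
    a.join();
    b.join();
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling time_two_threads(1000000000LL, false) and then time_two_threads(1000000000LL, true) gives the two timings to compare. On some toolchains this needs to be built with -pthread.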

Has anyone seen anything like this?


Try the same thing in a single-threaded program and you will probably get the same or similar results.
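A rough sketch of that single-threaded check, assuming the same loop shape as in the question. The name time_single_loop is hypothetical. If the version with the printf is also much slower here, the slowdown is probably not caused by the threads themselves.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical helper: runs the loop on the calling thread only and
// returns the elapsed wall-clock seconds.
double time_single_loop(long long limit, bool print_at_midpoint) {
    auto start = std::chrono::steady_clock::now();
    long long iterations = 0;
    for (long long i = 1; i <= limit; ++i) {
        if (print_at_midpoint && i == limit / 2) {
            std::printf("reached %lld\n", i);
        }
        ++iterations;
    }
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    // Print the count so the loop result is actually used.
    std::printf("iterations: %lld\n", iterations);
    return elapsed.count();
}
```

Compare time_single_loop(1000000000LL, false) against time_single_loop(1000000000LL, true), built with the same compiler flags as the threaded version.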