Hi,
I was assigned to study and understand some code and I was told the major problem with the program was poor performance.
Here's the bottleneck:
A global data set is maintained. This data set is updated regularly. Then there are threads that have to do some processing using a snapshot of the dataset after each update. The processing threads do not change the dataset but they require that no one else changes the dataset while they are evaluating as well. This means that the dataset cannot be updated (Which in turn means that the processing threads can't process using the updated dataset) while the threads are processing - which is the bottleneck as the update rate for the dataset is high.
I can think of couple of ways of improving the performance, but I feel pretty certain that this is a common problem. I would like to know the name of this problem if this is indeed a common problem. (So I can Google). Or suggestions on how to improve performance.

Currently I thought of two methods:
1. Maintain 2 or more data sets that are identical. Update one data set(A) when the other(B) is used by processing threads. then update B using A with a dedicated thread?

2. Incremental updating of data set? Keep a separate list of updates and the dataset. When updating, add to the list. When processing, use both the list's data and dataset. periodically update dataset with the list. (This is again like the first method)

Thanks

Both sound reasonable approaches.

Hey,

I think what your looking for is called a Mutex, or Mutual Exclusion.
Simple example: If something is TRUE, its obviously NOT FALSE. So FALSE is mutually excluded if something is TRUE. In the same way there are ways to Mutex two (or more) processes.

Hope this is any good to you! -Harry

Nope. Not looking for mutexes :D. What I want is to be able to write to a data set while many threads are reading from it. the reading threads require that the data set remain unchanged during their processing. This is obviously made simple by using mutexes- unfortunately I can't use it because then the performance would suck.

i.e. Update rate is 100 updates/s, 25 threads, 5 processings per sec per thread.
What happens if I lock the data set is that the effective processing rate of the system drops to 5.

Anyone know a name for this mess?

Well the first one is basically called double buffering.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.