I was assigned to study and understand some code, and I was told that the major problem with the program is poor performance.
Here's the bottleneck:
A global data set is maintained and updated regularly. Processing threads then do some work using a snapshot of the data set after each update. The processing threads do not change the data set, but they require that no one else changes it while they are evaluating. This means the data set cannot be updated while the threads are processing, which in turn means the processing threads cannot work with the latest data. Because the update rate for the data set is high, this is the bottleneck.
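A minimal sketch of the locking described above, assuming it behaves like a readers-writer lock: many processing threads may read at once, but the updater has to wait until all of them are done. The class and method names are hypothetical and not taken from the real program.

```python
import threading


class ReadWriteLock:
    """Hypothetical readers-writer lock: many readers or one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0       # number of processing threads currently reading
        self._writing = False   # True while the updater holds the lock

    def acquire_read(self):
        with self._cond:
            # Readers wait only while an update is in progress.
            while self._writing:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                # The last reader out wakes a waiting updater.
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            # The updater waits for every reader to finish: this is the stall.
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

With long-running readers and a high update rate, the updater spends most of its time waiting in `acquire_write`.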
I can think of a couple of ways to improve the performance, but I feel pretty certain that this is a common problem. If it is, I would like to know its name (so I can Google it). Suggestions on how to improve performance are also welcome.
So far I have thought of two methods:
1. Maintain two or more identical data sets. Update one data set (A) while the other (B) is being used by the processing threads, then update B from A, perhaps with a dedicated thread?
2. Update the data set incrementally? Keep a separate list of updates alongside the data set. When updating, add to the list. When processing, use both the list and the data set. Periodically apply the list to the data set. (This is similar to the first method.)
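The two methods above could be combined roughly as follows. This is only a sketch under assumptions: the data set is modelled as a Python dict, a single updater thread calls `publish`, and `SnapshotStore` and its method names are made up for illustration. Unlike method 2 as written, readers here see only the published snapshot and do not consult the pending list.

```python
import threading


class SnapshotStore:
    """Hypothetical holder for a published snapshot plus a pending-update list."""

    def __init__(self, initial):
        self._lock = threading.Lock()   # guards only the reference and the list
        self._snapshot = dict(initial)  # treated as read-only once published
        self._pending = []              # updates not yet folded into a snapshot

    def snapshot(self):
        # Readers grab the current reference, then process without holding a lock.
        with self._lock:
            return self._snapshot

    def add_update(self, key, value):
        # Method 2: recording an update is cheap and never waits for readers.
        with self._lock:
            self._pending.append((key, value))

    def publish(self):
        # Method 1: build the next data set off to the side, then swap it in.
        # Assumes a single updater thread calls publish().
        with self._lock:
            pending, self._pending = self._pending, []
            base = self._snapshot
        new = dict(base)                # copy cost grows with the data set size
        for key, value in pending:
            new[key] = value
        with self._lock:
            self._snapshot = new        # readers of the old snapshot are unaffected


store = SnapshotStore({"a": 1})
before = store.snapshot()               # a processing thread starts work on this
store.add_update("a", 2)
store.add_update("b", 3)
store.publish()
after = store.snapshot()                # later processing sees the new data
```

The trade-off in this sketch is memory and copy time: every `publish` copies the whole data set, so it fits best when the data set is small relative to the time the processing threads would otherwise block the updater.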