Hey all,

I was hoping I could get some advice on designing the proper threading for our project.

A little background info - our company builds simulators of complex electrical/mechanical systems. These simulators can have fuel systems, electrical systems, pneumatic pressure systems, etc. You can just think of a car and that's close enough. We have built many of these simulators and have no problems with the systems themselves. However, we get requests from customers for drastically different products. Think of simulating one model of a car, including the ability to fail every individual component and system, then having to do a completely different model. For far too many years, we've rewritten our code base each time and had very poor code reusability.

Currently, we are trying to build a centralized code base that can serve as our "core". We want to be able to switch from one model to another with minimal fuss and greatly improve our outdated design. Our approach to this madness was to represent all objects as individual components derived from a common base class. These components use an event-driven model to pass data between each other dynamically, allowing a great number of systems to use this without actually having to know what the component is. We also have other base classes for appropriate systems, such as "IElectricalComponent". This mostly follows the Template Method design pattern.

With this, we should be able to build a new model by simply defining the specific parts of this model to the abstract factory, link up all the data messages, and let the systems run themselves.

The problem we're trying to overcome, however, is the sheer number of possible inputs. We get some of our data from a PLC using real switches and buttons to simulate the real thing, some of our data comes in from individual touch screens, some of it comes in from serial data streams from the real components, and some of it comes in from a control loader. We cannot guarantee until we get the contract how each simulator will behave. Because of this, we have very dynamic connections (and try our hardest to use UDP if possible). Our current model, which we're replacing, used a single thread for each connection and duplicated a lot of code to define what each connection did. With our new model, we are trying to make this more scripted and dynamic.

The data mapping is not the problem, but the threading is. Since I cannot guarantee what is in each connection, I am uncertain where to put my semaphores/mutexes. Should each system (such as electrical) have its own mutex, with each connection possibly locking several depending on its data? Should each connection have a single semaphore, which would require a giant global lock on all systems while data is being set? Or is a mutex generally fast enough that each component should have its own data-setting mutex, with hundreds of them possibly being taken on a large chunk of data?

Any advice on how to thread this would be appreciated.

This is what is called a signals and systems implementation (at least, that's what I call it).

I had a similar problem a while back, when writing a piece of control software for a humanoid robot. This is actually not trivial to do correctly, as I'm sure you have figured out by now. I managed to get a decent implementation working, but it wasn't perfect (it would stall on rare occasions). My advice is to use a library for that. You didn't specify the language so I can't point you to one in particular (if you are using C++, then look at Boost.Asio or the new C++11 standard library for concurrency, especially futures and async).

I'm not an expert on this, but I can tell you how I did it. I had a class template for signals (of any type) which could be set as either synchronous or asynchronous: a synchronous signal provides its value only when a new value is received, while an asynchronous signal provides its value at any time on request. Then I had classes derived from a System class. Each system object ran on its own thread and had a registry of input and output signal definitions (with properties like sync/async, required/optional inputs, name, type, etc.). Upon initialization of the structure of the software, I would connect the different systems by creating signals that acted as the output of one system and the input to one or many others.

In each system, the execution loop would first wait for all the synchronous signals to be received, then read out the values of the asynchronous signals, execute the body of the loop, set the output signals that were just computed, and repeat. This creates a chain of execution where many systems sleep for a while, wake up when new input signals arrive, trigger other systems into execution, and so on. This is the ideal situation to aim for; my implementation of it wasn't the best, but it worked.
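To make the sync/async distinction concrete, here is a minimal C++ sketch of such a signal (the names and details are mine, not the original code): a synchronous read blocks on a condition variable until a fresh value arrives, while an asynchronous read just returns the latest value immediately.

```cpp
#include <condition_variable>
#include <mutex>

// Illustrative sketch, not the poster's actual implementation: a signal
// that supports both synchronous reads (block until a new value arrives)
// and asynchronous reads (return the latest value immediately).
template <typename T>
class Signal {
public:
    // Writer side: publish a new value and wake any waiting readers.
    void write(const T& v) {
        {
            std::lock_guard<std::mutex> lock(m_);
            value_ = v;
            fresh_ = true;
        }
        cv_.notify_all();
    }

    // Synchronous read: wait until a value that has not yet been
    // consumed is available, then consume it.
    T read_sync() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return fresh_; });
        fresh_ = false;  // consume the "new value" flag
        return value_;
    }

    // Asynchronous read: return whatever the latest value is, no waiting.
    T read_async() {
        std::lock_guard<std::mutex> lock(m_);
        return value_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    T value_{};
    bool fresh_ = false;
};
```

A system's execution loop would then call read_sync() on each of its synchronous inputs, read_async() on the asynchronous ones, compute, and write() its outputs.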

As far as mutexes go: think about it, a mutex is needed whenever information has to be communicated between threads - in other words, every time a signal is updated. So it makes all the sense in the world to have one mutex per signal. But implementing it can be tricky. I ended up using a spin-lock scheme with two value slots per signal. Basically, each signal stored the last value it got, and had an empty slot for the next value to come. Because there is only a single "writer" for a signal, writing to the empty slot was non-blocking (no mutex protecting it). Reading the last value was protected by a mutex between the readers and the swapping of the values, so multiple readers do not block each other, but they do block the signal from being replaced by the new one. In other words, after a new signal value was received, I would take the first opportunity to swap the "last value" with the one just received. In retrospect, there probably should have been a mutex on the writing of the newly received value too. But overall this was great because it made the writing non-blocking and the reading essentially non-blocking too! You do have to watch out that all other operations within the signal object are atomic (e.g. you'll need some internal flags; make sure you operate atomically on them).

When you look at simulation software like Simulink, you'll find that it appears to be a signals and systems implementation, but it is not. First, all signals are synchronous in Simulink. Second, Simulink doesn't dynamically run each block as a "system" on its own thread waiting for incoming signals; rather, it analyses the entire model and serializes it (a fairly simple thing to do, and the stage at which you get all those errors popping up). Finally, it runs the serialized model in a single thread. So it is important to understand that you don't really need a full-blown signals and systems framework unless you need it to be configurable at run-time (while the simulation/controller is running), or you want your computations to be more distributed.

As for different types of signals (over a network, internal to a program, between programs via pipes, etc.), they are very easy to implement as classes derived from the basic Signal class, reimplementing its read/write with the same (more or less non-blocking) behavior but a different transport. If you need a more complex type of signal, you can also implement it as a system class that has no inputs (or just some parameters) and writes to a normal signal that is then connected to other systems. The possibilities are limitless with a good signals and systems implementation.
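As a small illustration (hypothetical names, not a real transport), a transport-specific signal only has to override the read/write of a common interface, so the systems never care where a value comes from:

```cpp
#include <string>

// Hypothetical common interface: systems talk to this, never to the
// underlying transport.
class StringSignal {
public:
    virtual ~StringSignal() = default;
    virtual void write(const std::string& v) = 0;
    virtual std::string read() = 0;
};

// In-process signal: just stores the latest value.
class LocalSignal : public StringSignal {
public:
    void write(const std::string& v) override { value_ = v; }
    std::string read() override { return value_; }
private:
    std::string value_;
};

// A UDP- or pipe-backed variant would override write() to send the
// bytes over its transport and read() to return the last payload
// received, keeping the same (more or less non-blocking) contract.
```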

As a final note, you should not make systems that are too fine-grained. Consider the simple kind of blocks you find in Simulink, some of which can be as simple as a matrix-vector product; it would be really wasteful to implement each of those as its own system. You want each system to be complex enough that the overhead of communicating information between systems is insignificant compared to the overall execution time. For instance, in my application, most of my controllers were composed of about 3-6 systems in total (e.g. one to deal with the hardware IO, one for dynamic compensation, one for high-level control, one for sensing collisions, etc.). For the model computations (inverse dynamics computations on the kinetostatic chain), I also used a highly modular setup, but there the computation was serialized ahead of time, for efficiency reasons. So, in your application, you should probably think of a way to allow for both, that is, have the option to take a sub-chain of elements of your simulation and serialize their execution (short-cutting the signals). This is not hard to do at all; it is very similar to file I/O serialization.
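Sketching that serialization idea (all names here are illustrative): a sub-chain of systems can be reduced to a list of step functions run back-to-back on one thread, with the intermediate signals short-cut into plain variables that need no locking at all.

```cpp
#include <functional>
#include <vector>

// Illustrative sketch: a pre-ordered sub-chain of system bodies,
// executed serially on one thread instead of each on its own thread.
struct SerializedChain {
    std::vector<std::function<void()>> steps;  // topologically ordered

    // One tick: run every step back-to-back. Intermediate values can
    // flow through ordinary variables instead of mutex-guarded signals.
    void tick() {
        for (auto& step : steps) step();
    }
};
```

For example, two systems that would otherwise exchange a signal can share a captured variable and be ticked in order; the chain as a whole still reads its inputs from and writes its outputs to real signals.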


Thank you for the long and thoughtful reply. I've been diverted away from design work a bit, so it may take me a while to properly read and consider this once my tasks are redelegated, but I wanted to give you an initial show of gratitude before I can formulate a full response.