Microsoft researcher explores bug reduction using coder biometrics

happygeek

Bugs are, and always have been, a fact of life for the software developer. However, if Microsoft researcher Andrew Begel has his way, they could be a thing of the past. Last month a paper Begel co-authored, entitled 'Using Psycho-Physiological Measures to Assess Task Difficulty in Software Development', was published. This week, Begel spoke on the subject at the annual Microsoft Research Faculty Summit.

Basically what Begel and his research colleagues are saying is that the existing work on dealing with programming errors tends to focus on the "post hoc identification of correlations between bug fixes and code", and this isn't working. Instead, his team suggests, a new approach is needed to address the very real and very costly problem of code bugs. The new approach in question is to try to "detect when software developers are experiencing difficulty while they work on their programming tasks" and then, of course, to stop them before they can go on to introduce bugs into their code.

This makes sense, at least as far as addressing the reasons why errors are introduced in the first place. Think about it: as a developer you are often asked to work very long hours against an immovable deadline. Your work involves staring at a screen for hours on end, producing something that is part of a finished product which can contain millions of lines of code. Combine the physical and mental stress and it's hardly surprising that errors are made, and that those errors then drown in that sea of code.

Instead of attempting to find yet another method of detecting bugs in the actual code, the new research proposes that the effort be focused on the programmers themselves. The proposed method uses data taken from various psycho-physiological sensors, including an eye-tracker, an electrodermal activity sensor and an electroencephalography (EEG) sensor. The researchers studied 15 professional programmers wearing these biometric devices and found that it was possible to predict nominal task difficulty for a new developer with 64.99% accuracy and for a new task with 84.38% accuracy.
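The paper doesn't publish its pipeline here, but the general shape of such a study is easy to sketch. The following is a minimal, hypothetical illustration in Python (scikit-learn, synthetic data, invented feature names) of how sensor-derived features might be fed into a classifier, with a leave-one-developer-out split standing in for the "new developer" case and a leave-one-task-out split for the "new task" case. It is a sketch of the general technique, not the researchers' actual code.

```python
# Hypothetical sketch: classifying task difficulty from biometric features.
# Features and labels are synthetic; this is NOT the paper's actual pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n_samples = 150                               # e.g. 15 developers x 10 task windows
developers = np.repeat(np.arange(15), 10)     # which developer produced each sample
tasks = np.tile(np.arange(10), 15)            # which task each sample belongs to

# Invented stand-ins for eye-tracker, electrodermal and EEG features.
X = rng.normal(size=(n_samples, 4))
y = rng.integers(0, 2, size=n_samples)        # 0 = easy, 1 = difficult (nominal difficulty)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
logo = LeaveOneGroupOut()

# "New developer" setting: train on 14 developers, test on the held-out one.
dev_scores = cross_val_score(clf, X, y, groups=developers, cv=logo)
# "New task" setting: train on all but one task, test on the held-out task.
task_scores = cross_val_score(clf, X, y, groups=tasks, cv=logo)

print(f"new-developer accuracy: {dev_scores.mean():.2%}")
print(f"new-task accuracy:      {task_scores.mean():.2%}")
```

On random synthetic data the scores will hover around chance; the point is only the shape of the evaluation that produces the two accuracy figures reported above.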

This sounds positive, but it does raise some interesting ethical questions, not to mention purely practical ones. Let's start with the practical. So far the research has not really explained how those task difficulty levels translate into a likelihood of causing errors across the many real-world scenarios developers face. Different coders work differently; some thrive on stress while others are weakened by it. Nor has the research gone as far as proposing any means of intervention when stress levels enter the code danger zone, which means we don't know how your average programmer will react when told to 'stop coding now' in mid-flow, or how that will actually result in better, or at least less buggy, code. Begel has spoken about preventative interventions being possible, such as reducing screen contrast to make the code harder to read, thereby forcing developers who are not paying enough attention to focus harder on the job at hand. The example he gives is coming back from lunch, when you are maybe not as work-focused as you were before the break.
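Contrast reduction is the only intervention Begel has described in any detail, so anything more concrete is speculation, but a threshold-based intervention loop might look something like the toy sketch below. The functions read_attention_score() and set_screen_contrast() are invented placeholders, not part of any tooling described by the researchers.

```python
# Hypothetical sketch of a threshold-based intervention loop.
# read_attention_score() and set_screen_contrast() are invented placeholders.
import random
import time

ATTENTION_THRESHOLD = 0.4    # below this we assume the developer is drifting
NORMAL_CONTRAST = 1.0
REDUCED_CONTRAST = 0.6       # harder to read, nudging the developer to refocus

def read_attention_score() -> float:
    """Placeholder: a real system would fuse eye-tracker, EDA and EEG signals.
    Here we simply simulate a score between 0 (distracted) and 1 (focused)."""
    return random.random()

def set_screen_contrast(level: float) -> None:
    """Placeholder: a real system would call into the OS display settings."""
    print(f"screen contrast set to {level:.1f}")

def monitor(polls: int = 3, poll_seconds: float = 1.0) -> None:
    """Poll the attention score and apply the contrast 'nudge' when it drops."""
    for _ in range(polls):
        if read_attention_score() < ATTENTION_THRESHOLD:
            set_screen_contrast(REDUCED_CONTRAST)   # gentle nudge, not a hard stop
        else:
            set_screen_contrast(NORMAL_CONTRAST)
        time.sleep(poll_seconds)

if __name__ == "__main__":
    monitor()
```

Even in this simplified form, the open questions from above remain: where the threshold sits, and whether the nudge helps or simply adds another source of stress.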

The ethical questions are even harder to address. Let's start by considering what the average person, or even the average programmer, might consider to be 'reasonable means' when it comes to reducing errors in output. "I think constantly monitoring the psychological status and the physical conditions of programmers seems tremendously intrusive and probably strays way off from what I consider to be reasonable means," says Amichai Shulman, the CTO at Imperva, who is interested mainly in the security implications of vulnerable code. He goes on to say: "One of the main reasons for software flaws today is that programmers are constantly under pressure of delivering more functionality in less time. Quite frankly, if we peel off all pretty words and new age HR terminology, programmers are appreciated by the number of lines of code (LOC) they produce per second."

One can argue that on the way to achieving higher rates of LOC/sec, programmers and employers alike are sacrificing attributes of the code such as efficiency, readability and correctness, all of which are assumed to be things that can be caught and corrected further along the production process (if they are even deemed critical enough to bother fixing, that is). Introducing further delays into the coding process itself would seem, at face value, to be counter-productive. A stop-start system of stress-based interventions will surely just introduce more stress, more interventions and, ultimately, more errors into the code.

So, DaniWeb members, where do you stand on this? Would you be happy to be monitored in such a way while you code? Do you think that stress-related interventions would make you a better coder producing better code, or would they just impede your ability to work properly?

Hiroshe

I would have to agree with the practical criticism. Most of my bugs happen when I'm not having difficulty, to be honest. Usually it will be something simple like not getting a boundary correct, or missing something as small as endianness, or, to make a point, making a bad assumption about bounding when I'm implementing the simplex algorithm. None of these cause me particular difficulty. Most of the bugs are caused by "jumping the gun".

Conversely, when I'm programming something difficult, I'll slow down and think harder about everything I do. Even then, the bug is usually something easy to overlook (though in this case it's often also a logic error, even if I am careful).

On its own, measuring difficulty might be an interesting metric for human-factors research. That is an interesting idea. It would perhaps illuminate where programmers are having the most difficulty, and how the development process can be modified to minimize it. That's something useful. But the research did not demonstrate a correlation between using the system and the number of bugs on a project. Furthermore, I don't see how this will tell programmers where the bugs are.

Another thing to note is when they say that the conventional post-hoc approach isn't working. I personally find that it works well, to be honest. Here's a thought experiment: imagine removing QA and unit tests. More than half of the time I spend developing goes into fixing things that those have revealed. I wouldn't have much confidence in my code with them gone. Now imagine that I was hooked up to a machine that told me how much difficulty I was having. In all honesty, I wouldn't say the machine is useful, and I would feel "naked" without the extra assurance of the unit tests.
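To make that concrete, here's a toy sketch (my own invented example, nothing to do with the research): the kind of boundary slip I was talking about, and the sort of unit test that catches it after the fact rather than while I'm typing it.

```python
# Toy example: an off-by-one boundary slip, caught post hoc by a unit test.
def sum_first_n(values, n):
    """Sum the first n elements of values."""
    total = 0
    for i in range(n + 1):      # bug: off-by-one, sums n + 1 elements instead of n
        total += values[i]
    return total

def test_sum_first_n():
    # 1 + 2 == 3, but the buggy function returns 6; the assertion fails and exposes it.
    assert sum_first_n([1, 2, 3, 4], 2) == 3

if __name__ == "__main__":
    test_sum_first_n()
```

Writing that loop caused me no "difficulty" at all, which is exactly why a biometric sensor wouldn't have flagged it, while the test fails immediately.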

Morally, I don't really care that much, as long as the information is private to the programmer and he has the option of not using it. I would probably use it if it were used to try and improve the workflow and the development process, though I have doubts about its usefulness for preventing bugs themselves.
