Hi everybody,
I have a simple mathematical member function of a class:

#include <cmath>

struct shell {
  double mc, ms, r0, v0, a0, w;
  shell(double mc, double ms, double r0, double v0, double w);
  double absv(double r) const;
};

shell::shell(double mc, double ms, double r0, double v0, double w)
{
  this->mc = mc;
  this->ms = ms;
  this->r0 = r0;
  this->v0 = v0;
  a0 = v0*v0 - (2*mc+ms)/r0;
  this->w = w;
}

double shell::absv(double r) const
{
  return sqrt((2*mc+ms)/r + a0 + 4*w*log(r/r0));
}

The function absv is called many times during my run,
so I decided to try to speed it up if possible.
But my results are very strange and unexpected!


I introduced b0 = 2*mc+ms as a member variable (set in the constructor),
and changed absv to return sqrt(b0/r + a0 + 4*w*log(r/r0));

But it is now slower! Why?


I add the following line to absv:

if (w == 0) return sqrt((2*mc+ms)/r + a0);

And it's much faster for w=0, but why?
I expected that the * operator wouldn't evaluate log() when w=0,
because 0*anything = 0.

>>I expected that the * operator wouldn't evaluate log() when w=0,

C++ cannot make that kind of simplification for you. The value of w is only known at run time, so the compiler must emit code that evaluates both operands of the multiplication and then the product, whether one operand happens to be zero or not. And even a compile-time constant zero could not be folded away here: under IEEE floating-point rules, log() can return infinity or NaN, and 0 * infinity is NaN, not 0, so the compiler is not allowed to assume that 0 * anything = 0. Since evaluating log() is very expensive, it is not surprising, in fact expected, that checking if w == 0 speeds things up quite a bit.

>>But it is now slower! Why?

That's sort of unexpected. I would not expect this simplification to improve performance much, if at all, but it certainly should not make things slower, so something else must be going on. What does the test program that tells you it is slower look like? I suspect you are inadvertently copying your "shell" object (which grew in size when you added b0, and is thus a bit slower to copy).

When it comes to micro-optimizations like these, after compiling with the highest optimization level, work in this order:

1. Get rid of any useless transcendental calls (log(), exp(), cos(), sqrt(), pow(), etc.), because these are expensive to compute. For example, you can often avoid a sqrt() when computing a Euclidean norm or distance by using the squared norm instead in the algorithm that needs the distance value.

2. Reduce the number of floating-point operations as much as possible by rearranging the expression. Eliminate divisions first, then multiplications, then additions/subtractions (that is their order of time consumption).

3. Improve memory locality as much as possible: make memory accesses as sequential as possible, and declare local variables where they are used, with minimal scope.

After that, you enter the world of nano-optimization, which is rarely needed at that point.

>>But it is now slower! Why?

At this granularity, it's probably system noise. Have you nailed down performance requirements, profiled your code, and determined that what you have is presently too slow?