I am trying to optimize code for a Monte Carlo simulation. Even minute performance differences pile up after 100 million iterations, so I need to squeeze every nanosecond out of the math operations!
One area where I thought I could save a lot stems from the fact that I only require precision of 4 significant digits. It therefore seems natural to use float rather than double.
However, some testing suggests that double still performs better! This is unexpected.
Why is it that, despite float being 32 bits and double 64 bits, the math.h functions exp(double) and pow(double, double) run faster than exp(float) and pow(float, float) (or even expf and powf)? Here is some code...

```
#include <math.h>
#include <iostream>
#include "Timer.h"  // my own timing helper providing tic() and toc() (not shown)

using namespace std;

int main()
{
    double a = 23.14;
    float c = 23.14f;
    Timer t;

    // 10 million calls to expf with a float argument
    t.tic();
    for (int i = 0; i < 10000000; i++)
        expf(c);
    cout << "expf(float) returns " << expf(c) << " and took " << t.toc() << " seconds." << endl;

    // 10 million calls to exp with a float argument
    t.tic();
    for (int i = 0; i < 10000000; i++)
        exp(c);
    cout << "exp(float) returns " << exp(c) << " and took " << t.toc() << " seconds." << endl;

    // 10 million calls to exp with a double argument
    t.tic();
    for (int i = 0; i < 10000000; i++)
        exp(a);
    cout << "exp(double) returns " << exp(a) << " and took " << t.toc() << " seconds." << endl;
}
```

## All 5 Replies

>>However, some testing suggests that double still performs better! This is unexpected.

Yup. floats are always converted to doubles when used as function parameters. Other factors may influence it too, such as the math coprocessor on your computer.

>>Yup. floats are always converted to doubles when used as function parameters.

No, that's only true if
(1) you call the function without a prototype in scope, or
(2) it's a variable argument to a variadic function like 'printf'.
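To illustrate the two cases, here is a minimal sketch. The function names `takes_float` and `first_variadic_arg` are made up for this example and do not come from any library. It assumes a C++11 compiler and shows that a float stays a float when a prototype is in scope, while a float passed through `...` arrives as a double.

```cpp
#include <cmath>
#include <cstdarg>
#include <type_traits>

// Hypothetical helper: a prototype is in scope, so the float argument
// is passed as a float and no promotion to double takes place.
float takes_float(float x) { return x; }

// Hypothetical helper: a float passed through "..." undergoes the default
// argument promotions and arrives as a double, so it must be read as one.
double first_variadic_arg(int count, ...) {
    va_list args;
    va_start(args, count);
    double value = va_arg(args, double);  // reading it as float would be undefined
    va_end(args);
    return value;
}

// With <cmath>, the overload chosen depends on the argument type:
// a float argument selects the float version, a double the double version.
static_assert(std::is_same<decltype(std::exp(1.0f)), float>::value,
              "std::exp(float) returns float");
static_assert(std::is_same<decltype(std::exp(1.0)), double>::value,
              "std::exp(double) returns double");
```

Calling `takes_float(c)` with a `float c` hands the value over unchanged, whereas `first_variadic_arg(1, c)` receives `c` already widened to double, which is the same thing that happens to a float passed to `printf`.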


Dave: Yes, I see you are correct. I wrote a short test program and had the compiler produce assembler instructions, which showed the same behavior as what you posted.

Floats are meant for space optimization; they do not necessarily have to
be faster than doubles, especially on a 64-bit CPU.
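As a rough sketch of the space argument, the helper below (`payload_bytes` is a name invented for this example) reports how many bytes the elements of a vector occupy. On most current platforms float is 4 bytes and double is 8, but that is an assumption about the platform, so the code relies only on `sizeof`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: bytes taken up by the elements stored in the vector
// (ignores the vector's own bookkeeping and any spare capacity).
template <typename T>
std::size_t payload_bytes(const std::vector<T>& values) {
    return values.size() * sizeof(T);
}
```

For the same number of samples, `payload_bytes` on a `std::vector<float>` will typically come out at half of what a `std::vector<double>` needs, which can help cache and memory bandwidth even when the individual math calls are no faster.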