Double variable type, unexpected answer

Question

Freaky_Chris 299 Master Poster

16 Years Ago

Hi, I'm doing some work with doubles at the moment and I have what appears to be a simple expression. When I work it out in my head I get 0. However, running the code gives me a very obscure value: -5.77338e-017

Any help as to what I am doing wrong? P.S. The answer I am after is 0.

double a = 0.96;
double b = 0.9;
double com = 30.00;
std::cout << a - ((2.00*(com/1000.00)) + b);

Thanks,
Chris

c++

Ancient Dragon 5,243 Achieved Level 70

16 Years Ago

No fix that I know. Try multiplying everything by 100 and use int math.

Freaky_Chris commented: thanks +1
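
A minimal sketch of the integer-math suggestion above, assuming every input has at most three decimal places so that scaling by 1000 (rather than 100) keeps the division by 1000 exact. The function name residual_in_thousandths is hypothetical and is not from the thread.

```cpp
#include <cstdint>

// Hypothetical helper: a and b are held as integer thousandths, so 0.96 is
// passed as 960 and 0.9 as 900. com is passed unscaled, because the original
// expression divides it by 1000, which is the same as reading it in thousandths.
std::int64_t residual_in_thousandths(std::int64_t a_thousandths,
                                     std::int64_t b_thousandths,
                                     std::int64_t com) {
    // a - (2 * (com / 1000) + b), evaluated exactly in units of 0.001
    return a_thousandths - (2 * com + b_thousandths);
}
```

Calling residual_in_thousandths(960, 900, 30) returns 0, because integer addition and subtraction involve no rounding. Divide by 1000.0 only at the point of display.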

ArkM 1,090 Postaholic

16 Years Ago

It's not an "old compiler" issue; it's normal behaviour for floating-point calculations. You are not entitled to an "exact" result from these calculations. Only integral numeric types give "exact" results. The "obscure" value -5.77338e-017 is on the order of the least significant bits of a double's mantissa. Now remember that 0.9 and 0.96 (repeating binary fractions) have no exact double representation at all...

Salem commented: Yes, e-017 is pretty close to 0 +23

Freaky_Chris commented: Thanks for pointing this out +1
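
To see the point about inexact representation, here is a small sketch that formats a double with more digits than the default six. The helper name show_digits and the choice of 20 digits are assumptions made for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a double with `digits` significant digits,
// which is enough to expose the value that is actually stored.
std::string show_digits(double value, int digits = 20) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}
```

With this, show_digits(0.5) gives back "0.5", since 0.5 is a power of two, while show_digits(0.9) and show_digits(0.96) give strings with many extra digits: the stored values sit slightly above 0.9 and slightly below 0.96.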

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 1 · 2008-12-04T18:25:06+00:00

It's a difference in compilers and the way they handle floating-point arithmetic. VC++ 2008 Express produces 0. Dev-C++ produces the value you quoted. Dev-C++ is an old compiler now and may have a few bugs. Maybe someone with Code::Blocks can try it to see if the bug has been fixed.

Freaky_Chris 299 Master Poster · Answer 2 · 2008-12-04T18:27:20+00:00

Ah, so using Dev, is there any way that this can be solved? I would hate to have to swap to VC++ just to solve this problem. Using Dev allows me to do some stuff whilst I'm in college :D whereas I wouldn't be able to do anything if I had to use VC++ :/

I'm aware Dev is old... perhaps I should have a look at Code::Blocks.

I just like Dev, it suits me :P

Chris

cikara21 37 Posting Whiz · Answer 3 · 2008-12-04T19:47:24+00:00

double x=a-((2*(com/1000))+b);

Salem 5,248 Posting Sage · Answer 4 · 2008-12-04T22:41:42+00:00

http://docs.sun.com/app/docs/doc/800-7895/6hos0aou4?a=view
Floats are approximations of an infinite number space mapped onto a finite machine. Mathematical results and computational results seldom agree. Any floating point result comes with an error margin.

> As I would hate to have to swap to VC++ just to solve this problem
If you count sweeping the problem under the carpet as a solution.

You might be printing zero, but does this print "Zero"?

double answer = a - ((2.00*(com/1000.00)) + b);
if ( answer == 0 ) {
  std::cout << "Zero" << std::endl;
} else {
  std::cout << "NOT Zero" << std::endl;
}

This is tricky stuff, as evidenced by this long drawn-out thread.
http://www.daniweb.com/forums/thread45388.html

StuXYZ 731 Practically a Master Poster · Answer 5 · 2008-12-05T03:04:16+00:00

Floating Point Result

I would like to comment that whether you get 0 or -5.77e-17 is mainly a CPU issue. The x87 floating-point unit on Intel and AMD processors uses 80-bit registers. This both hides and creates a lot of numerical rounding error.

This is a nightmare in numerical code: as the code gets interspersed, or threads or recursion are used, you don't know when the compiler will need to move that 80-bit temporary into 64-bit memory storage. However, at 128 bits (long double) it is certain, and that makes life a little easier.

Additionally, if you use gcc/g++ you can use -ffloat-store to avoid the register problem; that gives you the correct IEEE result of -5.77e-17 for long double and 0 for double. [Note that -ffloat-store can be a seriously CPU-expensive option.]

If you use long double then the results tend to be more IEEE based. If it matters to you then use the correct compiler flag.
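
The register effect described above can be sketched by evaluating the same expression twice: once with every intermediate forced into a 64-bit double, and once entirely in long double. The function names are hypothetical, and using volatile to force the stores is an assumption chosen for illustration, not the mechanism of -ffloat-store itself. The width of long double is platform dependent, so the second result may differ between compilers.

```cpp
#include <cmath>

// Hypothetical helper: each intermediate is written to a volatile double,
// so it is rounded to 64 bits even if the CPU computed it in a wider register.
double residual_double(double a, double b, double com) {
    volatile double step = 2.00 * (com / 1000.00);
    volatile double sum = step + b;
    return a - sum;
}

// Hypothetical helper: widen the inputs first so that every operation
// is carried out in long double (80-bit extended on many x86 compilers).
long double residual_extended(double a, double b, double com) {
    long double la = a, lb = b, lc = com;
    return la - ((2.0L * (lc / 1000.0L)) + lb);
}
```

For a = 0.96, b = 0.9, com = 30.00, both results are zero or within a few units of 1e-16 of it. Which of the two values from the thread each function returns depends on the platform's long double.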

Tolerance

I would also seriously caution you against using fabs(A-value)<1e-10 or a similar device. It has a habit of being very difficult to get right over a large range. There are several alternatives, and I normally use the boost::test::tolerance class:
http://www.boost.org/doc/libs/1_34_1/libs/test/doc/components/test_tools/floating_point_comparison.html
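
As a rough illustration of why a fixed absolute threshold is fragile, here is a minimal relative-tolerance comparison in standard C++. It is a sketch only and is not the Boost API. The name nearly_equal and the default tolerances are assumptions that would need tuning for real code.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper, not from Boost: true when x and y differ by no more
// than rel_tol times the larger magnitude. abs_tol is a floor for values
// near zero, where a purely relative test would never succeed.
bool nearly_equal(double x, double y,
                  double rel_tol = 1e-12, double abs_tol = 1e-15) {
    double diff = std::fabs(x - y);
    if (diff <= abs_tol) {
        return true;  // close enough in absolute terms
    }
    double scale = std::max(std::fabs(x), std::fabs(y));
    return diff <= rel_tol * scale;  // scale the allowance with the operands
}
```

With this, nearly_equal(0.96, 2.00*(30.00/1000.00) + 0.9) is true whether the difference comes out as 0 or as a value near 1e-16, and the same test still behaves sensibly for operands around 1e20, where a fixed 1e-10 threshold would reject values that agree to 14 digits.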