How to increase precision in floating-point arithmetic

Question

chiraag 0 Newbie Poster

15 Years Ago

Hi there all,

Could someone please tell me how I can increase the precision of my floating-point arithmetic?

My requirement is that I add a very small value, of the order of 10^-7, to a relatively big value, say 36.63, and then multiply the result by 10^7. The problem I'm facing when using float numbers alone is that when I sum the two numbers I get 36.63 itself; the small value is not added to 36.63.
This makes a significant difference to my values, because right after the summing I multiply by 10^7.

Could somebody tell me how I could go about this problem?

Thanks in advance

c++


mrnutty 761 Senior Poster

15 Years Ago

Try cout.precision(25);
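
A note on what this call does: `cout.precision(n)` changes how many significant digits the stream prints, not how many digits the variable stores. The sketch below illustrates that with a string stream; `format_value` and `precision_demo` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: format v with the given number of significant digits,
// the same way cout.precision(digits) followed by cout << v would.
std::string format_value(double v, int digits) {
    std::ostringstream out;
    out.precision(digits);
    out << v;
    return out.str();
}

// The stored double is identical in both calls; only the printed text differs.
void precision_demo() {
    const double sum = 36.63 + 1e-7;
    assert(format_value(sum, 6) == "36.63");        // the default of 6 digits hides the small term
    assert(format_value(sum, 12) == "36.6300001");  // more digits reveal it
}
```

Asking for more digits than the type holds (25 for a double, for example) is allowed, but the extra digits carry no additional information.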

Salem 5,265 Posting Sage

15 Years Ago

> My requirement is that I add a very small value of the order 10^-7 with a relatively big value, say 36.63
A float has about 6 significant decimal digits of precision, a double about 15.
10^-7 is more than 6 digits away from 36.63, so when the two are added as floats your really small number is effectively zero.

Using double will buy you some head room, but won't solve the underlying problem.
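
The difference between the two types can be checked directly. This is a minimal sketch, assuming IEEE 754 arithmetic with float expressions evaluated in single precision (the usual case on current desktop compilers). The function names are made up for the example, and subtracting 36.63 again is only a way of isolating what survived the addition.

```cpp
#include <cassert>
#include <cmath>

// Add the small term in single precision. Near 36.63 adjacent floats are about
// 3.8e-6 apart, so 1e-7 is less than half a step and is rounded away.
float add_as_float(float big, float small) {
    float sum = big + small;
    return sum;
}

// The same addition in double precision, where the spacing is about 7e-15.
double add_as_double(double big, double small) {
    return big + small;
}

// Isolate the small term again and scale it by 1e7.
double scaled_float()  { return (add_as_float(36.63f, 1e-7f) - 36.63f) * 1e7; }
double scaled_double() { return (add_as_double(36.63, 1e-7) - 36.63) * 1e7; }

void float_vs_double_demo() {
    assert(scaled_float() == 0.0);                    // small term lost entirely
    assert(std::fabs(scaled_double() - 1.0) < 1e-6);  // small term recovered
}
```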

Two choices:
- use a math library with arbitrary precision, like GMP
- rearrange your expressions so that precision is preserved as long as possible.
E.g.

float small[10] = { };
float big = 36.63f;   // the large value from the question
for ( int i = 0 ; i < 10 ; i++ ) big += small[i];

would become

float small[10] = { };
float big = 36.63f;   // the large value from the question
float smallish = 0;
for ( int i = 0 ; i < 10 ; i++ ) smallish += small[i];
big += smallish;

Each small by itself is too small to affect big, but by combining them all together you end up with something which can affect it.
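
The two loops above can be turned into a small test. This is a sketch under the same assumption as before (IEEE 754 floats evaluated in single precision). The function names and the choice of ten values of 1e-6 are illustrative, not taken from the thread.

```cpp
#include <cassert>
#include <cstddef>

// Add each small value straight into the large one (the first loop).
float add_one_by_one(float big, const float* small, std::size_t n) {
    for (std::size_t i = 0; i < n; i++) big += small[i];
    return big;
}

// Accumulate the small values first, then add the subtotal once (the second loop).
float add_grouped(float big, const float* small, std::size_t n) {
    float smallish = 0.0f;
    for (std::size_t i = 0; i < n; i++) smallish += small[i];
    return big + smallish;
}

void grouping_demo() {
    float small[10];
    for (int i = 0; i < 10; i++) small[i] = 1e-6f;  // each is under half a float step at 36.63
    assert(add_one_by_one(36.63f, small, 10) == 36.63f);  // every single addition is rounded away
    assert(add_grouped(36.63f, small, 10) > 36.63f);      // the subtotal of about 1e-5 registers
}
```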

mrnutty 761 Senior Poster

15 Years Ago

Copied and pasted and the value I got was : 0.0004254 for PD

mrnutty 761 Senior Poster

15 Years Ago

> double lambda=5.32*10^-07

You do know that '^' is the XOR operator and not the "to the power of" operator?

10 ^ -07 != 0.0000001;
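
To make the point concrete: on integers `^` is bitwise XOR, and with a double on the left-hand side (as in `5.32*10 ^ -07`) the expression should not compile at all, because `^` requires integer operands. The sketch below shows the integer result and two standard ways of writing the intended constant. The function names are hypothetical, and the value -13 assumes two's-complement integers.

```cpp
#include <cassert>
#include <cmath>

// '^' on integers is bitwise XOR: 10 ^ -7 is -13, not 0.0000001.
int xor_result() { return 10 ^ -7; }

// Two ways to write 5.32 times ten to the minus seven as a double.
double lambda_literal() { return 5.32e-7; }                    // scientific-notation literal
double lambda_pow()     { return 5.32 * std::pow(10.0, -7); }  // std::pow from <cmath>

void xor_demo() {
    assert(xor_result() == -13);
    assert(std::fabs(lambda_literal() - lambda_pow()) < 1e-20);  // both give the same constant
}
```

The literal form is usually preferable because it needs no function call.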


vali82 1 Light Poster · Answer 1 · 2009-08-18T14:57:24+00:00


use double :)

chiraag 0 Newbie Poster · Answer 2 · 2009-08-18T15:39:41+00:00

I guess you did not understand the problem.

I am adding two numbers, 1.37*10^-5 + 6.3. In C++, floating-point arithmetic automatically truncates the sum, and this is where I have the problem: I don't want it to truncate at all. After I get the sum I multiply it by 10^7, so if it truncates there is a significant difference in the values. Could somebody tell me how to go about this problem?

Thank you.

vali82 1 Light Poster · Answer 3 · 2009-08-18T21:41:45+00:00

> I guess you did not understand the problem.
> I am adding two numbers, 1.37*10^-5 + 6.3. In C++, floating-point arithmetic automatically truncates the sum, and this is where I have the problem: I don't want it to truncate at all. After I get the sum I multiply it by 10^7, so if it truncates there is a significant difference in the values. Could somebody tell me how to go about this problem?
> Thank you.

There's a problem with your compiler, friend!
I just tried the sum and it doesn't truncate anything (VS 2008).

Float has a range of 3.4E +/- 38, so it's way more than you need!

Maybe if you paste the code you use ...
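
Range and precision are separate limits, and both can be read from `std::numeric_limits` in `<limits>`. A short sketch, assuming IEEE 754 types; the wrapper function names are made up for the example.

```cpp
#include <cassert>
#include <limits>

// Range: the largest finite value a float can hold (about 3.4e38).
float float_max() { return std::numeric_limits<float>::max(); }

// Precision: decimal digits guaranteed to survive a round trip through the type.
int float_digits()  { return std::numeric_limits<float>::digits10; }   // 6 for IEEE 754 single
int double_digits() { return std::numeric_limits<double>::digits10; }  // 15 for IEEE 754 double

void limits_demo() {
    assert(float_max() > 3.4e38f);  // huge range...
    assert(float_digits() == 6);    // ...but only about 6 significant digits
    assert(double_digits() == 15);
}
```

A large range therefore says nothing about whether 1.37*10^-5 survives being added to a number near 40; that depends on the digit count.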

chiraag 0 Newbie Poster · Answer 4 · 2009-08-19T08:45:56+00:00

Hi vali82 and Salem,
I will show my code to show what's happening.

Nx=1280;
Ny=960;
double pinhole_ccd_D_rec=6.3;
double pinDrec=pow(pinhole_ccd_D_rec,2);
double k=2*pi/lamda; // [1.18105e+07]
for (m=0;m<Nx;m++)
{
    for (n=0;n<Ny;n++)
    {
        x[n][m]=(1+m-Nx/2); y[n][m]=(1+n-Ny/2);
        r[n][m]=pow((x[n][m]*dx),2)+pow((y[n][m]*dy),2);
        PD=k*sqrt(r[n][m]+pinDrec);
    }
}

My answer for PD is more or less the same every time because of the truncation.
The first value of r[n][m] is 1.37*10^-5 and pinDrec = 39.69. I take the sum, take the sqrt and then multiply by a huge value k, so even small changes in the sum r[n][m]+pinDrec should be reflected in PD. I am using VC++ 6.0.

Could you guide me as to where I am going wrong?

vali82 1 Light Poster · Answer 5 · 2009-08-19T12:19:07+00:00

Hi chiraag,

I've tested with a somewhat simplified version of your example:

const int Nx = 1280;
const int Ny = 960;

double x;
double y;
double r;

double pinhole_ccd_D_rec = 6.3;
double pinDrec = pow(pinhole_ccd_D_rec,2);
double pi = 3.14, lamda = 1.18105e+07;

double k = 2 * pi / lamda; // [1.18105e+07]
double dx = 1, dy = 1;
double PD = 0;

for (int m = 0; m < Nx; m++)
{
    for (int n = 0; n < Ny; n++)
    {
        x = (1 + m - Nx / 2);
        y = (1 + n - Ny / 2);

        r = pow((x * dx), 2) + pow((y * dy), 2);

        PD = k * sqrt(r + pinDrec);
    }
}

The thing is that PD is quite different every time. Maybe my version of the code is missing something ?!

My advice would be to get a new compiler ( download the express VS 2008 for C++, it's free on the microsoft site ) and try your code in that one. VS 6.0 has a LOT of bugs and even if the 2008 version still isn't working for you it's still an upgrade you MUST make!

One more thing, when you're testing the values, I hope you're in debug and looking directly at them! If you're printing them out on the console ... all bets are off :)

chiraag 0 Newbie Poster · Answer 6 · 2009-08-20T08:41:15+00:00


Could you please tell me what values of PD you got? I tried it on both VC++ 6 and 9, and I have the same problem with both versions. My value is 7.44061*10^7, and it remains the same for the whole for loop. Were you getting different values throughout? I am really confused about this.

chiraag 0 Newbie Poster · Answer 7 · 2009-08-21T12:58:54+00:00

Hi,

I went through the code that I posted and found that I made a mistake in typing the lambda value: it should be lambda=5.32*10^-07. You are getting a different value of PD in every case because your k term is no longer of the order 10^7; it should be k=1.18105*10^7. The r values in the code are of the order 10^-05, so they are negligible when we add them to 39.69. When we then multiply by a value of the order 10^07, there is a significant difference in the values if the truncation is not avoided.

This is what I was asking about. How can I avoid the truncation that occurs when I do the step r[n][m]+pinDrec?

Could somebody please help me with this problem?
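
One way to apply the earlier "rearrange your expressions" advice to this formula is sketched below. It rests on an assumption the thread does not state: that what is ultimately needed is the change in PD relative to its on-axis value k*D, rather than PD itself. In that case sqrt(r + D^2) - D can be rewritten as r / (sqrt(r + D^2) + D), so the small r is never cancelled against a large number. The function names are hypothetical, and float is used on purpose to make the effect visible; in double the direct form already keeps about 15 digits.

```cpp
#include <cassert>
#include <cmath>

// Direct form: subtracts two nearly equal floats, so most digits cancel.
float path_diff_naive(float k, float r, float d) {
    return k * (std::sqrt(r + d * d) - d);
}

// Rearranged form: sqrt(r + d^2) - d == r / (sqrt(r + d^2) + d).
// The result is proportional to r, with no cancellation.
float path_diff_rearranged(float k, float r, float d) {
    return k * (r / (std::sqrt(r + d * d) + d));
}

void rearrange_demo() {
    const float k = 2.0f * 3.14159265f / 5.32e-7f;  // about 1.18105e+07
    const float exact = 12.8416f;                   // reference value worked out by hand
    assert(std::fabs(path_diff_rearranged(k, 1.37e-5f, 6.3f) - exact) < 0.01f);
    assert(std::fabs(path_diff_naive(k, 1.37e-5f, 6.3f) - exact) > 1.0f);
}
```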

vali82 1 Light Poster · Answer 8 · 2009-08-21T13:52:41+00:00

Could you post a short, compilable code example where I could see the problem ?

chiraag 0 Newbie Poster · Answer 9 · 2009-08-21T14:35:56+00:00

> Could you post a short, compilable code example where I could see the problem ?

My code is given below. PD value is where I have my problem.

Nx=1280;
Ny=960;
double pinhole_ccd_D_rec=6.3;
double lambda=5.32*10^-07
double pinDrec=pow(pinhole_ccd_D_rec,2);
double k=2*pi/lamda; // [1.18105e+07]
for (m=0;m<Nx;m++)
{
    for (n=0;n<Ny;n++)
    {
        x[n][m]=(1+m-Nx/2); y[n][m]=(1+n-Ny/2);
        r[n][m]=pow((x[n][m]*dx),2)+pow((y[n][m]*dy),2);
        PD=k*sqrt(r[n][m]+pinDrec);
    }
}

thank you.

chiraag 0 Newbie Poster · Answer 10 · 2009-08-22T08:37:03+00:00

> double lambda=5.32*10^-07
> You do know that '^' is called XOR and is not "to the power of"
> operator ?
> 10 ^ -07 != 0.0000001;

Yes, I do know.

Nx=1280;
Ny=960;
double pinhole_ccd_D_rec=6.3;
double lambda=5.32*pow(10,-7);
double pinDrec=pow(pinhole_ccd_D_rec,2);
double k=2*pi/lambda; // [1.18105e+07]
for (m=0;m<Nx;m++)
{
    for (n=0;n<Ny;n++)
    {
        x[n][m]=(1+m-Nx/2); y[n][m]=(1+n-Ny/2);
        r[n][m]=pow((x[n][m]*dx),2)+pow((y[n][m]*dy),2);
        PD=k*sqrt(r[n][m]+pinDrec);
    }
}

mrnutty 761 Senior Poster · Answer 11 · 2009-08-22T08:58:27+00:00

Is it possible to post your complete code, if it's not too big?

vali82 1 Light Poster · Answer 12 · 2009-08-24T11:42:58+00:00


Yes, please post a short test code that compiles.
