Help with program calculating mean and standard deviation

Question

Fuzzies 0 Newbie Poster

12 Years Ago

Hello! I just started learning C a few weeks ago and I'm having a problem with a program I'm writing.

The program has to prompt the user for a certain number of values defined as NUM_VALUES, and then calculate the mean and standard deviation of the numbers.

Here's what I have so far:

    #include <stdio.h>
    #include <math.h>

    #define NUM_VALUES 5


    int main ()
    {
       int userinput[NUM_VALUES], i, j, k;
       double mean, mean_divided, standard_deviation, temp, temp2;

       printf("Enter 5 integers: ");

       for ( i = 1; i <= NUM_VALUES; i++ )

       scanf("%i", &userinput[NUM_VALUES]);
    /* Program will get input from user NUM_VALUE times */

       for ( k = 1; k <= NUM_VALUES; k++)
          mean += userinput[NUM_VALUES];
          mean_divided = mean / NUM_VALUES;

    /* Calculates mean of inputed values*/

       for ( j = 1; j <= NUM_VALUES; j++)
          temp += pow(userinput[NUM_VALUES] - mean_divided, 2);
          temp2 = temp / (NUM_VALUES - 1);
          standard_deviation = sqrt(temp2);

    /* Calculates standard deviation of values*/
       printf("mean = %.3f, standard deviation = %.3f\n", mean_divided, standard_deviation);

       return 0;

    }

Right now the program compiles, but when I run it I'm not getting the results I'm looking for. For example, when I input 1 2 3 4 5, I should be getting mean = 3.000 and standard deviation = 1.581, but right now I'm getting mean = 6.000 and standard deviation = 0.000. I need help finding the error in the logic of my program.

Thanks beforehand!

c

Adak 419 Nearly a Posting Virtuoso

12 Years Ago

Line 14 of your code is an example of the problem, and it applies to all of your for loops:

for ( i = 1; i <= NUM_VALUES; i++ )

That is wrong. The correct form is:

for ( i = 0; i < NUM_VALUES; i++ )

You need to MEMORIZE this idiom! This is C, and C arrays are zero-based!

sbesch 2 Newbie Poster · Answer 1 · 2012-11-09T21:27:57+00:00

There are a few oddities. First, the values array is defined as containing 5 items, but your loop counters run from 1 to 5, which would skip the first element and step one past the end of the array. This is probably where the arithmetic error comes from.

However, there are two other things that strike me as odd. The first is the use of N-1. In all probability, you should be using N. When you have all the data points, for example when calculating the Std. Dev. of a set of exam scores, you divide by N, not N-1. If you are sampling a larger population, on the other hand, there are data points not included in your sample, which increases the uncertainty. In that case, dividing by N-1 increases the Std. Dev. a little to reflect the increased uncertainty.

Using N-1 where N is appropriate is a very common error among people who use Std. Dev. In the case of exam scores, you are actually cheating the better performers and rewarding the poorer ones by increasing the Std. Dev. For classroom exams, never use N-1. In fact, never use N-1 if you have all of the data points; use it only when there are data points that you do not include in the analysis, i.e., a population SAMPLE rather than the entire population. At the very least, add a question to the program: do you want the sample Std. Dev. or the population Std. Dev.? Then use N-1 or N as appropriate.

The second thing is your choice of computation formulation. If you expand the equation for Std. Dev., you find that you don't need to calculate the mean up front, and therefore you don't need to store all the values unless you have some later need for them (printing them as a list, for example). It is usually much more convenient to simply read the values one at a time, keeping a running sum of the inputs and a running sum of their squares, and counting them as you go. Then, when the last value is entered, signalled perhaps by entering something like CtrlE, compute the Std. Dev. from:

    SquaredSumXbyN = (SumOfX * SumOfX) / N
    if Sample Std.Dev then N = N - 1
    Variance = (SumOfXSquared - SquaredSumXbyN) / N
    StdDev = Sqrt(Variance)

(From http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance)

There is one caveat. When the variance is very small compared with the square of the mean, the formula subtracts two large, nearly equal numbers and loses precision. In those cases the approach you used (saving all the data, calculating the mean first, then summing the squared deviations) will give a more accurate result. However, for almost all practical cases you will encounter, the method given here will give accurate results.

Fuzzies 0 Newbie Poster · Answer 2 · 2012-11-09T22:56:37+00:00

Thanks for the help! I fixed the i = 0 looping part, but it turns out the real problem was my scanning of the array, which I did improperly. Program works now!