Statistics::Descriptive gives deviant Standard Deviation?


I know next to nothing about statistics, so I tried Statistics::Basic because it seemed easy to use. Then I tried Statistics::Descriptive because it has functions that Statistics::Basic lacks. What I did not expect was to get a different result when calculating standard deviation with Statistics::Descriptive than with the other module or with a subroutine copied from Yahoo Answers. Am I doing something wrong, or does Statistics::Descriptive have a deviant way of calculating deviations?

use strict;
use warnings;

use Statistics::Basic qw(:all);
use Statistics::Descriptive;

my @d = (5,10,5,100,150);
print 'StdDev according to Basic is ', stddev(@d), "\n"; #Basic

my $stat = Statistics::Descriptive::Full->new();
$stat->add_data(@d); # load the data before asking for any statistics
print 'StdDev according to Descriptive is ', $stat->standard_deviation(), "\n"; #Descriptive

print 'StdDev according to subroutine is ', standard_deviation(@d) . "\n";

sub standard_deviation {
    my (@numbers) = @_;

    #Prevent division by 0 error in case you get junk data
    return undef unless ( scalar(@numbers) );

    # Step 1, find the mean of the numbers
    my $total1 = 0;
    foreach my $num (@numbers) {
        $total1 += $num;
    }
    my $mean1 = $total1 / ( scalar @numbers );

    # Step 2, find the mean of the squares of the differences
    # between each number and the mean
    my $total2 = 0;
    foreach my $num (@numbers) {
        $total2 += ( $mean1 - $num )**2;
    }
    my $mean2 = $total2 / ( scalar @numbers );

    # Step 3, standard deviation is the square root of the
    # above mean
    my $std_dev = sqrt($mean2);
    return $std_dev;
}

Gives the following output:

StdDev according to Basic is 60.12
StdDev according to Descriptive is 67.2123500556259
StdDev according to subroutine is 60.1165534607565

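After further googling I found a post titled "Perl Standard Deviation function is wrong" that explains that there are at least two ways of calculating standard deviation: dividing the sum of squared differences by N (the population standard deviation) or by N - 1 (the sample standard deviation). The two give noticeably different results for small data lists such as the one in my example. Judging by the output above, Statistics::Basic and the subroutine divide by N, while Statistics::Descriptive divides by N - 1.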
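To make the difference concrete, here is a minimal sketch in plain Perl that computes both variants for the same data. The helper std_dev and its $ddof argument are my own names for illustration and are not part of either module; I am only assuming that the modules differ in the divisor, which is what the numbers suggest.

```perl
use strict;
use warnings;

# Hypothetical helper, not taken from Statistics::Basic or
# Statistics::Descriptive. $ddof is subtracted from the divisor:
#   0 -> divide by N     (population standard deviation)
#   1 -> divide by N - 1 (sample standard deviation)
sub std_dev {
    my ( $ddof, @numbers ) = @_;
    my $n = scalar @numbers;

    # Avoid dividing by zero (or a negative number) on tiny or empty lists
    return undef if $n - $ddof <= 0;

    # Mean of the numbers
    my $total = 0;
    $total += $_ for @numbers;
    my $mean = $total / $n;

    # Sum of squared differences from the mean
    my $sum_sq = 0;
    $sum_sq += ( $_ - $mean )**2 for @numbers;

    return sqrt( $sum_sq / ( $n - $ddof ) );
}

my @d = ( 5, 10, 5, 100, 150 );
printf "Divide by N     (population): %.4f\n", std_dev( 0, @d );
printf "Divide by N - 1 (sample):     %.4f\n", std_dev( 1, @d );
```

For this data the sum of squared differences is 18070, so the first line prints 60.1166 (the square root of 18070 / 5) and the second prints 67.2124 (the square root of 18070 / 4), which lines up with the values reported by the subroutine and by Statistics::Descriptive respectively.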

Conclusion: the values for standard deviation calculated by Statistics::Basic and Statistics::Descriptive differ for small data sets, but this doesn't mean either value is wrong. Which statistics module you use for calculating standard deviation depends on which calculation method you, your colleagues and your boss agree on.

Question Self-Answered as of 4 Years Ago