Tuesday, May 1, 2012

Basic Statistics Terminology

Overview

Again, I apologise in advance to any statistics specialists. In the previous article we considered basic statistics terminology and covered the expected value, or mean, of a series of measured values. Here we extend this to the median, mode, variance and the useful standard deviation. With any luck, this article will give readers a better grasp of these terms.

Median:

In the previous article we ran through the 'mean' or 'expected value' of a series of values that we might measure. These were:

Values: 3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 15, 16

Ordinarily, measured values would arrive in a random order; however, the values above have been sorted from low to high.

The median is the value in the middle of the sorted sequence: 50 % of the values are higher and 50 % are lower.

In the example above there is no single 'middle' value since we have 12 values. So we pick the 2 central values, 9 and 10, and average them to obtain 9.5. This is then the median value.

Had we used the values: 3, 5, 5, 6, 7, 9, 9, 10, 11, 12, 12, 15, 16, the middle value (the 7th of a total of 13) would be 9, and this would be the median value.
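As a quick check of the two cases above, here is a minimal sketch using Python's standard statistics module. The variable names are only illustrative.

```python
from statistics import median

# The 12 measured values (even count, so no single middle value)
twelve_values = [3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 15, 16]
# The 13-value variant (odd count, so one middle value)
thirteen_values = [3, 5, 5, 6, 7, 9, 9, 10, 11, 12, 12, 15, 16]

# median() sorts the data, then averages the two central values
# for an even count or picks the middle value for an odd count
median_of_twelve = median(twelve_values)
median_of_thirteen = median(thirteen_values)

print(median_of_twelve)    # 9.5
print(median_of_thirteen)  # 9
```

Because median() sorts the data itself, the values do not need to be in order beforehand.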

When values vary a great deal, the median can be useful as a tool that smooths out the data. Noting the median values helps to follow movements in the data. Individual data values can then be viewed as deviations from the median, which gives an idea of whether the data is moving away from the trend.

If the median is exactly the same as the mean or expected value, the values are spread in a balanced way on either side. If the median differs from the mean, the spread is lopsided: typically a median greater than the mean points to a longer tail of low values (skewed to the left), and a median less than the mean points to a longer tail of high values (skewed to the right).

Mode:

As basic statistics terminology goes, this one is simple. If we take the measured values above once more and change one of them from 15 to 12 we have:

Values: 3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 12, 16

The mode is the value that occurs the most times. In the above case it is 12, which shows up 3 times.

There can also be two or more modes in a series of data values.
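A small sketch of both situations, assuming Python 3.8 or later (where statistics.multimode is available). The second list is the original set of measured values, in which 5 and 12 each appear twice.

```python
from statistics import multimode

# Values after changing 15 to 12: a single mode
one_mode_values = [3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 12, 16]
# The original measured values: 5 and 12 both appear twice
two_mode_values = [3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 15, 16]

# multimode() returns a list of every value tied for most frequent
single_mode = multimode(one_mode_values)
double_mode = multimode(two_mode_values)

print(single_mode)  # [12]
print(double_mode)  # [5, 12]
```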

Variance:

If you recall, the mean is also known as the 'expected value'. Each measured value will differ from this mean or expected value by a particular amount. The variance gives an idea of how 'spread out' the values are in relation to the expected value or mean.

The overall variance is the average of the squared deviations of the individual values from the mean.

For a single value, the squared deviation is the square of the difference between that value and the mean or expected value. For example:

Data values: 3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 15, 16

Mean or expected value: (3 + 5 + 5 + 6 + 7 + 9 + 10 + 11 + 12 + 12 + 15 + 16)/12 = 111/12 = 9.25

If we look at the 4th value of 6 the squared deviation will be:

Squared deviation = (6 - 9.25) x (6 - 9.25) = (-3.25) x (-3.25) = 10.56

We could work this out for all of the values, total these up and then divide by the number of values, 12, to obtain the variance: 188.25/12 = 15.69.
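The steps above can be sketched in a few lines of Python. This treats the 12 values as the complete set of data (dividing by 12, not 11), which is what statistics.pvariance also does. The variable names are only illustrative.

```python
from statistics import mean, pvariance

values = [3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 15, 16]

mu = mean(values)  # 111 / 12 = 9.25

# Square of the difference between each value and the mean
squared_deviations = [(x - mu) ** 2 for x in values]

# Average of the squared deviations over all 12 values
variance = sum(squared_deviations) / len(values)

print(squared_deviations[3])  # 10.5625, the 4th value (6)
print(variance)               # 15.6875
print(pvariance(values))      # library result for comparison
```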

We can apply this concept to the simplified project activity delay from the last article:

Delay (weeks)    Probability    Contribution
6                0.3            6 x 0.3 = 1.8
16               0.5            16 x 0.5 = 8.0
20               0.2            20 x 0.2 = 4.0

The expected value = 1.8 + 8.0 + 4.0 = 13.8 weeks
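The table above is just a probability-weighted sum, which can be sketched as follows. The list names are illustrative, and the result is rounded for display because decimal fractions such as 0.3 are not stored exactly.

```python
delays_weeks = [6, 16, 20]
probabilities = [0.3, 0.5, 0.2]  # should add up to 1

# Each delay contributes delay x probability to the expected value
contributions = [d * p for d, p in zip(delays_weeks, probabilities)]
expected_delay = sum(contributions)

print([round(c, 2) for c in contributions])  # [1.8, 8.0, 4.0]
print(round(expected_delay, 2))              # 13.8
```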

Because each delay is already weighted by its probability, and the probabilities add up to 1, the overall variance is simply the total of the probability-weighted squared deviations. There is no further division by 3, the number of values.

Total variance = (6 - 13.8) x (6 - 13.8) x 0.3 + (16 - 13.8) x (16 - 13.8) x 0.5 + (20 - 13.8) x (20 - 13.8) x 0.2

= (-7.8) x (-7.8) x 0.3 + (2.2) x (2.2) x 0.5 + (6.2) x (6.2) x 0.2

= (60.84 x 0.3) + (4.84 x 0.5) + (38.44 x 0.2)

= 18.25 + 2.42 + 7.69

= 28.36

This gives an idea of the distribution of values in relation to the 'expected value'.

Note that in the above illustration the 'values' were arrived at by a professional's evaluation based on assumptions, so these are not 'measured data values'. To obtain measured values we would have had to actually carry out the activity three times in exactly the same way and find that, on those three separate occasions, the delays were 6, 16 and 20 weeks. This would not occur in the real world.

Standard deviation:

It is the square root of the variance. For the earlier example we obtain:

Standard deviation = √28.36 = 5.33

It is a very helpful value.

When data values follow a roughly normal (bell-shaped) distribution, about 68 % of the values will fall within 1 standard deviation of the mean or expected value.

For the illustration above:

Expected value or mean = 13.8

Variance = 28.36

Standard deviation = 5.33

About 68 % of values will occur between (13.8 - 5.33) and (13.8 + 5.33) = 8.47 to 19.13

In a similar way, about 95 % of the data values will fall within 2 standard deviations and about 99.7 % will fall within 3 standard deviations. And so, we would have:

2 standard deviations = 10.65

3 standard deviations = 15.98

About 95 % of values will fall between (13.8 - 10.65) and (13.8 + 10.65) = 3.15 to 24.45

About 99.7 % of values will fall between (13.8 - 15.98) and (13.8 + 15.98) = -2.18 to 29.78

A delay cannot be negative, so the lower end of the last range is an artefact of the bell-curve approximation; with only three estimated outcomes these ranges are a rough guide only.
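To see how the 1, 2 and 3 standard deviation ranges behave on real numbers, here is a minimal sketch applied to the twelve measured values from the Variance section (not the delay estimates). The helper names k_sigma_range and fraction_within are made up for this illustration. The 68 / 95 / 99.7 % figures assume roughly bell-shaped data, so a sample this small will only match them loosely.

```python
from statistics import mean, pstdev

values = [3, 5, 5, 6, 7, 9, 10, 11, 12, 12, 15, 16]

mu = mean(values)       # 9.25
sigma = pstdev(values)  # square root of the variance, about 3.96


def k_sigma_range(mu, sigma, k):
    """Return (low, high) for k standard deviations around the mean."""
    return mu - k * sigma, mu + k * sigma


def fraction_within(data, low, high):
    """Return the share of data points lying between low and high."""
    return sum(low <= x <= high for x in data) / len(data)


for k in (1, 2, 3):
    low, high = k_sigma_range(mu, sigma, k)
    share = fraction_within(values, low, high)
    print(f"{k} SD: {low:.2f} to {high:.2f} holds {share:.0%} of values")
```

Here 7 of the 12 values (about 58 %) sit within 1 standard deviation and all 12 sit within 2, which is in the neighbourhood of the rule of thumb without matching it exactly.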

Hopefully, this article has provided a small insight into a handful of statistical terms.


----------------------------------------------------
We supply plain jargon free instruction covering a wide field for business and personal use. If you would like more help and advice don't ignore Project management at http://www.project-management-basics.com, or visit our product store at http://www.marchltd.co.uk

