Definition:
67,67,67,67,67,67,67,67
43,43,50,55,66,90,91,97
Types of Measures of Dispersion:
Measures of dispersion are of two types:
(i) Measures of Absolute Dispersion, and
(ii) Measures of Relative Dispersion.
(i) Measures of Absolute Dispersion: The actual variation or dispersion determined by the Measures of Absolute Dispersion is called ‘absolute dispersion’.
(ii) Measures of Relative Dispersion: The measures of absolute dispersion cannot be used to compare the variation of two or more series. For example, the SD of the heights of students (in inches) cannot be compared with the SD of their weights (in pounds). Even when the units are identical, the comparison can mislead. Take the heights of men (in inches) and the lengths of their noses (in inches): if the SD of the heights is greater than the SD of the nose lengths, it does not mean that the degree of variability is greater in the case of heights, because heights are also much larger on average.
To compare the variation of two or more series, we need a measure of relative dispersion. It is defined as:
Relative dispersion = Absolute dispersion / Average
Types of Measures of Absolute Dispersion:
(a) The Range,
(b) The Quartile Deviation,
(c) The Mean Deviation, and
(d) The Standard Deviation.
(a) The Range:
1. The range is the simplest measure of dispersion. It is defined as the difference between the largest value and the smallest value in the data:
$R = X_m - X_n$, where $X_m$ is the largest value and $X_n$ is the smallest value.
2. For grouped data, the range is defined as the difference between the upper class boundary (UCB) of the highest class and the lower class boundary (LCB) of the lowest class.
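Both definitions of the range can be sketched in a few lines of Python. The ungrouped series is the second one listed under the Definition heading, and the class boundaries are those of the worked example in this section; the function names are illustrative assumptions, not part of the source.

```python
def data_range(values):
    """Range of ungrouped data: largest value minus smallest value."""
    return max(values) - min(values)


def grouped_range(class_boundaries):
    """Range of grouped data: UCB of the highest class minus LCB of the lowest.

    class_boundaries is a list of (lower, upper) pairs, in any order.
    """
    lowest_lcb = min(lower for lower, _ in class_boundaries)
    highest_ucb = max(upper for _, upper in class_boundaries)
    return highest_ucb - lowest_lcb


series = [43, 43, 50, 55, 66, 90, 91, 97]
# Ten classes of width 10: 9.5-19.5, 19.5-29.5, ..., 99.5-109.5
classes = [(9.5 + 10 * i, 19.5 + 10 * i) for i in range(10)]

print(data_range(series))      # 54
print(grouped_range(classes))  # 100.0
```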
(b) Quartile Deviation (QD):
1. It is also known as the Semi-Interquartile Range. The range is a poor measure of dispersion where extremely large values are present. The quartile deviation is defined as half of the difference between the third and the first quartiles:
$QD = \frac{Q_3 - Q_1}{2}$
2. The difference between the third and first quartiles is called the ‘Interquartile Range’:
$IQR = Q_3 - Q_1$
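As a rough illustration for ungrouped data, the sketch below computes the quartile deviation with Python's standard `statistics.quantiles`. Quartile conventions differ between textbooks, so the `method="inclusive"` choice is an assumption and other conventions may give slightly different quartiles.

```python
from statistics import quantiles


def quartile_deviation(values):
    """Half the distance between the third and first quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / 2


series = [43, 43, 50, 55, 66, 90, 91, 97]
q1, _, q3 = quantiles(series, n=4, method="inclusive")

print(q1, q3)                      # 48.25 90.25
print(q3 - q1)                     # interquartile range: 42.0
print(quartile_deviation(series))  # 21.0
```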
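(c) Mean Deviation (MD):
1. The MD is defined as the average of the absolute deviations of the values from an average:
$MD = \frac{\sum |x - \bar{x}|}{n}$
It is also known as the Mean Absolute Deviation.
2. MD from the median is expressed as follows:
$MD = \frac{\sum |x - \tilde{x}|}{n}$, where $\tilde{x}$ is the median.
3. For grouped data:
$MD = \frac{\sum f|x - \bar{x}|}{\sum f}$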
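A minimal Python sketch of the mean deviation about the mean, about the median, and for grouped data. The function names are illustrative assumptions; the grouped call uses the class marks and frequencies of the worked example in this section.

```python
from statistics import mean, median


def mean_deviation(values, center=None):
    """Average absolute deviation from `center` (the arithmetic mean by default)."""
    if center is None:
        center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)


def grouped_mean_deviation(class_marks, frequencies):
    """Mean deviation from the mean for a frequency distribution."""
    n = sum(frequencies)
    x_bar = sum(f * x for x, f in zip(class_marks, frequencies)) / n
    return sum(f * abs(x - x_bar) for x, f in zip(class_marks, frequencies)) / n


series = [43, 43, 50, 55, 66, 90, 91, 97]
print(mean_deviation(series))                  # 19.34375 (about the mean)
print(mean_deviation(series, median(series)))  # 19.125   (about the median)

marks = [14.5 + 10 * i for i in range(10)]
freq = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]
print(round(grouped_mean_deviation(marks, freq), 2))  # 15.73
```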
(d) Standard Deviation (SD):
1. The SD is defined as the positive square root of the mean of the squared deviations of the values from their mean.
2. Thus, the SD of a population of N values, $x_1, x_2, \ldots, x_N$, is expressed as follows:
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$ (Population Standard Deviation)
3. In the case of a frequency distribution with $x_1, x_2, \ldots, x_k$ as class marks and $f_1, f_2, \ldots, f_k$ as the corresponding class frequencies, the SD is expressed as follows:
$\sigma = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{\sum_{i=1}^{k} f_i}}$
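The two definitions translate directly into Python. This is a sketch with illustrative function names; `statistics.pstdev` from the standard library is used only as a cross-check for the ungrouped case.

```python
from math import sqrt
from statistics import pstdev


def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)


def grouped_sd(class_marks, frequencies):
    """Population SD of a frequency distribution, using class marks."""
    total = sum(frequencies)
    mu = sum(f * x for x, f in zip(class_marks, frequencies)) / total
    return sqrt(sum(f * (x - mu) ** 2 for x, f in zip(class_marks, frequencies)) / total)


series = [43, 43, 50, 55, 66, 90, 91, 97]
print(population_sd(series))
print(pstdev(series))  # the standard library gives the same value

# With every frequency equal to 1, the grouped formula reduces to the ungrouped one.
print(grouped_sd(series, [1] * len(series)))
```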
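Alternate Method for Computing Standard Deviation:
$\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$ for ungrouped data (population SD)
$\sigma = h \sqrt{\frac{\sum f u^2}{\sum f} - \left(\frac{\sum f u}{\sum f}\right)^2}$ for grouped data
Where $u = \frac{x - A}{h}$, A is an arbitrary origin (usually a class mark near the middle of the distribution), and h is the common class width.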
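The sketch below checks that a coded shortcut gives the same SD as the direct definition. It assumes the coded variable is u = (x − A)/h, with A an arbitrary origin and h the class width, and uses the frequency table of the worked example in this section with A = 54.5 and h = 10 (both choices are assumptions made for illustration).

```python
from math import sqrt

class_marks = [14.5, 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5, 94.5, 104.5]
frequencies = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]


def sd_direct(xs, fs):
    """SD from the definition: root mean squared deviation from the mean."""
    n = sum(fs)
    mu = sum(f * x for x, f in zip(xs, fs)) / n
    return sqrt(sum(f * (x - mu) ** 2 for x, f in zip(xs, fs)) / n)


def sd_coded(xs, fs, origin, width):
    """SD from coded values u = (x - origin) / width, rescaled by the width."""
    n = sum(fs)
    us = [(x - origin) / width for x in xs]
    mean_u = sum(f * u for u, f in zip(us, fs)) / n
    mean_u2 = sum(f * u * u for u, f in zip(us, fs)) / n
    return width * sqrt(mean_u2 - mean_u ** 2)


print(round(sd_direct(class_marks, frequencies), 2))                         # 20.09
print(round(sd_coded(class_marks, frequencies, origin=54.5, width=10), 2))   # 20.09
```

The coded values here are the small integers −4, …, 5, which is what makes the shortcut convenient for hand calculation.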
The Variance:
The variance is defined as the square of the SD, i.e., the mean of the squared deviations from the mean:
$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$ for ungrouped data (population variance)
$\sigma^2 = \frac{\sum f (x - \mu)^2}{\sum f}$ for grouped data
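Sample Variance and Standard Deviation:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$; $s = \sqrt{s^2}$
Alternate Method:
$s^2 = \frac{1}{n - 1}\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]$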
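A short sketch comparing the sample variance with the population variance. It assumes the usual n − 1 divisor for a sample (some textbooks divide by n instead); the data are Player 1's goals from the soccer example later in this section, and the function names are illustrative.

```python
from statistics import pvariance, variance


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Same quantity from the raw sums: (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


goals = [2, 2, 4, 3, 2, 4, 2, 3]

print(sample_variance(goals))           # 5.5 / 7
print(sample_variance_shortcut(goals))  # same value from the shortcut
print(variance(goals))                  # standard library, divisor n - 1
print(pvariance(goals))                 # population variance, divisor n: 5.5 / 8
```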
Properties of SD and Variance:
For any constant a:
SD(x + a) = SD(x); var(x + a) = var(x)
SD(x – a) = SD(x); var(x – a) = var(x)
SD(ax) = |a| × SD(x); var(ax) = a^2 × var(x)
SD(x/a) = (1/|a|) × SD(x); var(x/a) = (1/a^2) × var(x), for a ≠ 0
If x and y are independent:
Var(x + y) = Var(x) + Var(y)
Var(x – y) = Var(x) + Var(y)
The sum of squared deviations $\sum (x - a)^2$ is a minimum when $a = \bar{x}$.
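The shift and scale rules, and the fact that the sum of squared deviations is smallest about the mean, can be checked numerically. The sketch below uses a negative constant (chosen here for illustration) to show that the SD scales by the absolute value of the constant and the variance by its square.

```python
from math import isclose
from statistics import pstdev, pvariance

x = [43, 43, 50, 55, 66, 90, 91, 97]
a = -3  # illustrative constant

shifted = [v + a for v in x]
scaled = [a * v for v in x]

# Adding a constant leaves the spread unchanged.
print(isclose(pstdev(shifted), pstdev(x)))                # True
# Multiplying by a constant scales the SD by |a| and the variance by a^2.
print(isclose(pstdev(scaled), abs(a) * pstdev(x)))        # True
print(isclose(pvariance(scaled), a ** 2 * pvariance(x)))  # True


def sum_of_squares(values, center):
    """Sum of squared deviations of the values from `center`."""
    return sum((v - center) ** 2 for v in values)


x_bar = sum(x) / len(x)
# Moving the centre away from the mean in either direction increases the sum.
offsets = (-5, -1, -0.1, 0.1, 1, 5)
print(all(sum_of_squares(x, x_bar) < sum_of_squares(x, x_bar + d) for d in offsets))  # True
```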
For a normal distribution:
(i) the interval $\bar{x} - s$ to $\bar{x} + s$ includes 68.27% of the values,
(ii) the interval $\bar{x} - 2s$ to $\bar{x} + 2s$ includes 95.45% of the values, and
(iii) the interval $\bar{x} - 3s$ to $\bar{x} + 3s$ includes 99.73% of the values.
The above results also hold approximately for moderately skewed distributions.
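These three percentages follow from the normal cumulative distribution function. A minimal check with `statistics.NormalDist` from the Python standard library (the helper name is an illustrative assumption):

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1


def share_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.2%}")
# within 1 SD: 68.27%
# within 2 SD: 95.45%
# within 3 SD: 99.73%
```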
Characteristics of Measures of Dispersion:
(a) Range:
1. The range is simple to understand and easy to calculate because its value is determined by the two extreme items.
2. It is useful as a rough measure of variation.
3. Its value may change greatly if an extreme value (either lowest or highest) is withdrawn or a fresh value is added. It is a highly unstable measure of variation.
4. It gives no indication of how the values within the two extremes are distributed.
(b) Quartile Deviation:
1. The QD is simple to understand and easy to calculate.
2. As a rough measure of variation, it is superior to the range because it is not affected by extreme values.
3. It is not capable of algebraic manipulation.
4. It is mainly used in situations where extreme values are thought to be unrepresentative.
(c) Mean Deviation:
1. The MD is simple to understand and to interpret.
2. It is affected by the value of every observation.
3. It is less affected by extreme deviations than the standard deviation, because the deviations are not squared.
4. It is not suited to further mathematical treatment. It is, therefore, not as logical or as convenient a measure of dispersion as the SD.
(d) Standard Deviation:
1. The SD is affected by the value of every observation.
2. The process of squaring the deviations before adding avoids the algebraic fallacy of disregarding signs.
3. In general, it is less affected by fluctuations of sampling than the other measures of dispersion.
4. It has a definite mathematical meaning and is perfectly adaptable to algebraic treatment.
5. It has great practical utility in sampling and statistical inference.
6. The SD is the best general-purpose measure of dispersion and should be employed in all cases where a high degree of accuracy is required.
Example:
Class Boundaries    Frequency
9.5-19.5            5
19.5-29.5           8
29.5-39.5           13
39.5-49.5           19
49.5-59.5           23
59.5-69.5           15
69.5-79.5           7
79.5-89.5           5
89.5-99.5           3
99.5-109.5          2
Total               100
Calculate:
(a) Range
(b) Quartile deviation
(c) Mean deviation from mean
(d) Standard deviation
(e) Variance
Solution:
Mean: $\bar{x} = \frac{\sum fx}{\sum f} = \frac{5220}{100} = 52.2$. In the table, CB is the class boundaries, CF is the cumulative frequency, x is the class mark, and d = x – 52.2.
CB           f    CF   x      fx      d      |d|   f|d|   d^2      fd^2
9.5-19.5     5    5    14.5   72.5    -37.7  37.7  188.5  1421.29  7106.45
19.5-29.5    8    13   24.5   196     -27.7  27.7  221.6  767.29   6138.32
29.5-39.5    13   26   34.5   448.5   -17.7  17.7  230.1  313.29   4072.77
39.5-49.5    19   45   44.5   845.5   -7.7   7.7   146.3  59.29    1126.51
49.5-59.5    23   68   54.5   1253.5  2.3    2.3   52.9   5.29     121.67
59.5-69.5    15   83   64.5   967.5   12.3   12.3  184.5  151.29   2269.35
69.5-79.5    7    90   74.5   521.5   22.3   22.3  156.1  497.29   3481.03
79.5-89.5    5    95   84.5   422.5   32.3   32.3  161.5  1043.29  5216.45
89.5-99.5    3    98   94.5   283.5   42.3   42.3  126.9  1789.29  5367.87
99.5-109.5   2    100  104.5  209     52.3   52.3  104.6  2735.29  5470.58
Total        100              5220                 1573            40371
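(a) Range:
Range = UCB of highest class – LCB of lowest class = 109.5 – 9.5 = 100
(b) Quartile Deviation:
The quartiles are found by interpolation within the quartile class, where l is the lower class boundary of that class, h its width, f its frequency, and C the cumulative frequency of the preceding class:
$Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - C\right) = 29.5 + \frac{10}{13}(25 - 13) = 38.73$
$Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - C\right) = 59.5 + \frac{10}{15}(75 - 68) = 64.17$
$QD = \frac{Q_3 - Q_1}{2} = \frac{64.17 - 38.73}{2} = 12.72$
(c) Mean Deviation from Mean:
$MD = \frac{\sum f|x - \bar{x}|}{\sum f} = \frac{1573}{100} = 15.73$
(d) Standard Deviation:
$SD = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}} = \sqrt{\frac{40371}{100}} = \sqrt{403.71} = 20.09$
(e) Variance:
$Variance = \frac{\sum f(x - \bar{x})^2}{\sum f} = \frac{40371}{100} = 403.71$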
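The five measures for this frequency table can be reproduced with a short Python sketch. It assumes the grouped formulas with divisor Σf and quartiles obtained by linear interpolation within the quartile class; the helper name `grouped_quantile` is an illustrative assumption.

```python
from math import sqrt

width = 10
lower = [9.5 + width * i for i in range(10)]   # lower class boundaries
freq = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]
marks = [lb + width / 2 for lb in lower]       # class marks 14.5, ..., 104.5
n = sum(freq)


def grouped_quantile(p):
    """Value below which a fraction p of the observations lies,
    by linear interpolation inside the class that contains it."""
    target = p * n
    cumulative = 0
    for lb, f in zip(lower, freq):
        if cumulative + f >= target:
            return lb + width * (target - cumulative) / f
        cumulative += f
    raise ValueError("p must be between 0 and 1")


mean = sum(f * x for x, f in zip(marks, freq)) / n
value_range = (lower[-1] + width) - lower[0]
q1, q3 = grouped_quantile(0.25), grouped_quantile(0.75)
qd = (q3 - q1) / 2
md = sum(f * abs(x - mean) for x, f in zip(marks, freq)) / n
var = sum(f * (x - mean) ** 2 for x, f in zip(marks, freq)) / n
sd = sqrt(var)

print(value_range)    # 100.0
print(round(qd, 2))   # 12.72
print(round(md, 2))   # 15.73
print(round(sd, 2))   # 20.09
print(round(var, 2))  # 403.71
```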
Types of Measures of Relative Dispersion:
(a) Coefficient of Variation,
(b) Coefficient of Dispersion,
(c) Quartile Coefficient of Dispersion, and
(d) Mean Coefficient of Dispersion.
(a) Coefficient of Variation (CV):
1. The coefficient of variation was introduced by Karl Pearson. The CV expresses the SD as a percentage of the arithmetic mean (AM):
$CV = \frac{s}{\bar{x}} \times 100$ for sample data
$CV = \frac{\sigma}{\mu} \times 100$ for population data
2. It is frequently used in comparing the dispersion of two or more series. It is also used as a criterion of consistent performance: the smaller the CV, the more consistent the performance.
3. The disadvantage of the CV is that it fails to be useful when $\bar{x}$ is close to zero.
4. It is sometimes also referred to as the ‘coefficient of standard deviation’.
5. It is used to determine the stability or consistency of a data set.
6. The higher the CV, the higher the instability or variability in the data, and vice versa.
(b) Coefficient of Dispersion (CD):
If $X_m$ and $X_n$ are respectively the maximum and the minimum values in a set of data, then the coefficient of dispersion is defined as:
$CD = \frac{X_m - X_n}{X_m + X_n}$
(c) Coefficient of Quartile Deviation (CQD):
1. If $Q_1$ and $Q_3$ are given for a set of data, then $(Q_1 + Q_3)/2$ is a measure of central tendency or average of the data. Then the measure of relative dispersion for the quartile deviation is expressed as follows:
$CQD = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$
2. CQD may also be expressed as a percentage.
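(d) Mean Coefficient of Dispersion (CMD):
The relative measure for the mean deviation is the ‘mean coefficient of dispersion’ or ‘coefficient of mean deviation’:
$CMD = \frac{MD \text{ from mean}}{\text{Mean}}$ for arithmetic mean
$CMD = \frac{MD \text{ from median}}{\text{Median}}$ for median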
Example:
(Take the previous example)
Calculate:
(a) Coefficient of Variation,
(b) Coefficient of Dispersion,
(c) Quartile Coefficient of Dispersion, and
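(d) Mean Coefficient of Dispersion
Solution:
From the frequency table of the previous example: $\bar{x} = 5220/100 = 52.2$, $SD = \sqrt{40371/100} = 20.09$, $MD = 1573/100 = 15.73$, $Q_1 = 38.73$, $Q_3 = 64.17$, largest value (UCB of the highest class) = 109.5, and smallest value (LCB of the lowest class) = 9.5.
(a) Coefficient of Variation:
$CV = \frac{SD}{\bar{x}} \times 100 = \frac{20.09}{52.2} \times 100 = 38.49\%$
(b) Coefficient of Dispersion:
$CD = \frac{109.5 - 9.5}{109.5 + 9.5} = \frac{100}{119} = 0.84$
(c) Quartile Coefficient of Dispersion:
$CQD = \frac{64.17 - 38.73}{64.17 + 38.73} = \frac{25.44}{102.90} = 0.247$
(d) Mean Coefficient of Dispersion:
$CMD = \frac{MD}{\bar{x}} = \frac{15.73}{52.2} = 0.301$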
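The four relative measures for the same frequency table can be cross-checked with a few lines of Python. This is a sketch: the quartiles are interpolated within the classes 29.5-39.5 and 59.5-69.5, and the class boundaries 9.5 and 109.5 are taken as the extreme values, as in the grouped definition of the range.

```python
from math import sqrt

marks = [14.5 + 10 * i for i in range(10)]  # class marks
freq = [5, 8, 13, 19, 23, 15, 7, 5, 3, 2]
n = sum(freq)

mean = sum(f * x for x, f in zip(marks, freq)) / n
sd = sqrt(sum(f * (x - mean) ** 2 for x, f in zip(marks, freq)) / n)
md = sum(f * abs(x - mean) for x, f in zip(marks, freq)) / n

# Quartiles: lower boundary + width * (position - preceding CF) / class frequency
q1 = 29.5 + 10 * (25 - 13) / 13
q3 = 59.5 + 10 * (75 - 68) / 15
x_max, x_min = 109.5, 9.5

cv = sd / mean * 100
cd = (x_max - x_min) / (x_max + x_min)
cqd = (q3 - q1) / (q3 + q1)
cmd = md / mean

print(f"CV  = {cv:.2f}%")  # CV  = 38.49%
print(f"CD  = {cd:.4f}")   # CD  = 0.8403
print(f"CQD = {cqd:.4f}")  # CQD = 0.2472
print(f"CMD = {cmd:.4f}")  # CMD = 0.3013
```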
Example:
During a soccer tournament, two players score the following series of goals:
Player 1    2  2  4  3  2  4  2  3
Player 2    1  2  5  5  5  2  1  1
Who is the more consistent player?
Solution:
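Let x and y be the goals of Player 1 and Player 2, with dx = x – 2.75 and dy = y – 2.75 the deviations from the means. The last row gives the column totals.
x    y    dx      dx^2     dy      dy^2
2    1    -0.75   0.5625   -1.75   3.0625
2    2    -0.75   0.5625   -0.75   0.5625
4    5    1.25    1.5625   2.25    5.0625
3    5    0.25    0.0625   2.25    5.0625
2    5    -0.75   0.5625   2.25    5.0625
4    2    1.25    1.5625   -0.75   0.5625
2    1    -0.75   0.5625   -1.75   3.0625
3    1    0.25    0.0625   -1.75   3.0625
22   22           5.5              25.5
$\bar{x} = \frac{22}{8} = 2.75$; $\bar{y} = \frac{22}{8} = 2.75$
$s_x = \sqrt{\frac{5.5}{8 - 1}} = 0.886$; $s_y = \sqrt{\frac{25.5}{8 - 1}} = 1.909$
$CV_x = \frac{0.886}{2.75} \times 100 = 32.2\%$; $CV_y = \frac{1.909}{2.75} \times 100 = 69.4\%$
Here each series is treated as a sample (divisor n – 1). With divisor n the two values are 30.2% and 64.9%, so the comparison is unchanged.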
Conclusion:
The higher the CV, the higher the instability, and vice versa. From the above calculations, it is evident that Player 1 is more consistent than Player 2.
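The comparison can be reproduced with the Python standard library. The sketch below assumes the sample SD (divisor n − 1) by default and lets the population SD be selected instead; the function name `cv` and its `sample` flag are illustrative assumptions.

```python
from statistics import mean, pstdev, stdev


def cv(values, sample=True):
    """Coefficient of variation in percent: SD as a percentage of the mean."""
    sd = stdev(values) if sample else pstdev(values)
    return sd / mean(values) * 100


player1 = [2, 2, 4, 3, 2, 4, 2, 3]
player2 = [1, 2, 5, 5, 5, 2, 1, 1]

print(round(cv(player1), 1), round(cv(player2), 1))  # 32.2 69.4
print(round(cv(player1, sample=False), 1),
      round(cv(player2, sample=False), 1))           # 30.2 64.9
```

Either way, Player 1 has the smaller CV and is therefore the more consistent scorer.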
Standard Scores or Z-Scores:
Raw data can be converted into a special type of values by subtracting the mean from each value and then dividing by the SD of the data. These values are called ‘standard scores’, ‘z-scores’, or ‘values in SD units’:
$z = \frac{x - \bar{x}}{s}$ for sample data
$z = \frac{x - \mu}{\sigma}$ for population data
Properties of Z-Scores:
1. Z-scores are free of units.
2. The mean of z-scores is always zero.
3. The SD of z-scores is always one.
4. The distribution of z-scores has exactly the same shape as the distribution of the original data.
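Properties 2 and 3 can be verified numerically. The sketch below standardizes the second series listed under the Definition heading using the population SD; the function name is an illustrative assumption, and the same check works with the sample SD provided the same divisor is used throughout.

```python
from math import isclose
from statistics import mean, pstdev


def z_scores(values):
    """Subtract the mean from each value and divide by the (population) SD."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]


series = [43, 43, 50, 55, 66, 90, 91, 97]
z = z_scores(series)

print(isclose(mean(z), 0, abs_tol=1e-12))  # True: the mean of z-scores is zero
print(isclose(pstdev(z), 1))               # True: the SD of z-scores is one
```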
Example:
A student gets 82 marks in a final examination in Accounting, on which the mean is 75 marks with a standard deviation of 10 marks. In Economics, he gets 86 marks in the final examination, on which the mean is 80 marks with an SD of 14 marks. Is his relative standing better in Accounting or Economics?
Solution:
            Accounting              Economics
Mean        75                      80
S           10                      14
x           82                      86
z           (82 – 75)/10 = 0.70     (86 – 80)/14 = 0.43
Conclusion:
His marks in Accounting are 0.7 SD above the mean, while in Economics his marks are 0.43 SD above the mean. Therefore, his relative standing is higher in Accounting than in Economics.
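The same comparison in Python, as a minimal sketch (the function name `z_score` is an illustrative assumption):

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


accounting = z_score(82, mean=75, sd=10)
economics = z_score(86, mean=80, sd=14)

print(round(accounting, 2), round(economics, 2))  # 0.7 0.43
print("Accounting" if accounting > economics else "Economics")  # Accounting
```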
Chebyshev’s Theorem:
1. The Russian mathematician P.L. Chebyshev devised a rule, called ‘Chebyshev’s Theorem’, to determine the minimum proportion of values in intervals that are equidistant from the mean.
2. The theorem states that for any data at least $1 - \frac{1}{k^2}$ of the values must lie within k standard deviations on either side of the mean, where k is any constant greater than 1.
3. In other words, the interval $\mu \pm k\sigma$ will contain at least $\left(1 - \frac{1}{k^2}\right) \times 100\%$ of the values.
For example:
$\mu \pm 2\sigma$ will contain at least 75% of the values (k = 2)
$\mu \pm 3\sigma$ will contain at least 88.89% of the values (k = 3)
$\mu \pm 2.4\sigma$ will contain at least 82.64% of the values (k = 2.4)
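The bound is a one-line function. The sketch below reproduces the three percentages and rejects k ≤ 1, for which the theorem gives no useful bound; the function name is an illustrative assumption.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k SDs of the mean (requires k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is informative only for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 2.4):
    print(k, f"{chebyshev_min_proportion(k):.2%}")
# 2 75.00%
# 3 88.89%
# 2.4 82.64%
```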
Limitations of Chebyshev’s Theorem:
1. Proportions of values are given only for intervals which are equidistant from the mean; that is, the mean should always be the midpoint of the interval.
2. A minimum proportion is specified rather than the exact or approximate value of the proportion.
3. Proportions for values of k less than or equal to one cannot be determined.
Example:
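Two populations have the same mean $\mu$. Their SDs are $\sigma_1$ and $\sigma_2$. Find the percentages of the values that must lie between 125 and 155.
Solution:
Chebyshev’s Theorem applies only to intervals centred on the mean, so $\mu = \frac{125 + 155}{2} = 140$ and each limit lies 15 units from the mean. For each population (i = 1, 2):
$\mu + k_i \sigma_i = 155$ and $\mu - k_i \sigma_i = 125$, so $k_i = \frac{155 - 140}{\sigma_i} = \frac{15}{\sigma_i}$
Therefore, provided $k_i > 1$, 125 to 155 will contain at least $\left(1 - \frac{1}{k_i^2}\right) \times 100\%$ of the values of population i.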
Normal Distribution:
Interval             Percentage of Values
$\mu \pm \sigma$     68%
$\mu \pm 2\sigma$    95%
$\mu \pm 3\sigma$    99.7%
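Linear Transformation of a Variable:
If $y = a + bx$, where a and b are constants, then
$\bar{y} = a + b\bar{x}$ and $S_y = |b| S_x$
Since $Var(y) = b^2 Var(x)$, or $S_y^2 = b^2 S_x^2$.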
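A short derivation of how the mean and SD behave under a linear transformation, assuming the transformation has the form $y = a + bx$ with constants a and b (the symbols are assumptions made for this sketch). The divisor n is used here; the result is the same with n − 1.

```latex
\begin{aligned}
\bar{y} &= \frac{1}{n}\sum_{i=1}^{n} (a + b x_i)
         = a + b \cdot \frac{1}{n}\sum_{i=1}^{n} x_i
         = a + b\bar{x} \\
S_y^2   &= \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2
         = \frac{1}{n}\sum_{i=1}^{n} \bigl(b(x_i - \bar{x})\bigr)^2
         = b^2 S_x^2 \\
S_y     &= |b|\, S_x
\end{aligned}
```

The constant a cancels in every deviation $y_i - \bar{y}$, which is why adding a constant never changes the SD, while the factor b survives and is squared in the variance.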
Example:
Given:
.
Determine the mean and standard
deviation of the following transformations of x:
(i)
(ii)
Solution:
(i)
:
Rules:
SD(x + a) = SD(x)
SD(ax) = |a| × SD(x)
(ii)
:
Rules:
SD(x + a) = SD(x)
SD(ax) = |a| × SD(x)