Application of Measures of Central Tendency and Variability.
- Mode, Median and Mean.
- Range, Quartile and Standard Deviation.
- Application of measures of variability.
Measures of central tendency are values used to describe the middle of a distribution. This section discusses how each measure is applied and when it is most appropriate to use it. The measures are:
Mode: the mode is the value with the highest frequency in a distribution. In the distribution 2, 4, 3, 4, 2, 2, 3, 2, 5, 6, the mode is 2 because it appears most often (four times). A distribution with two modes is called bimodal, and one with more than two modes is called multi-modal.
The mode is the most reliable measure of central tendency when you are dealing with nominal or discrete variables.
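The mode of the example distribution can be checked with a short Python sketch. It uses only the standard library; the second list is a hypothetical bimodal distribution added for illustration and is not taken from the text.

```python
from collections import Counter
from statistics import multimode

scores = [2, 4, 3, 4, 2, 2, 3, 2, 5, 6]

# Count how often each value occurs; 2 occurs four times.
frequencies = Counter(scores)

# multimode returns every value that shares the highest frequency.
modes = multimode(scores)
print(frequencies.most_common(1))  # [(2, 4)]
print(modes)                       # [2]

# Hypothetical bimodal distribution: 1 and 3 both occur twice.
bimodal_modes = multimode([1, 1, 2, 3, 3])
print(bimodal_modes)               # [1, 3]
```

A result list with more than one entry signals a bimodal or multi-modal distribution.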
Advantages of mode
- The mode is most appropriate when dealing with discrete variables
- It is very easy to compute
- It can be identified in a distribution through inspection
- The mode can also be identified through graphical representation.
Disadvantages of mode
- It is of little use with ungrouped continuous data, where values rarely repeat
- When a distribution is bimodal or multi-modal, the mode is inadequate as a single summary.
- It is the least reliable measure of central tendency when precision and accuracy are required.
Note: the most appropriate time to use the mode is when you are dealing with discrete data.
Median: the median is the middle number in a distribution after all the numbers have been arranged in ascending or descending order. E.g. the median of the distribution 30, 30, 40, 70, 80, 85, 90, 95, 100 is 80.
Note: the median is most appropriate in the presence of outliers. Outliers are extreme values that lie far away from the other values in a distribution. For example, in 2, 3, 4, 5, 6, 70, the value 70 is an outlier because it is far from the other values.
So the median is the appropriate measure of central tendency to use here. Because this distribution has an even number of values, there is no single middle number; 4 and 5 are the two middle numbers, so their arithmetic average is taken.
That is, the sum of 4 and 5 divided by 2: (4 + 5)/2 = 4.5. So the median is 4.5.
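Both median examples can be reproduced with Python's standard library. This is a minimal sketch; the variable names are chosen here for illustration.

```python
from statistics import median

# Odd number of values: the middle (5th of 9) value is the median.
odd_scores = [30, 30, 40, 70, 80, 85, 90, 95, 100]

# Even number of values: the two middle values (4 and 5) are averaged.
even_scores = [2, 3, 4, 5, 6, 70]

median_odd = median(odd_scores)
median_even = median(even_scores)
print(median_odd)   # 80
print(median_even)  # 4.5
```

Note that the outlier 70 has no effect on the second result: replacing it with 7 would still give 4.5.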
Also, outliers result in what is called a skewed distribution. A distribution is skewed if it is not symmetric about its centre. Skewness is a measure of this asymmetry, that is, of the deviation from the symmetry of a normal distribution. A distribution can have a positive skew (a long tail of high values) or a negative skew (a long tail of low values).
Kurtosis is a measure of the peakedness of a distribution. A distribution can be mesokurtic (moderately peaked, like the normal curve), platykurtic (flat) or leptokurtic (very tall).
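As a rough illustration, skewness and kurtosis can be estimated from the central moments of the data. The sketch below assumes the simple moment-based definitions (third moment over the 1.5 power of the second, and fourth moment over the square of the second, minus 3); statistical packages often apply additional sample corrections, so their values may differ slightly. The function names are chosen here for illustration.

```python
from statistics import fmean

def central_moment(values, k):
    """Average of (x - mean)**k over all values."""
    m = fmean(values)
    return fmean([(x - m) ** k for x in values])

def skewness(values):
    """Moment-based skewness: positive for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal curve scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

with_outlier = [2, 3, 4, 5, 6, 70]   # outlier on the high side
symmetric = [1, 2, 3, 4, 5]

print(skewness(with_outlier))        # positive: positively skewed
print(skewness(symmetric))           # zero: no skew
print(excess_kurtosis(with_outlier)) # positive: leptokurtic
print(excess_kurtosis(symmetric))    # negative: platykurtic
```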
Advantages of median
- It is very easy to compute or estimate since it involves only one or two numbers in the distribution.
- The median can also be estimated from a graph of the distribution.
- The median can also be used when we have discrete data
Disadvantages of median
- In the presence of continuous data, the median is not reliable
- The median is not the most reliable measure of central tendency because its estimation involves only one or two numbers in the distribution.
Mean: this is the arithmetic average of all the numbers in the distribution, that is, the sum of all the scores in the distribution divided by the number of scores.
Note: the mean is most appropriate when we have continuous data.
Advantages of mean
- The mean is the most accurate measure of central tendency because its estimation involves all the numbers in the distribution.
- The mean lends itself to further statistical analysis such as ANOVA, the t-test etc.
Disadvantages of Mean
- It is not appropriate for nominal data and is often less suitable than the mode for discrete data
- The mean cannot be obtained through inspection
- Also, in the presence of outliers or extreme numbers, the mean is not the most reliable measure of central tendency.
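The effect of an outlier on the mean can be seen by computing the mean and median of the earlier example with and without the value 70. This is a small sketch using the standard library.

```python
from statistics import mean, median

scores = [2, 3, 4, 5, 6]
scores_with_outlier = scores + [70]

# Without the outlier, mean and median agree: both are 4.
mean_plain = mean(scores)
median_plain = median(scores)

# With the outlier, the mean jumps to 90 / 6 = 15,
# while the median only moves to 4.5.
mean_outlier = mean(scores_with_outlier)
median_outlier = median(scores_with_outlier)

print(mean_plain, median_plain)
print(mean_outlier, median_outlier)
```

A mean of 15 is larger than five of the six scores, which is why the median is preferred when outliers are present.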
Measures of Variability (Dispersion)
Measures of variability or dispersion tell us how far the scores are spread from each other. Examples of measures of variability include:
- Range
- Quartile range
- Standard deviation.
Range: this is the difference between the highest and the lowest number or score in a distribution. The range is the easiest measure of variability to compute, and it is used when we want quick information about a distribution. It is also the least reliable measure of variability. For example, given the two distributions below:
|Group A||Group B|
Range of group A = 8 – 2 = 6 and range of group B = 6 – 0 = 6. Notice that the two groups have the same range even though the values in the two distributions are not comparable.
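The point can be illustrated in Python. The two lists below are hypothetical: they are not the article's original data, and were chosen only so that the highest and lowest scores match the ranges quoted above (8 and 2 for group A, 6 and 0 for group B).

```python
def value_range(values):
    """Difference between the highest and lowest score."""
    return max(values) - min(values)

# Hypothetical scores, chosen to match the stated extremes only.
group_a = [2, 5, 5, 5, 5, 8]   # most scores bunched at 5
group_b = [0, 1, 3, 4, 6, 6]   # scores spread across the interval

range_a = value_range(group_a)
range_b = value_range(group_b)
print(range_a, range_b)  # 6 6
```

Because the range looks only at the two extreme scores, it cannot tell these two very different groups apart.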
Quartile: quartiles arise when an ordered distribution is divided into four equal parts, marked by Q1, Q2, Q3 and Q4.
Q1 – 25%, Q2 – 50%, Q3 – 75% and Q4 – 100%.
The first quartile (Q1) is the point below which 25% of the numbers in the distribution lie. The second quartile (Q2) is the point below which 50% of the numbers lie. The third quartile (Q3) is the point below which 75% of the numbers lie. The fourth quartile (Q4) is the point below which 100% of the numbers lie.
For example, for an ungrouped distribution, the quartiles are determined as follows;
1 2 3 | 4 5 6 | 7 8 9 | 10 11 12 |
Q1 = 3.5, Q2 = 6.5, Q3 = 9.5, Q4 = 12.5
When the distribution has repeated numbers or an odd number of values, locate the position of each quartile instead. For Q1, position = 25% * the number of values in the distribution. The distribution must first be arranged in ascending order.
Inter-quartile range (I.R) = Q3 – Q1
The semi inter-quartile range (S.I.R) is half of the inter-quartile range: S.I.R = (Q3 – Q1)/2
Using the values above, S.I.R = (9.5 – 3.5)/2 = 3.
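The ungrouped example can be sketched in Python. The function below assumes the "median of each half" convention, which reproduces the cut points 3.5, 6.5 and 9.5 used above; other conventions exist (for instance, `statistics.quantiles` interpolates differently and gives slightly different values for the same data). The function name is chosen here for illustration.

```python
from statistics import median

def quartiles_by_halves(values):
    """Q1, Q2, Q3 as the medians of the lower half, the whole, and the upper half."""
    ordered = sorted(values)          # data must be in ascending order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(ordered), median(upper)

data = list(range(1, 13))             # 1, 2, ..., 12
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1                         # inter-quartile range
sir = iqr / 2                         # semi inter-quartile range

print(q1, q2, q3)  # 3.5 6.5 9.5
print(iqr, sir)    # 6.0 3.0
```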
In the case of grouped data, the formulas for finding the quartiles are:
Q1 = L + (N/4 – Fb) * C/F and Q3 = L + (3N/4 – Fb) * C/F, where
L = lower class boundary of the quartile class
N = total frequency
Fb = cumulative frequency before the quartile class
C = class size
F = frequency of the quartile class.
Example 1: given the distribution
| Class interval | F | CF | Class boundary | Class mark (x) |
|---|---|---|---|---|
| 1 – 5 | 3 | 3 | 0.5 – 5.5 | 3 |
| 6 – 10 | 4 | 7 | 5.5 – 10.5 | 8 |
| 11 – 15 | 4 | 11 | 10.5 – 15.5 | 13 |
| 16 – 20 | 3 | 14 | 15.5 – 20.5 | 18 |
| 21 – 25 | 3 | 17 | 20.5 – 25.5 | 23 |
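Finding Q1: position = 25% * 17 (where 17 is the total frequency, N)
= 25/100 * 17 = 4.25 (the position)
The cumulative frequency first reaches 4.25 in the class 6 – 10, so this is the Q1 class:
L = 5.5, Fb = 3, N = 17, C = 5, F = 4
Using the formula Q1 = L + (N/4 – Fb) * C/F
= 5.5 + (17/4 – 3) * 5/4
= 5.5 + 1.5625 ≈ 7.06
Finding Q3: position = 75% * 17
= 75/100 * 17 = 12.75 (the position)
The cumulative frequency first reaches 12.75 in the class 16 – 20 (CF = 14; the class 11 – 15 only reaches CF = 11), so this is the Q3 class:
L = 15.5, Fb = 11, F = 3, N = 17, C = 5
Using the formula Q3 = L + (3N/4 – Fb) * C/F
= 15.5 + (3*17/4 – 11) * 5/3
= 15.5 + 2.9167 ≈ 18.42
Inter-quartile range = Q3 – Q1 = 18.42 – 7.06 = 11.36
Semi inter-quartile range = (Q3 – Q1)/2 = 11.36/2 = 5.68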
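The grouped-data formula can also be applied programmatically. The sketch below is one possible implementation, assuming equal class sizes; it finds the quartile class as the first class whose cumulative frequency reaches the quartile position, then applies L + (position – Fb) * C/F. The function and variable names are chosen here for illustration.

```python
def grouped_quantile(lower_boundaries, frequencies, class_size, fraction):
    """Interpolated quantile for grouped data: L + (position - Fb) * C / F."""
    n = sum(frequencies)              # N, the total frequency
    position = fraction * n           # e.g. N/4 for Q1, 3N/4 for Q3
    cumulative_before = 0             # Fb, cumulative frequency before the class
    for lower, freq in zip(lower_boundaries, frequencies):
        # The quartile class is the first one whose CF reaches the position.
        if freq > 0 and cumulative_before + freq >= position:
            return lower + (position - cumulative_before) * class_size / freq
        cumulative_before += freq
    raise ValueError("fraction must lie between 0 and 1")

# Lower class boundaries and frequencies from the table above.
lower_boundaries = [0.5, 5.5, 10.5, 15.5, 20.5]
frequencies = [3, 4, 4, 3, 3]

q1 = grouped_quantile(lower_boundaries, frequencies, 5, 0.25)
q3 = grouped_quantile(lower_boundaries, frequencies, 5, 0.75)
print(round(q1, 2))             # Q1
print(round(q3, 2))             # Q3
print(round(q3 - q1, 2))        # inter-quartile range
print(round((q3 - q1) / 2, 2))  # semi inter-quartile range
```

Locating the quartile class from the cumulative frequencies, rather than by eye, guards against picking the wrong class for L, Fb and F.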
Importance of Quartile
- The quartile is much more reliable than the range
- When variation is high, the scores are far from each other; when it is low, they are close together.
Standard deviation: this tells us how close the responses of individuals are to one another as a group. There are different formulas for calculating the standard deviation; which one to use depends on the data set you are working with.
Characteristics of Standard Deviation
- The standard deviation is the most accurate measure of variability
- A high standard deviation indicates high variability in scores and vice versa
- When the standard deviation is low and the mean is high, it indicates 'mastery', but when the standard deviation is high and the mean is low, it indicates no mastery (i.e. low understanding)
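These characteristics can be illustrated with a short Python sketch. The two sets of class scores are hypothetical and exist only for illustration. The standard library offers two of the common formulas: `pstdev` treats the data as a whole population (dividing by N), while `stdev` treats it as a sample (dividing by N – 1).

```python
from statistics import mean, pstdev, stdev

# Hypothetical test scores for two classes.
class_a = [78, 80, 82, 79, 81]   # high mean, scores close together
class_b = [20, 70, 35, 90, 10]   # low mean, scores far apart

mean_a, sd_a = mean(class_a), pstdev(class_a)
mean_b, sd_b = mean(class_b), pstdev(class_b)

print(mean_a, round(sd_a, 2))    # high mean, low SD: suggests mastery
print(mean_b, round(sd_b, 2))    # low mean, high SD: suggests no mastery

# The sample formula divides by N - 1, so it is slightly larger.
sample_sd_a = stdev(class_a)
print(round(sample_sd_a, 2))
```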
Application of Measures of Variability
- Measures of variability such as the standard deviation can be used for further statistical operations, e.g. the z-score, ANOVA etc.
- They help to determine the difference in performance of students
- They can also help in finding areas under the standard normal curve.
- They help to determine the measures of kurtosis and skewness
- They help to check whether or not there is mastery of course content
- Measures of variability such as the range help to determine the difference in the quantity of goods produced in batches
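As one example of such further use, a z-score expresses each score as its distance from the mean in units of standard deviation: z = (x – mean) / standard deviation. The sketch below assumes the population standard deviation; the scores are hypothetical, chosen so that the mean is 5 and the standard deviation is 2.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Distance of each value from the mean, in standard deviations."""
    m = mean(values)
    sd = pstdev(values)           # population standard deviation (assumption)
    return [(x - m) / sd for x in values]

# Hypothetical scores: mean 5, population standard deviation 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
standardised = z_scores(scores)
print(standardised)
```

A score of 9 is two standard deviations above the mean, while a score of 5 sits exactly at the mean and has a z-score of zero.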