London School of Commerce
Quantitative Techniques in Business
(QTB)
Lecture 1:
Descriptive Statistics
By:
Dr. David Acquaye
Contents
1.Mode:
1.1Definition and Illustration
1.2Estimating the mode from a grouped data
2.Mean
2.1Arithmetic sample mean:
2.2Geometric mean
2.3Harmonic mean
2.4Arithmetic Weighted mean
2.5Try these:
3.Standard Deviation and Variance
3.1Sample Standard Deviation (s):
3.2Population Standard Deviation (σ):
3.3Examples on standard deviation
3.4Try these:
4.Median
4.1Median of ungrouped data
4.2Median for grouped data
5.Quartiles
5.1Lower Quartile
5.2Upper Quartile:
5.3Inter Quartile Range (IQR) and Semi-Inter Quartile Range
5.4Example :
6.Graphs
1.Mode:
1.1Definition and Illustration
The mode is the item, object, group or measurement with the highest frequency.
For example:
2 ; 4 ; 6 ; 10 ; 2 ; 5 ; 2 ; 3 ; 2 ; 6
In the above set of numbers, the mode is 2 because it has the highest frequency (it appears 4 times). Where two numbers share the highest frequency, the data are bimodal:
6 ; 2 ; 4 ; 6 ; 10 ; 2 ; 5 ; 2 ; 3 ; 2 ; 6 ; 6
There are two modes from the above set of numbers (2 and 6: they both appear 4 times).
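The two illustrations can be checked with a short Python sketch. It assumes Python 3.8 or later, where the standard library function `statistics.multimode` returns every value that shares the highest frequency; the variable names are only illustrative.

```python
from statistics import multimode

first = [2, 4, 6, 10, 2, 5, 2, 3, 2, 6]
second = [6, 2, 4, 6, 10, 2, 5, 2, 3, 2, 6, 6]

# multimode returns a list, so unimodal and bimodal data are handled alike
modes_first = multimode(first)
modes_second = sorted(multimode(second))

print(modes_first)
print(modes_second)
```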
1.2Estimating the mode from a grouped data
$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w$$
where
$L$ = Lower class boundary of the modal class
$\Delta_1$ = The difference between the modal frequency and the previous class' frequency ($f_m - f_{m-1}$)
$\Delta_2$ = The difference between the modal frequency and the next class' frequency ($f_m - f_{m+1}$)
$w$ = Class size of the modal class
Examples:
1.The data below shows the marks obtained by some students in an examination. You are required to estimate the modal mark.
Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5
2.The data shows the age distribution of students in a class. You are required to estimate the modal age of the class.
Age (years) / Frequency
10 – 14 / 6
15 – 19 / 12
20 – 24 / 25
25 – 29 / 8
30 - 34 / 2
Note that the age classes need to be converted into class boundaries before the calculation can be done: subtract 0.5 from each lower class limit and add 0.5 to each upper class limit, e.g. 9.5 – 14.5 ; 14.5 – 19.5 ; 19.5 – 24.5 etc.
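As a hedged illustration, the two mode examples can be worked through in Python. The sketch assumes the usual interpolation within the modal class, that is, the lower class boundary plus the share Δ1/(Δ1 + Δ2) of the class width. The function name `grouped_mode` is hypothetical, and the fourth age class is assumed to be 25 – 29 so that the classes do not overlap.

```python
def grouped_mode(boundaries, freqs):
    """Estimate the mode of grouped data by interpolating in the modal class.

    boundaries: list of (lower, upper) class boundaries
    freqs: list of class frequencies in the same order
    """
    m = freqs.index(max(freqs))  # first class with the highest frequency
    lower, upper = boundaries[m]
    prev_f = freqs[m - 1] if m > 0 else 0
    next_f = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - prev_f  # modal frequency minus previous class frequency
    d2 = freqs[m] - next_f  # modal frequency minus next class frequency
    return lower + d1 / (d1 + d2) * (upper - lower)


# Example 1: marks, where the class limits already meet (0-10, 10-20, ...)
mark_boundaries = [(lo, lo + 10) for lo in range(0, 100, 10)]
mark_freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]
modal_mark = grouped_mode(mark_boundaries, mark_freqs)

# Example 2: ages, converted to boundaries by -0.5 / +0.5
# (assumption: the fourth class is 25-29)
age_limits = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
age_boundaries = [(lo - 0.5, hi + 0.5) for lo, hi in age_limits]
age_freqs = [6, 12, 25, 8, 2]
modal_age = grouped_mode(age_boundaries, age_freqs)

print(round(modal_mark, 2), round(modal_age, 2))
```

Under these assumptions the modal class for the marks is 50-60 and for the ages 19.5 – 24.5.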
2.Mean
The mean is simply the average. There are several types of mean, including the arithmetic sample mean, the geometric mean, the harmonic mean and the weighted mean.
2.1Arithmetic sample mean:
Given the values $x_1, x_2, \dots, x_n$, the arithmetic mean is
$$\bar{x} = \frac{\sum x}{n}$$
Example
Find the arithmetic mean of the following set of numbers
2 ; 4 ; 6 ; 10 ; 2 ; 5 ; 2 ; 3 ; 2 ; 6
2.2Geometric mean
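Given the values $x_1, x_2, \dots, x_n$, the geometric mean is
$$G = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n}$$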
Example
Find the geometric mean of the following set of numbers:
2 ; 3 ; 4 ; 5 ; 6
The geometric mean is often used when the data relate to percentage changes, such as population changes, birth rates, death rates, price changes and share price changes.
2.3Harmonic mean
Given the values $x_1, x_2, \dots, x_n$, the harmonic mean is
$$H = \frac{n}{\sum \frac{1}{x}}$$
Example:
Find the harmonic mean of the following set of numbers
12 ; 3 ; 4 ; 5 ; 86; 4 ; 7 ; 8 ; 2
The harmonic mean gives less weight to large values, so it is often used when the data contain large outliers.
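The three examples above can be checked with Python's standard `statistics` module. This is only a sketch, assuming Python 3.8 or later (for `geometric_mean`); the variable names are illustrative.

```python
from statistics import mean, geometric_mean, harmonic_mean

arithmetic_data = [2, 4, 6, 10, 2, 5, 2, 3, 2, 6]
geometric_data = [2, 3, 4, 5, 6]
harmonic_data = [12, 3, 4, 5, 86, 4, 7, 8, 2]

a = mean(arithmetic_data)           # sum of the values divided by their count
g = geometric_mean(geometric_data)  # n-th root of the product of the values
h = harmonic_mean(harmonic_data)    # count divided by the sum of reciprocals

print(round(a, 2), round(g, 2), round(h, 2))
```

Note how the single large value 86 barely raises the harmonic mean, which stays close to the smaller values in the set.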
2.4Arithmetic Weighted mean
This is suitable when the data is grouped. It is illustrated below:
The data below shows the marks obtained by some students in an examination. You are required to estimate the mean mark. (See example 1 under Mode.)
Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5
The weighted mean is estimated as follows
Mark(%) / f / x / fx
0-10 / 8 / 5 / 40
10-20 / 11 / 15 / 165
20-30 / 13 / 25 / 325
30-40 / 16 / 35 / 560
40-50 / 5 / 45 / 225
50-60 / 17 / 55 / 935
60-70 / 10 / 65 / 650
70-80 / 6 / 75 / 450
80-90 / 9 / 85 / 765
90-100 / 5 / 95 / 475
Total / 100 / / 4590
$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{4590}{100} = 45.9$$
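The table can be reproduced with a few lines of Python. This is a minimal sketch that assumes each class is represented by its midpoint x, as in the table; the variable names are illustrative.

```python
classes = [(lo, lo + 10) for lo in range(0, 100, 10)]  # 0-10, 10-20, ..., 90-100
freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]

midpoints = [(lo + hi) / 2 for lo, hi in classes]        # the x column
sum_f = sum(freqs)                                       # total of the f column
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))    # total of the fx column

weighted_mean = sum_fx / sum_f
print(sum_f, sum_fx, weighted_mean)
```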
2.5Try these:
1.Find the mean of the age distribution below:
Age (years) / Number of students
0 – 4 / 0
5 – 9 / 0
10 – 14 / 0
15 - 19 / 3
20 – 24 / 28
25 – 29 / 14
30 – 34 / 11
35 – 40 / 4
[Note that the most suitable formula to use is the weighted mean because the data are grouped.]
2.Find the harmonic mean of the following set of data
i.200 ; 12 ; 4 ; 16 ; 11 ; 14 ; 10 ; 2 ; 15
ii.100 ; 85 ; 92 ; 2 ; 86 ; 105 ; 89
3.Find the geometric mean of the following set of data:
i.20 ; 15 ; 11 ; 16 and 19
ii.1.52 ; 1.83 ; 1.74 ; 1.3 ; 0.89 and 0.94
4.Using the data in Question 1 above; estimate the modal age.
3.Standard Deviation and Variance
The standard deviation tells us how well the mean represents the data. The bigger the standard deviation, the more widely the observations are spread around the mean and the less representative the mean is; the implication is that the mean will not be a good estimator. Technically, the standard deviation measures the dispersion and variability in the data.
In business it is often used as a measure of risk, especially for share prices. It is also used to measure the volatility of the financial market.
3.1Sample Standard Deviation (s):
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
[Sample Standard deviation]
3.2Population Standard Deviation (σ):
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
3.3Examples on standard deviation
1.The data below show the weekly take-home pay of employees in a suburb of London.
Weekly take-home pay (£’000) / Number of workers
100 – 110 / 3
110 – 120 / 7
120 – 130 / 12
130 – 140 / 6
140 – 150 / 8
150 – 160 / 20
160 – 170 / 11
170 – 180 / 7
180 – 190 / 4
190 - 200 / 2
You are required to estimate the standard deviation of the pay distribution above.
Suggested format of Answer :
Take-home (000) / f / x / fx / fx²
100 – 110 / 3 / 105 / 315 / 33075
110 – 120 / 7 / 115 / 805 / 92575
120 – 130 / 12 / 125 / 1500 / 187500
130 – 140 / 6 / 135 / 810 / 109350
140 – 150 / 8 / 145 / 1160 / 168200
150 – 160 / 20 / 155 / 3100 / 480500
160 – 170 / 11 / 165 / 1815 / 299475
170 – 180 / 7 / 175 / 1225 / 214375
180 – 190 / 4 / 185 / 740 / 136900
190 - 200 / 2 / 195 / 390 / 76050
Total / 80 / / 11860 / 1798000
2.The data below shows the marks obtained by some students in an examination. You are required to estimate the standard deviation
Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5
Suggested format of answer:
Mark(%) / f / x / fx / fx²
0-10 / 8 / 5 / 40 / 200
10-20 / 11 / 15 / 165 / 2475
20-30 / 13 / 25 / 325 / 8125
30-40 / 16 / 35 / 560 / 19600
40-50 / 5 / 45 / 225 / 10125
50-60 / 17 / 55 / 935 / 51425
60-70 / 10 / 65 / 650 / 42250
70-80 / 6 / 75 / 450 / 33750
80-90 / 9 / 85 / 765 / 65025
90-100 / 5 / 95 / 475 / 45125
Total / 100 / / 4590 / 278100
The standard deviation is calculated as follows:
The Variance is calculated as follows:
There is a high degree of variability in the distribution.
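The two worked examples can be checked with a short Python sketch built on the Σf, Σfx and Σfx² totals from the tables. The worked figures of the original answer are not reproduced in this handout, so the sketch shows both conventions: dividing by the total frequency (population form) and by the total frequency minus one (sample form). The function name `grouped_sd` is hypothetical.

```python
from math import sqrt


def grouped_sd(classes, freqs, sample=False):
    """Standard deviation of grouped data using class midpoints.

    sample=False divides by the total frequency (population form);
    sample=True divides by the total frequency minus one (sample form).
    """
    mids = [(lo + hi) / 2 for lo, hi in classes]
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, mids))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))
    ss = sum_fx2 - sum_fx ** 2 / n  # sum of squared deviations from the mean
    return sqrt(ss / (n - 1 if sample else n))


pay_classes = [(lo, lo + 10) for lo in range(100, 200, 10)]
pay_freqs = [3, 7, 12, 6, 8, 20, 11, 7, 4, 2]
mark_classes = [(lo, lo + 10) for lo in range(0, 100, 10)]
mark_freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]

pay_pop = grouped_sd(pay_classes, pay_freqs)
pay_sample = grouped_sd(pay_classes, pay_freqs, sample=True)
mark_pop = grouped_sd(mark_classes, mark_freqs)
mark_sample = grouped_sd(mark_classes, mark_freqs, sample=True)

print(round(pay_pop, 2), round(pay_sample, 2))
print(round(mark_pop, 2), round(mark_sample, 2))
```

The variance in each case is simply the square of the corresponding standard deviation. With 80 or 100 observations the two conventions differ only slightly.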
3.4Try these:
1.The data below shows the sales recorded by different marketing agents of a firm.
Sales range (‘000) / Frequency
5-10 / 10
10 - 15 / 14
15 – 20 / 8
20 – 25 / 18
25 – 30 / 20
30 – 35 / 8
35 – 40 / 11
40 – 45 / 9
45 – 50 / 2
You are required to estimate the
(a)Mode
(b)Mean
(c)Standard deviation
2.The data shows the age distribution of students in a class [see example 2 under Mode].
Age (years) / Frequency
10 – 14 / 6
15 – 19 / 12
20 – 24 / 25
25 – 29 / 8
30 - 34 / 2
You are required to calculate
(a)Mean
(b)Standard deviation
4.Median
The median is the middle item when the data are arranged in order of magnitude (smallest to largest, or vice versa).
4.1Median of ungrouped data
Example 1: Find the median of the following set of numbers:
4 ; 2 ; 6 ; 5 ; 8 ; 1 ; 3
To find the median, the numbers need to be rearranged in order of magnitude:
1 ; 2 ; 3 ; 4 ; 5 ; 6 ; 8
The median is 4 because it is the number in the middle
Example 2 : Find the median of the following set of numbers:
5 ; 2 ; 6 ; 4 ; 8 ; 3 ; 7 ; 1
Re-arranging:
1 ; 2 ; 3 ; 4 ; 5 ; 6 ; 7 ; 8
The median is (4+5)/2 = 4.5
[This is because there are two numbers in the middle.]
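Both examples can be confirmed with the standard library function `statistics.median`, which sorts the data and averages the two middle values when the count is even. A minimal sketch with illustrative variable names:

```python
from statistics import median

odd_count = [4, 2, 6, 5, 8, 1, 3]       # 7 values: one middle item
even_count = [5, 2, 6, 4, 8, 3, 7, 1]   # 8 values: two middle items

m1 = median(odd_count)
m2 = median(even_count)

print(sorted(odd_count), m1)
print(sorted(even_count), m2)
```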
4.2Median for grouped data
$$\text{Median} = L_m + \frac{\frac{N}{2} - \sum f_b}{F_m} \times w$$
where
$L_m$ = Lower class boundary of the median class
$N$ = Total frequency of the distribution
$\sum f_b$ = Summation of frequencies before the median class
$F_m$ = Frequency of the median class
$w$ = The size or width of the median class
Let’s use some of the above examples to estimate the median:
Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5
The median is calculated as follows:
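A minimal Python sketch of this calculation follows. It assumes the median position is taken as N/2 and that the median class is the first class whose cumulative frequency reaches that position; the function name `grouped_median` is hypothetical.

```python
def grouped_median(boundaries, freqs):
    """Estimate the median of grouped data by interpolating in the median class."""
    n = sum(freqs)
    target = n / 2          # position of the median (assumed N/2)
    cum_before = 0          # sum of frequencies before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cum_before + f >= target:
            return lower + (target - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("the distribution has no observations")


mark_boundaries = [(lo, lo + 10) for lo in range(0, 100, 10)]
mark_freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]

median_mark = grouped_median(mark_boundaries, mark_freqs)
print(median_mark)
```

Here N/2 = 50, the cumulative frequencies are 8, 19, 32, 48, 53, ..., so the median class is 40-50 with 48 observations before it and a frequency of 5.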
5.Quartiles
5.1Lower Quartile
$$Q_1 = L_{Q_1} + \frac{\frac{N}{4} - \sum f_b}{F_{Q_1}} \times w$$
where
$L_{Q_1}$ = Lower class boundary of the lower quartile class
$N$ = Total frequency
$\sum f_b$ = The sum of all frequencies before the lower quartile class
$F_{Q_1}$ = Frequency of the lower quartile class
$w$ = Class size of the lower quartile class
5.2Upper Quartile:
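$$Q_3 = L_{Q_3} + \frac{\frac{3N}{4} - \sum f_b}{F_{Q_3}} \times w$$
where
$L_{Q_3}$ = Lower class boundary of the upper quartile class
$N$ = Total frequency
$\sum f_b$ = The sum of all frequencies before the upper quartile class
$F_{Q_3}$ = Frequency of the upper quartile class
$w$ = Class size of the upper quartile class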
5.3Inter Quartile Range (IQR) and Semi-Inter Quartile Range
1.Inter Quartile Range = Q3 – Q1
2.Semi-Inter Quartile Range/Quartile Deviation = (Q3 – Q1)/2
5.4Example :
Again let’s use some of the above examples to calculate the upper and lower quartile
Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5
Lower Quartile:
Upper Quartile:
Quartile Deviation:
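These three quantities can be estimated with a short Python sketch. It assumes the quartile positions are N/4 and 3N/4 and that each quartile is interpolated within the first class whose cumulative frequency reaches that position; the function name `grouped_quantile` is hypothetical.

```python
def grouped_quantile(boundaries, freqs, p):
    """Estimate the p-th quantile (0 < p < 1) of grouped data by interpolation."""
    n = sum(freqs)
    target = p * n          # position of the quantile, e.g. N/4 for Q1
    cum_before = 0          # sum of frequencies before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cum_before + f >= target:
            return lower + (target - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("the distribution has no observations")


mark_boundaries = [(lo, lo + 10) for lo in range(0, 100, 10)]
mark_freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]

q1 = grouped_quantile(mark_boundaries, mark_freqs, 0.25)
q3 = grouped_quantile(mark_boundaries, mark_freqs, 0.75)
iqr = q3 - q1                   # inter quartile range
quartile_deviation = iqr / 2    # semi-inter quartile range

print(round(q1, 2), round(q3, 2), round(iqr, 2), round(quartile_deviation, 2))
```

Under these assumptions the lower quartile falls in the 20-30 class (position 25, with 19 observations before it) and the upper quartile in the 60-70 class (position 75, with 70 observations before it).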
6.Graphs
Example 1
The data below shows the profitability ratios of five companies (A,B,C,D and E) from 2008-2010. These companies operate in the same sector.
Profitability Ratios from 2008-2010
Company / 2008 / 2009 / 2010
A / 18 / 20 / 16
B / 12 / 12 / 14
C / 16 / 24 / 20
D / 25 / 20 / 16
E / 8 / 8 / 18
You are required to :
- Represent the above data with an appropriate graph
- Calculate the mean and standard deviation across the years and recommend one company to an investor.
Suggested Answer:
Many graphs can be used to represent the above information. These may include the following:
[Figure: graph of the profitability ratios of companies A to E, 2008-2010]
A divided or stacked column graph can also be used to represent the above data:
[Figure: stacked column graph of the same data]
A line graph can also be used to represent the data:
[Figure: line graph of the same data]
Example 2
The data below shows the distribution of the labour force in a company according to occupational categories for 1990 and 1995.
Category / 1990 (%) / 1995 (%)
Administration / 10 / 15
Professional and Technical / 12 / 12
Skilled Manual / 24 / 24
Unskilled Manual / 40 / 24
Clerical / 14 / 25
Total (%) / 100 / 100
Represent the above information by means of pie charts for each year and comment on the distribution.
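Before drawing the pie charts by hand, each percentage has to be converted into a sector angle (share of the total multiplied by 360°). The sketch below does only that conversion; the function name `sector_angles` and the dictionary layout are illustrative assumptions.

```python
shares_1990 = {
    "Administration": 10,
    "Professional and Technical": 12,
    "Skilled Manual": 24,
    "Unskilled Manual": 40,
    "Clerical": 14,
}
shares_1995 = {
    "Administration": 15,
    "Professional and Technical": 12,
    "Skilled Manual": 24,
    "Unskilled Manual": 24,
    "Clerical": 25,
}


def sector_angles(shares):
    """Convert category shares into pie chart sector angles in degrees."""
    total = sum(shares.values())
    return {name: value / total * 360 for name, value in shares.items()}


angles_1990 = sector_angles(shares_1990)
angles_1995 = sector_angles(shares_1995)

for name in shares_1990:
    print(name, round(angles_1990[name], 1), round(angles_1995[name], 1))
```

The printed angles make the main shift easy to see: the Unskilled Manual sector shrinks while the Clerical and Administration sectors grow.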
Example 3: Mean and standard deviation examined
Profitability Ratios from 2008-2010
Company / 2008 / 2009 / 2010 / Mean / SD
A / 18 / 20 / 16 / 18 / 2
B / 12 / 12 / 14 / 12.66667 / 1.154701
C / 16 / 24 / 20 / 20 / 4
D / 25 / 20 / 16 / 20.33333 / 4.50925
E / 8 / 8 / 18 / 11.33333 / 5.773503
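The Mean and SD columns can be reproduced with Python's `statistics` module. The figures in the table match the sample standard deviation (divisor n − 1), which is what `statistics.stdev` computes; the dictionary layout is only illustrative.

```python
from statistics import mean, stdev

ratios = {
    "A": [18, 20, 16],
    "B": [12, 12, 14],
    "C": [16, 24, 20],
    "D": [25, 20, 16],
    "E": [8, 8, 18],
}

# (mean, sample standard deviation) for each company across 2008-2010
summary = {company: (mean(values), stdev(values)) for company, values in ratios.items()}

for company, (avg, sd) in summary.items():
    print(company, round(avg, 5), round(sd, 6))
```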
Consider the mean and standard deviation
Company / Mean / SD
A / 18 / 2
B / 12.66667 / 1.154701
C / 20 / 4
D / 20.33333 / 4.50925
E / 11.33333 / 5.773503
A critical look at the figures reveals that company D has the highest mean (20.33), followed closely by company C with a mean of 20. However, the standard deviation indicates that company C is less risky, with less variability in profit (4) than D (4.51). A basic finance concept posits that 'where there is higher risk there is higher return'. This may be the reason for this scenario.
Company A is not doing badly at all. An average of 18 with a standard deviation of 2 shows much less variability than C and D. Comparatively, E is not doing very well: with an average of 11.33 it has the highest standard deviation (5.77). Company B, on the other hand, is for investors who are averse to risk. It is the company with the least variability in profits, and it is performing better than company E.
Recommendation: A, C, D and B, or B only if the investor fears risk.