London School of Commerce

Quantitative Techniques in Business

(QTB)

Lecture 1:

Descriptive Statistics

By:

Dr. David Acquaye

Contents

1.Mode:

1.1Definition and Illustration

1.2Estimating the mode from a grouped data

2.Mean

2.1Arithmetic sample mean:

2.2Geometric mean

2.3Harmonic mean

2.4Arithmetic Weighted mean

2.5Try these:

3.Standard Deviation and Variance

3.1Sample Standard Deviation (s):

3.2Population Standard Deviation (σ):

3.3Examples on standard deviation

3.4Try these:

4.Median

4.1Median of ungrouped data

4.2Median for grouped data

5.Quartiles

5.1Lower Quartile

5.2Upper Quartile:

5.3Inter Quartile Range (IQR) and Semi-Inter Quartile Range

5.4Example :

6.Graphs

1.Mode:

1.1Definition and Illustration

The mode is the item, object, group or measurement with the highest frequency.

For example:

2 ; 4 ; 6 ; 10 ; 2 ; 5 ; 2 ; 3 ; 2 ; 6

From the above set of numbers, the mode is 2 because it has the highest frequency (it appears 4 times). Where two numbers share the highest frequency, we have a bimodal situation:

6 ; 2 ; 4 ; 6 ; 10 ; 2 ; 5 ; 2 ; 3 ; 2 ; 6 ; 6

There are two modes from the above set of numbers (2 and 6: they both appear 4 times).
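The counting behind both illustrations can be sketched in a few lines of Python. The helper name `modes` is only an illustrative choice; it returns every value that shares the highest frequency, so it also covers the bimodal case.

```python
from collections import Counter

def modes(values):
    """Return (sorted list of modal values, their shared frequency)."""
    counts = Counter(values)      # value -> number of appearances
    top = max(counts.values())    # the highest frequency
    return sorted(v for v, c in counts.items() if c == top), top

first = [2, 4, 6, 10, 2, 5, 2, 3, 2, 6]
second = [6, 2, 4, 6, 10, 2, 5, 2, 3, 2, 6, 6]

print(modes(first))   # ([2], 4)
print(modes(second))  # ([2, 6], 4)
```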

1.2Estimating the mode from a grouped data

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w$$

where

$L$ = Lower class boundary of the modal class

$\Delta_1$ = The difference between the modal frequency and the previous class’ frequency ($f_m - f_{m-1}$)

$\Delta_2$ = The difference between the modal frequency and the next class’ frequency ($f_m - f_{m+1}$)

$w$ = Class size of the modal class

Examples:

1.The data below shows the marks obtained by some students in an examination. You are required to estimate the modal mark.

Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5

2.The data shows the age distribution of students in a class. You are required to estimate the modal age of the class.

Age (years) / Frequency
10 – 14 / 6
15 – 19 / 12
20 – 24 / 25
25 – 29 / 8
30 - 34 / 2

Note that the age classes need to be converted into class boundaries before the calculation can be done: subtract 0.5 from each lower class limit and add 0.5 to each upper class limit (e.g. 9.5 – 14.5 ; 14.5 – 19.5 ; 19.5 – 24.5 etc.).
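As a rough check on the two examples above, the sketch below applies the usual grouped-mode estimate: lower boundary of the modal class, plus d1 / (d1 + d2) times the class width, where d1 and d2 are the differences between the modal frequency and its two neighbours. The function name `grouped_mode` is hypothetical, and the fourth age class is assumed to run from 25 to 29 (boundaries 24.5 to 29.5).

```python
def grouped_mode(boundaries, freqs):
    """Estimate the mode of grouped data from (lower, upper) class boundaries."""
    m = freqs.index(max(freqs))                         # position of the modal class
    # Assumption: a missing neighbour (first or last class) counts as frequency 0.
    prev_f = freqs[m - 1] if m > 0 else 0
    next_f = freqs[m + 1] if m < len(freqs) - 1 else 0
    d1 = freqs[m] - prev_f                              # modal minus previous frequency
    d2 = freqs[m] - next_f                              # modal minus next frequency
    lower, upper = boundaries[m]
    return lower + d1 / (d1 + d2) * (upper - lower)

# Example 1: examination marks
marks = [(lo, lo + 10) for lo in range(0, 100, 10)]
mark_freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]
print(round(grouped_mode(marks, mark_freqs), 2))   # 56.32

# Example 2: ages, already converted to class boundaries
ages = [(9.5, 14.5), (14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5)]
age_freqs = [6, 12, 25, 8, 2]
print(round(grouped_mode(ages, age_freqs), 2))     # 21.67
```

In Example 1 the modal class is 50-60 (frequency 17), so the estimate is 50 + 12/19 × 10; in Example 2 it is 19.5-24.5 (frequency 25), giving 19.5 + 13/30 × 5.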

2.Mean

The mean is simply the average. There are several types of mean, including the arithmetic sample mean, the geometric mean, the harmonic mean and the weighted mean.

2.1Arithmetic sample mean:

Given the values $x_1, x_2, \dots, x_n$:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Example

Find the arithmetic mean of the following set of numbers

2 ; 4 ; 6 ; 10 ; 2 ; 5 ; 2 ; 3 ; 2 ; 6
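A minimal sketch of this calculation in Python: add the values and divide by how many there are. The variable names are illustrative only.

```python
data = [2, 4, 6, 10, 2, 5, 2, 3, 2, 6]

total = sum(data)             # sum of the values
n = len(data)                 # number of values
arithmetic_mean = total / n

print(total, n, arithmetic_mean)  # 42 10 4.2
```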

2.2Geometric mean

Given the positive values $x_1, x_2, \dots, x_n$:

$$GM = \sqrt[n]{x_1 \times x_2 \times \dots \times x_n}$$

Example

Find the geometric mean of the following set of numbers:

2 ; 3 ; 4 ; 5 ; 6

The geometric mean is often used when the data relate to percentage changes, such as population changes, birth rates, death rates, price changes and share price changes.
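The example above can be worked through as a short sketch: multiply the n values together and take the n-th root. The cross-check uses `statistics.geometric_mean` from the Python standard library (available from Python 3.8); the variable names are illustrative.

```python
import math
from statistics import geometric_mean

data = [2, 3, 4, 5, 6]

product = math.prod(data)            # 2 x 3 x 4 x 5 x 6 = 720
gm = product ** (1 / len(data))      # fifth root of 720

print(product, round(gm, 2))         # 720 3.73
print(round(geometric_mean(data), 2))
```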

2.3Harmonic mean

Example:

Find the harmonic mean of the following set of numbers

12 ; 3 ; 4 ; 5 ; 86; 4 ; 7 ; 8 ; 2

The harmonic mean is often used when there are very large outliers in the data, because it gives less weight to large values than the arithmetic mean does.
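A short sketch of the example above, assuming the standard definition of the harmonic mean: the number of values divided by the sum of their reciprocals, $HM = n / \sum (1/x_i)$. The arithmetic mean is printed alongside it only to show how much the outlier 86 pulls the ordinary average upwards.

```python
from statistics import harmonic_mean, mean

data = [12, 3, 4, 5, 86, 4, 7, 8, 2]

reciprocal_sum = sum(1 / x for x in data)   # 1/12 + 1/3 + ... + 1/2
hm = len(data) / reciprocal_sum             # n divided by the sum of reciprocals

print(round(hm, 2))            # 4.75
print(round(mean(data), 2))    # 14.56 (arithmetic mean, for comparison)
```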

2.4Arithmetic Weighted mean

This is suitable when the data is grouped. It is illustrated below:

The data below shows the marks obtained by some students in an examination. You are required to estimate the mean mark. (This is the same data as in example 1 of the mode section.)

Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5

The weighted mean is estimated as follows, where f is the class frequency and x is the class midpoint:

Mark(%) / f / x / fx
0-10 / 8 / 5 / 40
10-20 / 11 / 15 / 165
20-30 / 13 / 25 / 325
30-40 / 16 / 35 / 560
40-50 / 5 / 45 / 225
50-60 / 17 / 55 / 935
60-70 / 10 / 65 / 650
70-80 / 6 / 75 / 450
80-90 / 9 / 85 / 765
90-100 / 5 / 95 / 475
Total / 100 / / 4590

$$\bar{x} = \frac{\sum fx}{\sum f} = \frac{4590}{100} = 45.9$$
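The table above can be rebuilt with a few lines of Python as a check on the column totals. This is a sketch that takes each class midpoint as the average of the two class limits shown; the variable names are illustrative.

```python
classes = [(lo, lo + 10) for lo in range(0, 100, 10)]    # 0-10, 10-20, ..., 90-100
freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]

midpoints = [(lo + hi) / 2 for lo, hi in classes]        # the x column: 5, 15, ..., 95
fx = [f * x for f, x in zip(freqs, midpoints)]           # the fx column

weighted_mean = sum(fx) / sum(freqs)
print(sum(freqs), sum(fx), weighted_mean)                # 100 4590.0 45.9
```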

2.5Try these:

1.Find the mean of the age distribution below:

Age (years) / Number of students
0 – 4 / 0
5 – 9 / 0
10 – 14 / 0
15 - 19 / 3
20 – 24 / 28
25 – 29 / 14
30 – 34 / 11
35 – 39 / 4

[Note that the most suitable formula to use is the weighted mean because the data is grouped]

2.Find the harmonic mean of the following set of data

i.200 ; 12 ; 4 ; 16 ; 11 ; 14 ; 10 ; 2 ; 15

ii.100 ; 85 ; 92 ; 2 ; 86 ; 105 ; 89

3.Find the geometric mean of the following set of data:

i.20 ; 15 ; 11 ; 16 and 19

ii.1.52 ; 1.83 ; 1.74 ; 1.3 ; 0.89 and 0.94

4.Using the data in Question 1 above; estimate the modal age.

3.Standard Deviation and Variance

The standard deviation tells us how well the mean represents the data. The bigger the standard deviation, the more widely the values are spread around the mean, and the less reliable the mean is as an estimate of a typical value. Technically, the standard deviation measures the dispersion or variability in the data.

In business it is often used as a measure of risk, especially for share prices. It is also used to measure the volatility of the financial market.

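3.1Sample Standard Deviation (s):

$$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1}}, \qquad n = \sum f$$

[Sample Standard deviation]

3.2Population Standard Deviation (σ):

$$\sigma = \sqrt{\frac{\sum f(x - \mu)^2}{N}} = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2}, \qquad N = \sum f$$

In both cases the variance is the square of the standard deviation ($s^2$ or $\sigma^2$).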

3.3Examples on standard deviation

The data below show the weekly take home pay of employees in a suburb in London.

Weekly take-home pay (£’000) / Number of workers
100 – 110 / 3
110 – 120 / 7
120 – 130 / 12
130 – 140 / 6
140 – 150 / 8
150 – 160 / 20
160 – 170 / 11
170 – 180 / 7
180 – 190 / 4
190 - 200 / 2

You are required to estimate the standard deviation of the pay distribution above.

Suggested format of Answer :

Take-home (£’000) / f / x / fx / fx²
100 – 110 / 3 / 105 / 315 / 33075
110 – 120 / 7 / 115 / 805 / 92575
120 – 130 / 12 / 125 / 1500 / 187500
130 – 140 / 6 / 135 / 810 / 109350
140 – 150 / 8 / 145 / 1160 / 168200
150 – 160 / 20 / 155 / 3100 / 480500
160 – 170 / 11 / 165 / 1815 / 299475
170 – 180 / 7 / 175 / 1225 / 214375
180 – 190 / 4 / 185 / 740 / 136900
190 - 200 / 2 / 195 / 390 / 76050
Total / 80 / / 11860 / 1798000
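The column totals in this table are all that is needed to finish the calculation. The sketch below assumes the usual computational formulas: the sum of squared deviations is $\sum fx^2 - (\sum fx)^2 / n$, divided by $n - 1$ for a sample or by $n$ for a population. The function name `grouped_sd` and its `sample` flag are hypothetical, and both versions are shown because the question does not say which is wanted.

```python
import math

def grouped_sd(midpoints, freqs, sample=True):
    """Standard deviation of grouped data from class midpoints and frequencies."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    squared_dev = sum_fx2 - sum_fx ** 2 / n        # sum of squared deviations from the mean
    return math.sqrt(squared_dev / (n - 1 if sample else n))

midpoints = list(range(105, 200, 10))              # 105, 115, ..., 195
freqs = [3, 7, 12, 6, 8, 20, 11, 7, 4, 2]

print(round(grouped_sd(midpoints, freqs), 2))                 # 22.43 (sample)
print(round(grouped_sd(midpoints, freqs, sample=False), 2))   # 22.29 (population)
```

With 80 workers the two versions differ only in the second decimal place; the gap is wider for small data sets.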

2.The data below shows the marks obtained by some students in an examination. You are required to estimate the standard deviation

Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5

Suggested format of answer:

Mark(%) / f / x / fx / fx²
0-10 / 8 / 5 / 40 / 200
10-20 / 11 / 15 / 165 / 2475
20-30 / 13 / 25 / 325 / 8125
30-40 / 16 / 35 / 560 / 19600
40-50 / 5 / 45 / 225 / 10125
50-60 / 17 / 55 / 935 / 51425
60-70 / 10 / 65 / 650 / 42250
70-80 / 6 / 75 / 450 / 33750
80-90 / 9 / 85 / 765 / 65025
90-100 / 5 / 95 / 475 / 45125
Total / 100 / / 4590 / 278100

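The standard deviation is calculated as follows (using the sample formula, with $n = \sum f = 100$):

$$s = \sqrt{\frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1}} = \sqrt{\frac{278100 - \frac{4590^2}{100}}{99}} = \sqrt{\frac{67419}{99}} = \sqrt{681} \approx 26.1$$

The Variance is calculated as follows:

$$s^2 = \frac{67419}{99} = 681$$

There is a high degree of variability in the distribution: the standard deviation of about 26.1 is more than half of the mean mark of 45.9.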

3.4Try these:

1.The data below shows the sales recorded by different marketing agents of a firm.

Sales range (‘000) / Frequency
5-10 / 10
10 - 15 / 14
15 – 20 / 8
20 – 25 / 18
25 – 30 / 20
30 – 35 / 8
35 – 40 / 11
40 – 45 / 9
45 – 50 / 2

You are required to estimate the

(a)Mode

(b)Mean

(c)Standard deviation

2.The data shows the age distribution of students in a class [see question 2 of mode].

Age (years) / frequency
10 – 14 / 6
15 – 19 / 12
20 – 24 / 25
25 – 29 / 8
30 - 34 / 2

You are required to calculate

(a)Mean

(b)Standard deviation

4.Median

The median is the middle item when the data is arranged in order of magnitude (smallest to biggest, or vice versa).

4.1Median of ungrouped data

Example 1: Find the median of the following set of numbers:

4 ; 2 ; 6 ; 5 ; 8 ; 1 ; 3

To find the median, the numbers need to be rearranged in order of magnitude:

1 ; 2 ; 3 ; 4 ; 5 ; 6 ; 8

The median is 4 because it is the number in the middle.

Example 2 : Find the median of the following set of numbers:

5 ; 2 ; 6 ; 4 ; 8 ; 3 ; 7 ; 1

Re-arranging:

1 ; 2 ; 3 ; 4 ; 5 ; 6 ; 7 ; 8

The median is (4+5)/2 = 4.5

[this is because there are two numbers in the middle, so we take their average]
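Both examples follow the same rule, which can be sketched as a small function: sort the values, then take the middle one, or the average of the two middle ones when the count is even. The function is written out by hand here to show the steps; its name is illustrative.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                                  # odd count: one middle item
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2    # even count: average the two

print(median([4, 2, 6, 5, 8, 1, 3]))      # 4
print(median([5, 2, 6, 4, 8, 3, 7, 1]))   # 4.5
```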

4.2Median for grouped data

$$\text{Median} = L_m + \frac{\frac{N}{2} - cf}{F_m} \times w$$

where

$L_m$ = Lower class boundary of the median class

$N$ = Total Frequency of distribution

$cf$ = Summation of frequencies before the median class

$F_m$ = Frequency of the median class

$w$ = the size or width of the median class

Let’s use some of the above examples to estimate the median:

Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5

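The median is calculated as follows. The total frequency is 100, so the median is the 50th value. The cumulative frequencies are 8, 19, 32, 48, 53, ..., so the median class is 40-50, which has 48 observations before it and a frequency of 5:

$$\text{Median} = 40 + \frac{50 - 48}{5} \times 10 = 44$$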
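The search for the median class is the step most easily got wrong by hand, so here is a hedged sketch of it. It builds the cumulative frequencies, picks the first class whose cumulative frequency reaches N/2, and then interpolates inside that class. The variable names are illustrative.

```python
from itertools import accumulate

lower_bounds = list(range(0, 100, 10))            # 0, 10, ..., 90
freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]
width = 10

N = sum(freqs)
cumulative = list(accumulate(freqs))              # 8, 19, 32, 48, 53, ...

# Median class: the first class whose cumulative frequency reaches N/2
m = next(i for i, c in enumerate(cumulative) if c >= N / 2)
before = cumulative[m - 1] if m > 0 else 0        # frequencies before the median class

median = lower_bounds[m] + (N / 2 - before) / freqs[m] * width
print(lower_bounds[m], before, freqs[m], median)  # 40 48 5 44.0
```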

5.Quartiles

5.1Lower Quartile

$$Q_1 = L_{Q1} + \frac{\frac{N}{4} - cf_{Q1}}{F_{Q1}} \times w$$

where

$L_{Q1}$ = Lower class boundary of the Lower Quartile Class

$N$ = Total Frequency

$cf_{Q1}$ = The sum of all frequencies before the lower quartile class

$F_{Q1}$ = Frequency of the lower quartile class

$w$ = Class size of the lower quartile class

5.2Upper Quartile:

$$Q_3 = L_{Q3} + \frac{\frac{3N}{4} - cf_{Q3}}{F_{Q3}} \times w$$

where

$L_{Q3}$ = Lower class boundary of the Upper Quartile Class

$N$ = Total Frequency

$cf_{Q3}$ = The sum of all frequencies before the upper quartile class

$F_{Q3}$ = Frequency of the Upper quartile class

$w$ = Class size of the upper quartile class

5.3Inter Quartile Range (IQR) and Semi-Inter Quartile Range

1.Inter Quartile Range = $Q_3 - Q_1$

2.Semi-Inter Quartile Range/Quartile Deviation = $\frac{Q_3 - Q_1}{2}$

5.4Example :

Again, let’s use the examination marks from the above examples to calculate the upper and lower quartiles.

Mark(%) / Number of students
0-10 / 8
10-20 / 11
20-30 / 13
30-40 / 16
40-50 / 5
50-60 / 17
60-70 / 10
70-80 / 6
80-90 / 9
90-100 / 5

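Lower Quartile: N/4 = 25, so the lower quartile class is 20-30, which has 19 observations before it and a frequency of 13.

$$Q_1 = 20 + \frac{25 - 19}{13} \times 10 \approx 24.62$$

Upper Quartile: 3N/4 = 75, so the upper quartile class is 60-70, which has 70 observations before it and a frequency of 10.

$$Q_3 = 60 + \frac{75 - 70}{10} \times 10 = 65$$

Quartile Deviation:

$$\frac{Q_3 - Q_1}{2} = \frac{65 - 24.62}{2} \approx 20.19$$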
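The quartile calculations differ from each other only in the target position (N/4 or 3N/4), so one helper can serve both. The sketch below assumes the usual interpolation within the class that contains the target position; the function name `grouped_quantile` and its parameter `p` (the fraction of the data below the quantile) are hypothetical.

```python
from itertools import accumulate

def grouped_quantile(lower_bounds, freqs, width, p):
    """Interpolate the value below which a fraction p of grouped data falls."""
    target = p * sum(freqs)                        # N/4 for Q1, 3N/4 for Q3
    cumulative = list(accumulate(freqs))
    k = next(i for i, c in enumerate(cumulative) if c >= target)
    before = cumulative[k - 1] if k > 0 else 0     # frequencies before class k
    return lower_bounds[k] + (target - before) / freqs[k] * width

lower_bounds = list(range(0, 100, 10))
freqs = [8, 11, 13, 16, 5, 17, 10, 6, 9, 5]

q1 = grouped_quantile(lower_bounds, freqs, 10, 0.25)
q3 = grouped_quantile(lower_bounds, freqs, 10, 0.75)
iqr = q3 - q1

print(round(q1, 2), q3)                  # 24.62 65.0
print(round(iqr, 2), round(iqr / 2, 2))  # 40.38 20.19
```

Calling the same helper with `p = 0.5` gives the grouped median.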

6.Graphs

Example 1

The data below shows the profitability ratios of five companies (A,B,C,D and E) from 2008-2010. These companies operate in the same sector.

Profitability Ratios from 2008-2010
Company / 2008 / 2009 / 2010
A / 18 / 20 / 16
B / 12 / 12 / 14
C / 16 / 24 / 20
D / 25 / 20 / 16
E / 8 / 8 / 18

You are required to:

  1. Represent the above data with an appropriate graph.
  2. Calculate the mean and standard deviation across the years and recommend one company for an investor.

Suggested Answer:

There are many graphs that can be used to represent the above information. These may include the following:

A divided or stacked column graph can also be used to represent the above data as below:

A line graph can also be used to represent the data as shown below:

Example 2

The data below shows the distribution of the labour force in a company according to occupational categories for 1990 and 1995.

Category / 1990 (%) / 1995(%)
Administration / 10 / 15
Professional and Technical / 12 / 12
Skilled Manual / 24 / 24
Unskilled Manual / 40 / 24
Clerical / 14 / 25
Total (%) / 100 / 100

Represent the above information by means of pie charts for each year and comment on the distribution.
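Before a pie chart can be drawn by hand, each percentage has to be turned into a sector angle, using the fact that 100% corresponds to 360 degrees. The sketch below does that conversion for both years and also prints the change in each category's share, which is a starting point for the comment on the distribution. The variable names are illustrative.

```python
shares = {                                   # category: (1990 %, 1995 %)
    "Administration": (10, 15),
    "Professional and Technical": (12, 12),
    "Skilled Manual": (24, 24),
    "Unskilled Manual": (40, 24),
    "Clerical": (14, 25),
}

angles = {}
for category, (p1990, p1995) in shares.items():
    # Sector angle in degrees = percentage x 360 / 100
    angles[category] = (p1990 * 360 / 100, p1995 * 360 / 100)
    change = p1995 - p1990                   # change in percentage points
    print(category, angles[category], change)
```

For example, Unskilled Manual takes a 144 degree sector in 1990 and an 86.4 degree sector in 1995, a fall of 16 percentage points, while Clerical rises by 11 points.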

Example 3: Mean and standard deviation examined

Profitability Ratios from 2008-2010
Company / 2008 / 2009 / 2010 / Mean / SD
A / 18 / 20 / 16 / 18 / 2
B / 12 / 12 / 14 / 12.66667 / 1.154701
C / 16 / 24 / 20 / 20 / 4
D / 25 / 20 / 16 / 20.33333 / 4.50925
E / 8 / 8 / 18 / 11.33333 / 5.773503

Consider the mean and standard deviation

Company / Mean / SD
A / 18 / 2
B / 12.66667 / 1.154701
C / 20 / 4
D / 20.33333 / 4.50925
E / 11.33333 / 5.773503
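The Mean and SD columns can be reproduced with Python's standard `statistics` module. Note that `stdev` is the sample standard deviation (it divides by n - 1), which is the version that matches the figures in the table; the dictionary name `ratios` is illustrative.

```python
from statistics import mean, stdev

ratios = {                     # company: profitability ratios for 2008, 2009, 2010
    "A": [18, 20, 16],
    "B": [12, 12, 14],
    "C": [16, 24, 20],
    "D": [25, 20, 16],
    "E": [8, 8, 18],
}

for company, values in ratios.items():
    # stdev divides by n - 1 (sample standard deviation)
    print(company, round(mean(values), 2), round(stdev(values), 2))
```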

A critical look at the figures reveals that company D has the highest mean of 20.33, followed closely by company C with a mean of 20. However, the standard deviation indicates that company C is less risky: there is less variability in its profitability (4) compared to D (4.5). A basic finance concept posits that ‘where there is higher risk there is higher return’. This may be the reason for this scenario.

Company A is not doing badly at all. An average of 18 with a standard deviation of 2 shows much less variability than C and D. Comparatively, E is not doing very well: with an average of 11.33, it has the highest standard deviation of 5.77. Company B, on the other hand, is for investors who are scared of risk. It is the company with the least variability in profits. It is, however, performing better than company E.

Recommendation: A, C, D and B, or B only if the investor fears risk.
