# Introductory Statistics 161.120 and Introductory Biostatistics 161.130

Assignment 2

You are expected to use Minitab for your analyses in Q4 and Q5, with the output incorporated into your text. Computer-produced answers are preferred, but there will be no penalty for neatly handwritten work.

(a) Find a magazine or newspaper article, from the last 6 months, that contains a "Bad" graphical, pictorial or tabular display of data. The display must be described or discussed in the text of the article.

A copy of the article must be attached to your project.

You must state where and when it was published. [8 marks]

(b) Discuss how the display could confuse or mislead the casual reader. [5 Marks]

(c) Describe how you would improve the display, if you were to redraw it. [2 Marks]

Three examples of bad graphs can be found on Stream. However, you should find your own example of a bad graph. Using one of the example graphs will result in fewer marks being awarded (up to 0+4+2 possible marks).

Q 2. Expected values and standard deviations for a distribution [10 marks]

A driver has to get a warrant of fitness for his car. As a result, he may have to replace one or more tyres on the vehicle. Let X be the number of tyres that need replacing, and suppose the probability distribution of X is given by

| x | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| P(x) | 0.3 | 0.2 | 0.3 | 0.1 | 0.1 |

(a) What are the expected value and standard deviation of X? [7 marks]

(b) Suppose new tyres cost \$169 each. What are the mean and standard deviation of Y, the total amount he may have to spend on tyres? [3 marks]
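Hand working for this question can be checked with a short Python sketch. It is only an illustration, not a required method: the function name `mean_and_sd` is an assumption of the sketch, and it relies on Y being 169 times X, so that both the mean and the standard deviation scale by 169.

```python
from math import sqrt

def mean_and_sd(values, probs):
    """Return (mean, standard deviation) of a discrete random variable."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # E(X) is the probability-weighted sum of the values.
    mean = sum(x * p for x, p in zip(values, probs))
    # Var(X) is the probability-weighted sum of squared deviations from E(X).
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, sqrt(variance)

# Distribution of X taken from the table in this question.
x_values = [0, 1, 2, 3, 4]
x_probs = [0.3, 0.2, 0.3, 0.1, 0.1]
mean_x, sd_x = mean_and_sd(x_values, x_probs)

# Y = 169 * X, so the mean and the standard deviation are both multiplied by 169.
cost_per_tyre = 169
mean_y = cost_per_tyre * mean_x
sd_y = cost_per_tyre * sd_x

print(f"E(X) = {mean_x:.3f}, SD(X) = {sd_x:.3f}")
print(f"E(Y) = {mean_y:.2f}, SD(Y) = {sd_y:.2f}")
```

Your written answer should still show the working, not just the final figures.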

Q 3. Experimental Design, with Blocking [22 Marks]

The owner of an apricot orchard in Hawke’s Bay currently uses a standard fertiliser on her apricot trees. However, she has become aware of 3 new fertilisers (NewA, NewB and NewC), each of which claims to increase the yield of a mature apricot tree compared to the standard fertiliser.

To test these claims, she consults a statistician (you) to help her design an experiment, which she will run in the testing section of her orchard.

The testing section is 35 metres by 35 metres, flat and evenly irrigated. However, the South-West and South-East sides are lined with 6-metre-tall hedging (see drawing on next page), which will affect the amount of sunlight each tree is exposed to over a growing season and hence the yield of the tree. The test section contains 36 mature apricot trees planted 5 metres apart.

Use the drawing on page 4 and the given information to design an experiment to determine which, if any, of the new fertilisers results in a higher yield than the standard fertiliser.

You will need to describe in detail:

(a) How you will use blocking in your experimental design. [5 Marks]

(b) How you will use randomisation in your experimental design. [5 Marks]

(c) How you will use replication in your experimental design. [5 Marks]

(d) What your control will be. [2 Marks]

(e) The process you will use to allocate the treatments to the trees. [5 Marks]

You should attach a copy of the diagram on the next page to your project indicating the blocking structure you would recommend to the owner of the orchard.

Your answer should be no more than 2 pages long, including the diagram.
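The mechanics of randomising treatments within blocks can be sketched in Python. This is an illustration only, under stated assumptions: the function name `allocate`, the tree numbering 1 to 36 and the three blocks of 12 trees are all hypothetical, and the blocks you recommend should follow the pattern of shading you identify on the diagram.

```python
import random
from collections import Counter

TREATMENTS = ["Standard", "NewA", "NewB", "NewC"]

def allocate(blocks, treatments, seed=None):
    """Randomly assign treatments to trees, balanced within each block.

    blocks: dict mapping a block label to a list of tree identifiers.
    Returns a dict mapping each tree identifier to a treatment.
    """
    rng = random.Random(seed)
    allocation = {}
    for label, trees in blocks.items():
        if len(trees) % len(treatments) != 0:
            raise ValueError(f"block {label!r} cannot be split evenly")
        # Each treatment appears the same number of times in the block.
        reps = len(trees) // len(treatments)
        labels = list(treatments) * reps
        # Shuffling the labels randomises which tree gets which treatment.
        rng.shuffle(labels)
        allocation.update(zip(trees, labels))
    return allocation

# HYPOTHETICAL blocking: 36 trees numbered 1..36, split into 3 blocks of 12.
example_blocks = {
    "block 1": list(range(1, 13)),
    "block 2": list(range(13, 25)),
    "block 3": list(range(25, 37)),
}
plan = allocate(example_blocks, TREATMENTS, seed=2024)
for label, trees in example_blocks.items():
    print(label, Counter(plan[t] for t in trees))
```

Fixing the seed makes the allocation reproducible; drawing numbered slips from a hat within each block would achieve the same thing by hand.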

Q4: Constructing and Interpreting Confidence Intervals [25 Marks]

In this question you will use the data you collected in part one of Assignment 1.

(a) Using the formula and a hand calculator, construct and interpret a 95% Confidence Interval for the mean area of leaves for your tree (ignore the side and stratum variables). You must show your working, including the formula and the t multiplier. [5 Marks]

(b) Use Minitab to construct a 95% Confidence Interval for the ratio of length to width (ignore the side and stratum variables). [4 Marks]

(c) Many natural phenomena turn out to satisfy a property called the golden ratio, approximately 1.618. Is there enough evidence to conclude that the leaves of your tree do not satisfy this ratio, on average? Explain how you know. [3 Marks]

(d) Use Minitab to construct and interpret a 95% Confidence Interval for the mean area of leaves, for each combination of side of the tree and stratum. [5 Marks]

(e) Using the Confidence Intervals from (d), which levels of your categorical variable (if any) appear to be different from the others? How can you tell that they are different? [3 Marks]

(f) What assumptions did you make when you constructed your confidence intervals? [3 Marks]

(g) What is your overall conclusion? [2 Marks]
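The hand calculation in part (a) uses the interval: sample mean plus or minus the t multiplier times s divided by the square root of n. A minimal Python sketch of that formula follows, for checking arithmetic only. The leaf areas are hypothetical placeholders, the function name `t_interval` is an assumption of the sketch, and the t multiplier is supplied from a t table, not computed.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_multiplier):
    """Return (lower, upper) for mean +/- t * s / sqrt(n)."""
    n = len(sample)
    centre = mean(sample)
    # stdev() is the sample standard deviation (divisor n - 1).
    margin = t_multiplier * stdev(sample) / sqrt(n)
    return centre - margin, centre + margin

# HYPOTHETICAL leaf areas (cm^2); replace with your own Assignment 1 data.
leaf_areas = [12.1, 15.3, 11.8, 14.0, 13.2, 16.5, 12.9, 14.4, 13.7, 15.0]

# t multiplier for 95% confidence with n - 1 = 9 degrees of freedom,
# read from a t table.
t_mult = 2.262

lower, upper = t_interval(leaf_areas, t_mult)
print(f"95% CI for mean leaf area: ({lower:.2f}, {upper:.2f})")
```

With your own data the degrees of freedom, and therefore the t multiplier, will change with the sample size.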

Q5. Transformations and Confidence Intervals [18 Marks]

In the summer of 2004/5 the Australian research vessel “MV Aurora Australis” took samples of seawater from the Southern Ocean, close to the northern limit of sea ice. High performance liquid chromatography (HPLC) was then used to estimate the concentration of different taxa of phytoplankton, based on the concentrations of different pigments (colours) in the water. The Excel file “phyto.xls” shows the concentration of Chlorophyll-A, the most common pigment, for samples up to 70m deep.

(a) Construct and attach to your assignment a histogram of the chlorophyll-A distribution.

Comment on the shape of the distribution. [5 Marks]

(b) Discuss whether it would be suitable to construct a 95% confidence interval using the data as it is, and explain why or why not. [3 Marks]

(c) Transform the distribution of Chlorophyll-A using the natural log transformation. Attach to your assignment a histogram of the transformed distribution. Does this transformation make the data appear approximately Normally distributed? [5 marks]

(d) Irrespective of your answer in (c), construct a 95% confidence interval for the mean of the transformed data. [3 marks]

(e) Convert your 95% confidence interval back to the original units. (Your answer should be close to the median Chlorophyll-A level.) [3 marks]
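The steps in parts (c) to (e) can be sketched in Python as a check on the logic: take natural logs, build a t interval on the log scale, then exponentiate both endpoints. This is an illustration under stated assumptions. The concentrations below are hypothetical placeholders, not values from “phyto.xls”, the function names are assumptions of the sketch, and the t multiplier is read from a t table.

```python
from math import exp, log, sqrt
from statistics import mean, stdev

def log_scale_interval(sample, t_multiplier):
    """Return a t interval (lower, upper) for the mean of ln(sample)."""
    logs = [log(v) for v in sample]  # natural log; values must be positive
    centre = mean(logs)
    margin = t_multiplier * stdev(logs) / sqrt(len(logs))
    return centre - margin, centre + margin

def back_transform(interval):
    """Exponentiate both endpoints to return to the original units."""
    return tuple(exp(endpoint) for endpoint in interval)

# HYPOTHETICAL right-skewed concentrations; use the phyto.xls data instead.
chlorophyll = [0.12, 0.18, 0.25, 0.31, 0.40, 0.55, 0.73, 1.10, 1.85, 3.20]

# t multiplier for 95% confidence with 9 degrees of freedom, from a t table.
t_mult = 2.262

log_ci = log_scale_interval(chlorophyll, t_mult)
original_ci = back_transform(log_ci)
print(f"CI on log scale:     ({log_ci[0]:.3f}, {log_ci[1]:.3f})")
print(f"CI in original units: ({original_ci[0]:.3f}, {original_ci[1]:.3f})")
```

The back-transformed interval is centred on the geometric mean, not the arithmetic mean, which is why it is expected to sit close to the median for right-skewed data of this kind.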
