Stat 421, Fall 2008

Fritz Scholz

Homework 1

Due Friday, October 3rd, 2008

Problem 1: The command colors() produces the names of all 657 colors available for plotting in R. Try out the command:

plot(rep(1:10,2),rep(1:2,each=10),col=colors()[31:50],ylim=c(0,3),pch=16,cex=1.5)

For col, pch and cex look under par in the html interface opened by help.start() on the R command line. Similarly, examine the documentation on rep and see what you get when invoking rep(1:10,2) and rep(1:2,each=10) on the R command line, respectively. The exercise below is intended to give you a view of the first 650 (= 25 × 26) colors by plotting points in 5 × 5 arrays.
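As a quick illustration of the two forms of rep mentioned above, here is a minimal sketch on a smaller grid (3 columns, 2 rows). The variable names a and b are only placeholders for this example and are not part of the assignment.

```r
# Small stand-ins for the calls in the assignment (3 columns, 2 rows).
a <- rep(1:3, 2)         # the whole vector repeated twice:   1 2 3 1 2 3
b <- rep(1:2, each = 3)  # each element repeated three times: 1 1 1 2 2 2
print(a)
print(b)
```

Pairing a with b position by position gives the grid points (1,1), (2,1), (3,1), (1,2), (2,2), (3,2).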

Write a function that plots a solid dot for each of the first 25 colors. Arrange the dots in a 5 × 5 array with positions (1,1), (2,1), …, (5,1), (1,2), …, (5,2), …, (1,5), …, (5,5). In order to do these plots it will be necessary to create the position vectors x and y of length 25, x being 1,2,3,4,5 repeated 5 times and y being 1,1,1,1,1, followed by 2,2,2,2,2, …, followed by 5,5,5,5,5. Try to use the rep function as illustrated in the above example. Using the text command, place the respective color names as given by colors() above each dot. Use the same x,y grid with slightly increased y-values, i.e., text(x,y+.3,colors()[1:25]), for positioning the text vector of color names.

Do the above for plotting the first 25 colors only, until you get a satisfactory result.

After that, rather than writing down the plot command 26 times (appropriately modified) try to do it in a for-loop. The grid coordinate vectors x and y stay the same, but the color indexing has to shift to the next block of 25 colors, e.g., colors()[(i-1)*25+1:25] gives you the i-th group of 25 colors.
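The index expression above works because in R the colon operator binds more tightly than + and *, so (i-1)*25+1:25 is read as (i-1)*25 + (1:25). A minimal check, with i fixed at 3 purely for illustration:

```r
i <- 3
idx <- (i - 1) * 25 + 1:25   # parsed as (i - 1) * 25 + (1:25)
print(range(idx))            # first and last index of the 3rd block: 51 75
print(length(idx))           # 25 indices, one per grid position
```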

After each such plot invoke readline("hit return\n"). This pauses the function and allows you to save the produced plot as a file or copy it to the clipboard for inclusion in a Word or other suitable document (select the graphics window, click File, and choose Save as or Copy to the clipboard). Once you are done with that, hit return and the function processing continues.

Give the code of your function and the 21st plot.

To make it easy to count the plots, add the option main=paste("Plot",i) to your plot command, where i is the for-loop counting parameter. If the dot in position (4,2) of the 21st plot has the label orchid1, you should be on the right track.

Problem 2: Write a function sample.plot with arguments as shown below

sample.plot=function(n=100,Nsim=10000,outlier=1,nbin=100){

# you fill in the rest

}

that does the following:

It creates a vector y of length n, consisting of the integers 1,2,3, …, n.

Then it replaces y[1] by the value given by outlier.

It initializes a vector z=NULL.

A: Using the function sample (see documentation) it samples from y with replacement a vector x of length n and computes its average xbar (using the function mean).

Then it updates the z vector by concatenating xbar to it, i.e., z=c(z,xbar).

It repeats the process A in a for-loop Nsim times and thus creates a vector z of Nsim averages. It then plots the histogram of z, using hist(z,nclass=nbin).
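The two building blocks of step A, sampling with replacement and growing a vector with c(), can be tried out on a toy vector first. This is only a sketch of the individual calls, not the requested function: the vector 1:5 and the seed are assumptions made for illustration, and the loop and histogram are left out.

```r
set.seed(1)                                 # arbitrary seed, only so reruns match
y <- 1:5                                    # toy data, not the y of the assignment
x <- sample(y, length(y), replace = TRUE)   # values of y may appear more than once
xbar <- mean(x)

z <- NULL
z <- c(z, xbar)                             # z now has length 1
z <- c(z, mean(sample(y, length(y), replace = TRUE)))
print(length(z))                            # z grows by one element per update
```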

Provide the function that you construct and the two plots when you use outlier=1 and outlier=2000, respectively.

Explain the nature of the first plot: why does it look that way? Recall an important theorem from probability theory.

What are the similarities and differences between the first plot and the second plot?

Try to explain the nature of the differences. How do they come about? Think in terms of the number of outliers that you might see in the respective samples you create and also in terms of what you see in the first plot.

Problem 3: Write a function

LLN=function (p=.3,N=100000){

# for you to fill in

}

that illustrates the law of large numbers (LLN). For that purpose generate a random sequence x of 0’s and 1’s with probability 1-p and p, respectively, using the function call x=rbinom(N,1,p), see documentation. Compute the sequence (the vector phat) of proportions of 1’s as the number of trials progresses all the way to N by making use of the function cumsum(x) divided by the appropriate denominator vector. Then plot this N-vector phat against 1:N and add a horizontal line at level p by using the abline function. Note: when you divide one vector of length n by another vector of length n you get an n-vector composed of the coordinate-wise quotients, e.g., c(1,2,4)/c(1,2,3)=c(1,1,4/3).
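To see what cumsum and coordinate-wise division return, here is a small sketch on hand-picked vectors. The 0/1 vector below is made up for illustration rather than generated by rbinom, and the choice of denominator vector for phat is deliberately left open.

```r
x <- c(0, 1, 1, 0, 1)          # made-up 0/1 sequence, not random
s <- cumsum(x)                 # running count of 1's: 0 1 2 2 3
print(s)

q <- c(1, 2, 4) / c(1, 2, 3)   # the quotient from the note: 1, 1, 4/3
print(q)
```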

Provide your code and 2 plot examples.