ISDS 361A
Class Project
(10 Points in the Course)
DUE: MW CLASS – M May 11; TR CLASS – T May 12
Our discussion of regression has included the following topics:
SIMPLE LINEAR REGRESSION
1. Generating a linear regression equation
2. Testing if you can conclude a linear relation exists between y and x
3. Getting a confidence interval for the slope (β1) -- and what that means
4. Determining r² – and what that means
5. Predicting the value of y given a specific x value (and generating a 95%
PREDICTION INTERVAL)
6. Estimating the average value of y given a specific x value (and generating a 95%
CONFIDENCE INTERVAL)
7. Knowing (stating) the assumptions behind regression analyses
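Topics 1 through 4 above can be sketched in a few lines of Python. This is only an illustrative sketch of the standard least-squares formulas, not a substitute for the Excel regression output the project asks for. The function names and the five data points are hypothetical (made up for illustration, not market data), and the critical value 3.182 is t(0.025, df = 3) read from a t-table; with the project's 58 observations the degrees of freedom would be 56 and the critical value would be about 2.00.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x with the usual summary numbers."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)

    b1 = ss_xy / ss_xx                      # slope
    b0 = y_bar - b1 * x_bar                 # intercept
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))          # standard error of estimate
    se_b1 = s_e / math.sqrt(ss_xx)          # standard error of the slope
    return {
        "b0": b0,
        "b1": b1,
        "r2": 1 - sse / ss_yy,              # share of variation in y explained by x
        "se_b1": se_b1,
        "t_b1": b1 / se_b1,                 # t statistic for H0: beta1 = 0
    }

def slope_interval(fit, t_crit):
    """Confidence interval for the slope: b1 +/- t_crit * se(b1)."""
    half_width = t_crit * fit["se_b1"]
    return fit["b1"] - half_width, fit["b1"] + half_width

# Made-up numbers purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_regression(x, y)
low, high = slope_interval(fit, t_crit=3.182)  # t(0.025, df = 3) from a t-table
```

If the interval for the slope excludes 0 (equivalently, if the p-value for the slope test is small), you can conclude that a linear relation exists between y and x.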
MULTIPLE REGRESSION
1. Determining a linear regression equation
2. Testing if you can conclude any of the independent variables (the x’s) are
linearly related to the dependent variable (y) – F-test
3. Testing if you can conclude a specific independent variable (x) is linearly
related to the dependent variable (y) when the other independent
variables are held constant – the t-tests
4. Determining r² – and what that means
5. Predicting the value of y given specific values for the x’s
6. Using dummy variables
7. Knowing (stating) the assumptions behind regression analyses
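For topics 2 and 4 in the list above, r² and the F statistic both come from the same sums of squares. The sketch below assumes the fitted values come from a least-squares fit with an intercept, so that SSR = SST - SSE. The function name and the six data points are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def r2_and_f(y, y_hat, k):
    """Return (r2, F) for a least-squares model with k independent variables."""
    n = len(y)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained variation
    ssr = sst - sse                                         # explained variation
    r2 = ssr / sst
    f_stat = (ssr / k) / (sse / (n - k - 1))                # MSR / MSE
    return r2, f_stat

# Made-up observed and fitted values, for illustration only.
y_obs = [1, 2, 3, 4, 5, 6]
y_fit = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
r2, f_stat = r2_and_f(y_obs, y_fit, k=2)
```

A large F (a small "Significance F" in the Excel output) lets you conclude that at least one of the x’s is linearly related to y; the individual t-tests then address each x separately.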
OTHER MULTIPLE REGRESSION TOPICS – NOT COVERED IN THIS PROJECT
1. Estimating the average value of y given specific values for the x’s
2. Polynomial Models
3. Stepwise Regression
4. Testing Parts of the Model
THE PROJECT
Select a stock from the New York Stock Exchange that begins with the first letter of your last name; e.g., if your last name were Michaels, you might choose MSFT (Microsoft) or MRK (Merck).
Record the daily CLOSING PRICEs for the first 58 trading days since President Obama took office (from JAN 20, 2009 to APRIL 13, 2009) of:
o y = your stock (the dependent variable)
o x1 = overall market represented by the Dow Jones Industrial Average (INDU)
o x2 = a precious metals index represented by PHLX Gold and Silver Index (^XAU)
o x3 = an oil measure represented by the iPath S&P GSCI Crude Oil Index (OIL)
o x4 = whether or not the day was a day after a trading holiday (1 = YES; 0 = NO)
o x5 = another factor (stock, mutual fund, index, etc.) you believe could
affect your stock’s price (explain in the report why you chose this one as a
possible relevant factor)
This data is readily available in Yahoo! Finance in spreadsheet form by going to GET QUOTES, then selecting “Historical Data” and putting the bullet in “Daily”. You can simply cut and paste this information into an Excel spreadsheet.
NOTE: For x4, the only three trading holidays have been: Monday, January 19 (Martin Luther King Day), Monday, February 16 (President’s Day), and Friday, April 10 (Good Friday).
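The x4 dummy variable can be filled in by hand, but the rule is simple enough to sketch in code. The snippet below is one possible way to do it, assuming that "the day after a trading holiday" means the first trading day whose preceding weekday was one of the three holidays listed. The function names are hypothetical.

```python
from datetime import date, timedelta

# The three trading holidays listed in the note (all in 2009).
HOLIDAYS = {date(2009, 1, 19), date(2009, 2, 16), date(2009, 4, 10)}

def previous_weekday(d):
    """Most recent Monday-to-Friday date strictly before d."""
    d -= timedelta(days=1)
    while d.weekday() >= 5:      # 5 = Saturday, 6 = Sunday
        d -= timedelta(days=1)
    return d

def day_after_holiday(d):
    """x4 = 1 if the weekday just before trading day d was a holiday, else 0."""
    return 1 if previous_weekday(d) in HOLIDAYS else 0

# Example: the first trading day of the sample, and the Monday after Good Friday.
x4_jan20 = day_after_holiday(date(2009, 1, 20))
x4_apr13 = day_after_holiday(date(2009, 4, 13))
```

Under this rule only three rows of the data set get x4 = 1 (January 20, February 17, and April 13); every other trading day gets 0.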
PART I
o Do a simple linear regression of your stock on any one (one only) of the first four (x1, x2, x3, or x4) independent variables. Choose one you think could most influence the value of your stock.
o Predict the value of your stock with both a point estimate and an interval estimate using: x1 = 8029.62 or x2 = 125.12 or x3 = 19.04 or x4 = 0 (these are the real values of these factors on April 15).
o Look up the value of your stock on April 15. Note how close it was to the point estimate and note if it fell within the interval estimate.
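The point estimate and interval estimate in PART I can be sketched as follows. This is an illustrative sketch of the textbook 95% prediction interval, ŷ ± t · s · sqrt(1 + 1/n + (x_new − x̄)² / SSxx), not the required Excel work. The function name and the data are hypothetical (made up, not market data), and the critical value must be supplied from a t-table: 3.182 is t(0.025, df = 3) for this five-point example, whereas 58 observations give df = 56 and a value of about 2.00.

```python
import math

def predict_with_interval(x, y, x_new, t_crit):
    """Return (point estimate, lower, upper) for an individual y at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))           # standard error of estimate

    point = b0 + b1 * x_new
    # The leading "1 +" is what distinguishes a prediction interval for one
    # new y value from a confidence interval for the average y.
    half_width = t_crit * s_e * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / ss_xx)
    return point, point - half_width, point + half_width

# Made-up numbers purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
point, lower, upper = predict_with_interval(x, y, x_new=6, t_crit=3.182)

actual = 11.5                                # hypothetical "actual" value
inside = lower <= actual <= upper            # did it fall within the interval?
```

Comparing `actual` with `point` and checking `inside` mirrors the last step of PART I: note how close the real April 15 price was to the point estimate and whether it fell within the interval.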
PART II
o Hypothesize a multiple linear regression model of the form:
y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + ε
where x1, x2, x3, x4, x5 are the factors listed above.
o Predict the value of your stock with a point estimate using x1 = 8029.62, x2 = 125.12, x3 = 19.04, and x4 = 0 (these are the real values of these factors on April 15) and the value of your x5 stock on April 15.
o Note how close the point estimate was to the actual value of your stock.
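To show what the multiple regression in PART II is doing behind the scenes, here is a minimal sketch that solves the least-squares normal equations directly. It is an illustration under stated assumptions, not the required method: the function names are hypothetical, the data are made up (generated exactly from y = 1 + 2·x1 − 3·x2), and only two x’s are used for brevity where the project uses five. With real prices on very different scales (for example, a Dow value near 8000), a dedicated regression tool such as Excel's is the safer choice numerically.

```python
def solve(a, b):
    """Solve the linear system a*z = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z

def fit_multiple(rows, y):
    """Coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(r) for r in rows]         # leading 1 for the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

def predict(coeffs, x_values):
    """Point estimate: b0 + b1*x1 + ... + bk*xk."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], x_values))

# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2.
rows = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 3), (3, 2)]
y = [3, -2, 0, 2, -6, 1]
coeffs = fit_multiple(rows, y)
estimate = predict(coeffs, (2, 2))
```

For the project, the same `predict` step corresponds to plugging the April 15 values of x1 through x5 into the fitted equation from the Excel output.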
THE REPORT
After doing the appropriate statistical analyses, present the results in a short “business report” to the CEO (me).
· PAGE 1 -- The report should begin with a title page.
· PAGE 2 -- The next page should give a brief (two paragraph) description of the company under study (the dependent variable, y).
· Then give a brief (two paragraph) description of the company or fund, etc. that you chose for x5, and state why you thought it might be related to your stock, y.
· PAGES 3 and 4 – State what the report is attempting to do. In PART I, you are trying to determine whether a linear relation exists between your stock, y, and one of the x’s (other than x5); state why you chose this one. In PART II, you are trying to see how well a model using all 5 variables predicts the value of your stock.
o Then print the Excel table of the data for your y and the five x values for all the trading days between January 20, 2009 and April 13, 2009.
· The report now consists of two parts
· For Part I, beginning on page 5:
o Answer the 7 questions dealing with simple linear regression on the first page of this project description, but do it in the context of a flowing “business report”.
o State whether or not you believe your simple linear regression model is a good model. Note that a good model is not one that has a close prediction for this one day; it is one that has a high r² and a low p-value for the β1 test. Explain what this means in the report.
o Do, however, compare the true value of y on April 15 to the predicted value, comment on the difference, and note whether or not it fell within the prediction interval.
o Also state whether your stock appears to be positively related to your x value (meaning it goes up when the x value goes up) or negatively related (meaning it goes down as your x value goes up) – this is easily found by looking at the sign of the slope of your line, b1.
· For Part II,
o Answer the 7 questions dealing with multiple linear regression on the first page, but do it in the context of a flowing “business report”, stating what they mean.
· In the appendix,
o Include your two Excel regression outputs (one for PART I and one for PART II) that justify your report conclusions.
The report should be neat and grammatically correct, with correct punctuation and spelling, and it should “flow”. Place the report in a binder and turn it in.