MATH 5305 Statistical Models

Instructor

Dr. Jesse Crawford

Office phone: (254) 968-9536

Email:

Office: Math 332

Website: faculty.tarleton.edu/crawford

Office Hours

MW 1:00 – 2:00

TR 2:30 – 3:30

You are highly encouraged to visit my office for help.

Course Meeting Times

MW 5:15 – 6:30 in Math 213

Required Materials

Statistical Models: Theory and Practice, revised edition, by David Freedman.

Optional Materials

Applied Linear Statistical Models, by Kutner et al.

Applied Logistic Regression, 2nd ed., by Hosmer and Lemeshow.

Multivariate Data Reduction and Discrimination with SAS Software, by Khattree and Naik.

Homework

Homework will be assigned almost every class meeting and will be due a week later. It is crucial to keep up with the homework to succeed in this course.

Grades

Course averages will be computed as follows.

Assignment / % of Grade
Homework / 40%
Midterm Exam / 30%
Final Project / 30%
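
As an illustration of how the weights above combine, here is a minimal sketch of the course average as a weighted sum. The function name `course_average`, the 0-100 score scale, and the example scores are assumptions made for illustration; they are not part of the official grading procedure.

```python
# Weights taken from the grade table above.
WEIGHTS = {"homework": 0.40, "midterm": 0.30, "final_project": 0.30}


def course_average(scores):
    """Return the weighted course average for component scores on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: 90 on homework, 80 on the midterm, 85 on the final project.
example = course_average({"homework": 90, "midterm": 80, "final_project": 85})
print(round(example, 2))  # 0.40*90 + 0.30*80 + 0.30*85 = 85.5
```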

Students with Disabilities: It is the policy of Tarleton State University to comply with the Americans with Disabilities Act and other applicable laws. If you are a student with a disability seeking accommodations for this course, please contact Trina Geye, Director of Student Disability Services, at 254.968.9400. Student Disability Services is located in Math 201. More information can be found in the University Catalog.

Academic Integrity: The Tarleton State University Mathematics Department takes academic integrity very seriously. The usual penalty for a student caught cheating includes an F in the course. Further penalties may be imposed, including expulsion from the university.

Student Learning Outcomes

Knowledge outcomes: Students will demonstrate knowledge of the following topics:

a) Basics of experimental design, such as the distinction between observational studies and experiments, randomization, blinding, and confounding variables.

b) The mathematical assumptions of statistical models, such as simple and multivariate linear regression models and logistic regression models.

c) Techniques of estimation and hypothesis testing for these models, including ordinary least squares, generalized least squares, maximum likelihood estimation, t-tests, F-tests, and likelihood ratio tests.

Skill outcomes: Students will demonstrate proficiency in the following skills:

d) Using software to fit statistical models to real data sets and make predictions.

e) Assessing the appropriateness of models with diagnostics, such as the Shapiro-Wilk test, Brown-Forsythe test, Durbin-Watson test, and various residual plots.

f) Addressing problems with models using remedial measures such as Box-Cox transformations and generalized least squares.

g) Analyzing empirical papers that use statistical models.
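
To give a rough sense of what the skill outcomes above involve, the following sketch fits a simple regression line by ordinary least squares, makes a prediction, and computes the Durbin-Watson statistic from the residuals. It is only an illustration under stated assumptions: the helper names `fit_line` and `durbin_watson` are hypothetical, the data are made up, and plain Python stands in for whatever statistical software is used in the course.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def durbin_watson(residuals):
    """Sum of squared successive residual differences over the residual sum of squares."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Made-up toy data for illustration only (not from the course or any textbook).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
prediction = intercept + slope * 7.0  # predicted response at a new x value
dw = durbin_watson(residuals)

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"prediction at x=7: {prediction:.3f}, Durbin-Watson: {dw:.3f}")
```

The Durbin-Watson statistic always lies between 0 and 4, and values near 2 suggest little first-order autocorrelation in the residuals.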

Sections of Primary Interest

Statistical Models: Theory and Practice, by David Freedman

1 Observational Studies and Experiments

1.1 Introduction

1.2 The HIP trial

1.3 Snow on cholera

1.4 Yule on the causes of poverty

Exercise set A

1.5 End notes

2 The Regression Line

2.1 Introduction

2.2 The regression line

2.3 Hooke’s law

Exercise set A

2.4 Complexities

2.5 Simple vs multiple regression

Exercise set B

2.6 End notes

3 Matrix Algebra

3.1 Introduction

Exercise set A

3.2 Determinants and inverses

Exercise set B

3.3 Random vectors

Exercise set C

3.4 Positive definite matrices

Exercise set D

3.5 The normal distribution

Exercise set E

3.6 If you want a book on matrix algebra

4 Multiple Regression

4.1 Introduction

Exercise set A

4.2 Standard errors

Things we don’t need

Exercise set B

4.3 Explained variance in multiple regression

Association or causation?

Exercise set C

4.4 What happens to OLS if the assumptions break down?

4.5 Discussion questions

4.6 End notes

5 Multiple Regression: Special Topics

5.1 Introduction

5.2 OLS is BLUE

Exercise set A

5.3 Generalized least squares

Exercise set B

5.4 Examples on GLS

Exercise set C

5.5 What happens to GLS if the assumptions break down?

5.6 Normal theory

Statistical significance

Exercise set D

5.7 The F-test

“The” F-test in applied work

Exercise set E

5.8 Data snooping

Exercise set F

5.9 Discussion questions

5.10 End notes

Applied Linear Statistical Models, 5th ed., by Kutner, Nachtsheim, Neter, and Li.

3 Diagnostics and Remedial Measures

3.1 Diagnostics for Predictor Variable

3.2 Residuals

3.3 Diagnostics for Residuals

3.4 Overview of Tests Involving Residuals

3.5 Correlation Test for Normality

3.6 Tests for Constancy of Error Variance

3.7 F Test for Lack of Fit

3.8 Overview of Remedial Measures

3.9 Transformations

9 Building the Regression Model I: Model Selection and Validation

9.1 Overview of Model-Building Process

9.2 Surgical Unit Example

9.3 Criteria for Model Selection

9.4 Automatic Search Procedures for Model Selection

9.5 Some Final Comments on Automatic Model Selection Procedures

9.6 Model Validation

Applied Logistic Regression, 2nd ed., by Hosmer and Lemeshow.

2 Multiple Logistic Regression

2.1 Introduction

2.2 The Multiple Logistic Regression Model

2.3 Fitting the Multiple Logistic Regression Model

2.4 Testing for the Significance of the Model Coefficients

4 Model-Building Strategies and Methods for Logistic Regression

4.1 Introduction

4.2 Variable Selection

4.3 Stepwise Logistic Regression

4.4 Best Subsets Logistic Regression

Multivariate Data Reduction and Discrimination with SAS Software, by Khattree and Naik.

5 Discriminant Analysis

5.1 Introduction

5.2 Multivariate Normality

5.4 Discriminant Analysis: Fisher’s Approach

5.5 Discriminant Analysis for k Normal Populations