STAT 519 Multivariate Analysis
Instructor: Stephen Lee. Office: Brink 412. Phone: 5-7701. E-mail: stevel at uidaho dot edu
Course Homepage: http://www.webpages.uidaho.edu/~stevel/Stat519.EO.html
Course Objectives: The objective of this class is to give you an introductory working knowledge of multivariate data analysis so that you can understand the literature and appropriately analyze many types of multivariate data. The emphasis will be on developing a sound understanding of the methods and of when they should and should not be employed.
Prerequisites: Stat 401 or equivalent coursework. In exceptional cases, the prerequisites may be waived, but you are expected to have a good background in general statistics, regression, and statistical computing. If you are concerned about your preparation, please come see me.
Text:
1. Analyzing Multivariate Data, by J. Lattin, J.D. Carroll, and P.E. Green, Duxbury, 2003.
2. An R and S-PLUS Companion to Multivariate Analysis, by Brian Everitt, Springer, 2007. (the datasets and R code used in the book are available at http://biostatistics.iop.kcl.ac.uk/publications/everitt/)
3. Applied Multivariate Analysis Notes from the University of Cambridge (a free e-version of the text is available at the course homepage)
4. Statistical Data Mining by Wiesner Vos & Ludger Evers. (a free e-version of the text is available at the course homepage)
Intended Course Coverage:
· Introduction to R
· R Graphics
· Vector and Matrix Geometry
· Principal Component Analysis
· Factor Analysis
[Mid-term exam]
· Multivariate Normal Distribution
· MANOVA
· Discriminant Analysis
· Distance Measures
· Hierarchical Clustering
· K-Means Clustering
· Multidimensional Scaling
· Classification Trees (time permitting)
· Random Forests (time permitting)
[Final exam]
Homework: Submit your homework electronically by e-mail with the clear subject heading “EO Stat 519”. Turn in your homework regularly and promptly after you have finished watching the related DVD lectures. I expect to grade your homework according to the corresponding timelines of these lectures.
Please make things easy on me and yourself: make your homework easy to read and grade. Please type whenever you can, and show your work. I will not grade a paper that I can't read.
Many of the problems will involve computer work (see below). For the computational portions of such problems, I will want to see the R commands and their output placed next to the part of the answer they support.
Finally, you must write your homework paper independently, even if you discuss the problems with your peers. Similar homework papers suggest plagiarism, which will result in serious consequences according to UI policy.
Computer Use: We will use the R computer package. R is a GNU-licensed (free) clone of the S-Plus package, and is available for free download (Windows and Unix) from http://cran.us.r-project.org/. I will spend time in class going over the use of R.
Grades: For each person I will compute an overall score according to the formula 50% Homework + 20% Mid-term + 30% Final, and will assign grades according to the traditional cutoffs (i.e., 90-100, A; 80-89, B; …). Curving may or may not be considered once all homework and final grades are recorded, so you should not rely on it for your grade! If there is any curving at all, it will probably be in the form of dropping your worst homework.
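To make the grading formula concrete, here is a minimal R sketch of the calculation. The weights and the A and B cutoffs come from this syllabus; the function names, the sample scores, and the C, D, and F cutoffs are illustrative assumptions (the syllabus only says "traditional" cutoffs), and no curving is applied.

```r
# Weights taken from the syllabus: 50% homework, 20% mid-term, 30% final.
# All three inputs are assumed to be percentages on a 0-100 scale.
overall_score <- function(homework, midterm, final) {
  0.50 * homework + 0.20 * midterm + 0.30 * final
}

# A and B cutoffs are from the syllabus. The C, D, and F cutoffs are an
# ASSUMPTION that the "traditional" scale continues in 10-point steps.
letter_grade <- function(score) {
  if (score >= 90) "A"
  else if (score >= 80) "B"
  else if (score >= 70) "C"
  else if (score >= 60) "D"
  else "F"
}

# Hypothetical student (made-up scores, for illustration only).
s <- overall_score(homework = 92, midterm = 85, final = 88)
g <- letter_grade(s)
```

For this hypothetical student the overall score is 0.50 × 92 + 0.20 × 85 + 0.30 × 88 = 89.4, which falls in the 80-89 range and so maps to a B.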
Exam Dates:
· Mid-term Exam: a take-home exam (no proctor needed) during the middle of the semester, after we finish Factor Analysis
· Final Exam: a take-home exam (no proctor needed) at the end of the semester
Exam Formats: The exams are open book, open everything, and you can use your laptop. The mid-term and final exams are comprehensive and cumulative, i.e., they cover all the material from day 1.