KTH Mathematics  

Regression analysis SF2930
Course content and objectives:

The present course offers an introduction to regression modeling methods with applications. The presentation begins with simple and multiple linear regression models, for which fitting, parametric and model inference, and prediction will be explained. Special attention will be paid to diagnostic strategies, which are key components of good model fitting. Further topics include transformations and weightings to correct model inadequacies, the multicollinearity issue and shrinkage regression methods, and variable selection and model building techniques. Later in the course, some general strategies for regression modeling will be presented, with a particular focus on generalized linear models (GLM), using examples with binary and count response variables.
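As a rough illustration of the first topic, fitting a simple linear regression by least squares, here is a minimal sketch. It is not course material: the course itself works in R, Python is used here only for illustration, and the function names and the data are hypothetical.

```python
# Minimal sketch: least-squares fit of the simple linear regression
# model y = b0 + b1 * x + error. Data and names are illustrative only.

def fit_simple_linear_regression(x, y):
    """Return the least-squares estimates (b0, b1) of intercept and slope."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of x and corrected sum of cross products.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("x must not be constant")
    b1 = s_xy / s_xx          # slope estimate
    b0 = y_bar - b1 * x_bar   # intercept estimate
    return b0, b1


def predict(b0, b1, x_new):
    """Point prediction of the mean response at x_new."""
    return b0 + b1 * x_new


# Hypothetical data, chosen only to make the example runnable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_linear_regression(x, y)
residuals = [yi - predict(b0, b1, xi) for xi, yi in zip(x, y)]

print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print("residuals:", [round(r, 3) for r in residuals])
```

When the model includes an intercept, the least-squares residuals sum to zero (up to rounding), which gives a quick sanity check on the fit.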

To illustrate the influence of electronic computation on regression analysis theory and practice, a number of aspects of computer usage are integrated into the course, based on the statistical software package R.

The overall goal of the course is twofold: to acquaint students with the statistical methodology of regression modeling and to develop the advanced practical skills necessary for applying regression analysis to real-world data analysis problems. The course is taught and examined in English.

Recommended prerequisites:

  • SF1901 or an equivalent first course in probability and statistics.
  • Multivariate normal distribution.
  • Basic differential and integral calculus, basic linear algebra.

Guest lecturers from If:

Course literature and supplementary reading:

  • D. Montgomery, E. Peck, G. Vining: Introduction to Linear Regression Analysis. Wiley-Interscience, 5th Edition (2012). ISBN-13: 978-0-470-54281-1. 645 pages. Acronym below: MPV.
The textbook MPV can be bought at THS Kårbokhandel, Drottning Kristinas väg 15-19. A number of other books cover the topics of the course. Here are some recommendations:
  • G. James, D. Witten, T. Hastie, R. Tibshirani: An Introduction to Statistical Learning. Web page for the book by the publisher Springer.
  • A. J. Izenman: Modern Multivariate Statistical Techniques. Regression, Classification, and Manifold Learning. Web page for the book by the publisher Springer.
  • J. O. Rawlings, S. G. Pantula, D. A. Dickey: Applied Regression Analysis: A Research Tool. Springer, 2nd Edition. Freely available as an ebook (link).

Preliminary plan of lectures and exercise sessions.

  • Lecturers (in alphabetical order): AH = Alexandre Chotard, TK = Timo Koski (guest lecturers from KTH), TP = Tatjana Pavlenko, FR = Felix Rios. Guest lecturers from If: Guest(If). The addresses of the lecture halls and directions to them can be found by clicking on the hall links below.
  • Problems to be solved during the exercise sessions and recommended exercises to be solved on your own are found here.

Day | Date | Time | Hall | Topic | Lecturer
1. Wed | 18/01 | 13-15 | E1 | Lecture 1: Introduction (the course work and computer projects). Introduction to regression modeling. Simple linear regression: model fitting and inference. Chapter 2 in MPV. |
2. Fri | 20/01 | 10-12 | M1 | Lecture 2: Simple linear regression: inference and prediction. Chapter 2 in MPV. |
3. Mon | | 8-10 | D1 | Exercise 1: Simple regression. Problem solving at the board and applications with R. |
4. Thu | 26/01 | 15-17 | F2 | Lecture 3: Multiple linear regression: matrix notation, model fitting and properties of the estimates. Chapter 3 in MPV. |
5. Fri | 27/01 | 10-12 | F2 | Lecture 4: Multiple linear regression: inference and prediction. Chapter 3 in MPV. Project I handout. | TP
6. Mon | 30/01 | 08-10 | E1 | Exercise 2: Multiple regression. Problem solving at the board and applications with R. |
7. Tue | 31/01 | 10-12 | M1 | Lecture 5: Model adequacy checking. Residual analysis. Chapter 4 in MPV. | TP
8. Thu | 2/02 | 10-12 | F2 | Lecture 6: Model adequacy checking (cont.). Transformations to correct model inadequacies. Chapters 4-5 in MPV. |
9. Fri | 3/02 | 08-10 | M1 | Exercise 3: Model adequacy checking, theoretical exercises and applications with R. | FR
10. Mon | 6/02 | 08-10 | E1 | Lecture 7: Methods for detecting influential observations: leverage and measures of influence. Chapter 6 in MPV. |
11. Tue | 07/02 | 15-17 | E1 | Lecture 8: Multicollinearity: sources and effects. Chapter 9 in MPV. | TP
12. Fri | 10/02 | 10-12 | D1 | Exercise 4: Diagnostics for leverage, influence and multicollinearity. Chapters 6 and 9 in MPV. Model diagnostics with R. |
13. Mon | 13/02 | 08-10 | E1 | Lecture 9: Methods for dealing with multicollinearity. Model respecification: ridge and PCA regression. Chapter 9 in MPV. |
14. Tue | 14/02 | 15-17 | F2 | Lecture 10: Variable selection and model building. Chapter 10 in MPV. |
15. Wed | 15/02 | 13-15 | M1 | Exercise 5: Multicollinearity (ridge regression, principal component regression (PCR)). Ch. 10: Variable selection and model building with R. | FR
16. Fri | 17/02 | 13-15 | E1 | Lecture 11: Variable selection and model building (cont.). Chapter 10 in MPV. Bootstrapping in regression. Chapter 15.4 in MPV. | TP
17. Mon | 20/02 | 08-10 | D1 | Lecture 12: Relation to other methods of multivariate statistical analysis: regression and classification, CART. |
18. Wed | 22/02 | 13-15 | F2 | Lecture 13: Models with a binary response variable. Introduction to logistic regression. |
19. Fri | 24/02 | 10-12 | M1 | Lecture 14: Generalized Linear Models (GLM) and exponential families. GLM modeling of binary response variables using logit-link functions. Project II handout. |
20. Mon | 27/02 | 13-15 | M1 | Exercise 6: GLM modeling of Poisson regression. Hypothesis testing and model validation: likelihood ratio test, deviance and Wald test. |
21. Wed | 1/03 | Note! 10.00-12.00 | 4V3Ora | Exercise 7: GLM modeling with R. |
22. Fri | 3/03 | 13-15 | D1 | Lecture 15: Discussion of the Project II results. If presentation. |
23. Mon | 6/03 | 08-10 | Q1 | Lecture 16: Repetition/Reserve. |
 | 14/03 | 08-13 | L21 etc. | Exam | TP
 | 8/06 | 08-13 | TBA etc. | Re-exam | TP