The aim of the course is to introduce some of the basic algorithms and methods of statistical learning theory at an intermediate level. These are essential tools for making sense of the vast and complex data sets (cf. big data) that have emerged in recent decades in fields ranging from biology to marketing to astrophysics. The course presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include classification, artificial neural networks with exponential families of distributions, Bayesian learning, resampling methods, tree-based methods, clustering, and high-dimensional data.
This material covers a good part of the background required for a career in data analytics. The course is taught and examined in English.
Recommended prerequisites:
 SF 1901 or an equivalent course of the type 'a first course in probability and statistics (for engineers)'
 Multivariate normal distribution
 Basic differential and integral calculus, basic linear algebra
 Proficiency in R (optional)
Lecturers:
Course literature:
 G. James, D. Witten, T. Hastie, R. Tibshirani: An Introduction to Statistical Learning (abbreviated ISL below), published by Springer; see the web page for the book
 Some sections of: Avrim Blum, John Hopcroft and Ravindran Kannan: Foundations of Data Science (PDF available from the authors)
 Supplementary reading and material from the lectures (web page)
The textbook ISL can be bought at THS Kårbokhandel, Drottning Kristinas väg 15-19.
Examination:
 Computer homework (3.0 cu): there are two compulsory computer projects (homework assignments), each submitted as a written report. Each report should be produced by a group of two (2) students. The reports are examined at the project presentation seminars on TBA of November and TBA of December, 2017. The computer homework is graded Pass/Fail.
 Written exam (4.5 cu): consists of five (5) assignments and is held on Thursday 11th of January, 2018, 08.00-13.00 hrs.
 Bonus for summaries of the guest lectures and papers:
An individually written summary (max. two A4 pages) of the scientific contents of a guest lecture (2 x EA, LK, SV) gives one (1) bonus point for the exam. In addition, bonus points can be gained by written summaries of at most two scientific articles (TBA). Each summary is expected to be based on the student's own notes taken during the lecture or while reading the paper.
The summaries must be submitted by the deadline, Fri 16th of December at 15 hrs. The bonus points are valid for the ordinary exam on Thursday 11th of January, 2018, and for the re-examination on (TBA). The maximum number of bonus points that can be gained is five (5).
 Important: Students who are admitted to the course and intend to attend it need to activate themselves in Rapp. Log in there using your KTH-id and click on "activate" (aktivera). The codename for sf2935 in Rapp is statin17.
Registration for the written examination via "mina sidor"/"my pages" is required.
Grades are set according to the quality of the written examination.
Grades are given in the range A-F, where A is the best and F means failed.
Fx means that you have the right to a complementary examination (to reach the grade E).
The criteria for Fx are a grade F on the exam and that an isolated part of the course can be identified where you have shown a particular lack of knowledge, such that the grade E can be given after a complementary examination on this part.

Supervision for computer projects
Teaching assistant Daniel Berglund will be available for advice and supervision for computer projects at times to be announced.
Plan of lectures
KTH Social .
(TK = Timo Koski, JO = Jimmy Olsson, TP = Tetyana Pavlenko, DB = Daniel Berglund, EA = Erik Aurell, LK = Lukas Käll, SV = Sara Väljamets, ISL = the textbook, FoDSc = Foundations of Data Science)
The addresses of the lecture halls and guiding instructions are found by clicking on the Hall links below
Day | Date | Time | Hall | Topic | Lecturer
--- | --- | --- | --- | --- | ---
Tue | 31/10 | 13-15 | Q2 | Lecture 1: Introduction to statistical learning (perceptrons, feedforward neural nets) and the course work. Introduction to computer projects. Chapter 2 in ISL | TK
Thu | 02/11 | 08-10 | Q2 | Lecture 2: Supervised Learning Part I. Chapter 4 in ISL | TP
Fri | 03/11 | 10-12 | Q2 | Lecture 3: Supervised Learning Part II. Chapter 4 in ISL | TP
Tue | 07/11 | 14-16 | Q2 | Lecture 4: Bootstrap | TP
Thu | 09/11 | 08-10 | Q2 | Lecture 5: Introduction to R in a computer class. Chapter 2 in ISL | DB
Fri | 10/11 | 10-12 | Q2 | Lecture 6: Feedforward neural networks as statistical models I. Handouts | TK
Tue | 14/11 | 13-15 | Q2 | Lecture 7: Feedforward neural networks as statistical models II; Support vector machines (SVM) I. Chapter 9 in ISL | TK
Thu | 16/11 | 08-10 | Q2 | Lecture 8: SVM II. Chapter 9 in ISL | TK
Fri | 17/11 | 10-12 | Q2 | Lecture 9: Bayesian Learning I. Handouts | TK
Tue | 21/11 | 13-15 | D3 | Lecture 10: Project presentation seminar 1 | TK
Thu | 23/11 | 08-10 | Q2 | Lecture 11: Bayesian Learning II. Handouts | TK
Fri | 24/11 | 10-12 | E3 | Lecture 12: Guest Lecture: TBA | SV
Tue | 28/11 | 13-15 | E3 | Lecture 13: Unsupervised learning part I. Chapter 10 in ISL | TK
Thu | 30/11 | 08-10 | Q2 | Lecture 14: Unsupervised learning part II. Chapter 10 in ISL | TK
Fri | 01/12 | 10-12 | E3 | Lecture 15: Guest Lecture: An insight into computational and statistical mass spectrometry-based proteomics | LK
Tue | 05/12 | 13-15 | E3 | Lecture 16: Random Trees and Classification. Chapter 8 in ISL | JO
Wed | 06/12 | 15-17 | Q2 | Lecture 17: Guest Lecture: Inferring protein structures from many protein sequences I | EA
Thu | 07/12 | 08-10 | Q2 | Lecture 18: Geometry of High-Dimensional Spaces, Gaussians in High Dimensions, Johnson-Lindenstrauss Lemma, Separating Gaussians, Part I. Chap. 2 in FoDSc | TK
Tue | 12/12 | 13-15 | E3 | Lecture 19: Guest Lecture: Inferring protein structures from many protein sequences II | EA
Thu | 14/12 | 08-10 | Q2 | Lecture 20: Geometry of High-Dimensional Spaces, Gaussians in High Dimensions, Johnson-Lindenstrauss Lemma, Separating Gaussians, Part II. Chap. 2 in FoDSc | TK
Fri | 15/12 | 10-12 | E51 | Lecture 21: Project presentation seminar 2 | TK, TP
Thu | 11/01/2018 | TBA | Q24, Q26, Q22 | Exam | TK
TBA | xx/xx/2018 | TBA | TBA | Re-exam | TK


Welcome, we hope you will enjoy the course (and learn (sic) a lot)! Tetyana, Jimmy & Timo
Published by: Timo Koski
Updated:201761012 
