### Courses

##### Last Updated:
10/03/2019 - 15:02

## IAM557 - Statistical Learning and Simulation

Credit: 3(3-0); ECTS: 8.0
Instructor(s): Ömür Uğur
Prerequisites: Consent of Instructor(s)

#### Course Catalogue Description

Brief introduction to Statistical Learning: Regression versus Classification; Linear Regression: simple and multiple Linear Regression; Classification: Logistic Regression, Discriminant Analysis; Resampling Methods: Cross-Validation, the Bootstrap; Regularisation: Subset Selection, Ridge Regression, the Lasso, Principal Components and Partial Least Squares Regression; Nonlinear Models: Polynomial and Splines, Generalised Additive Models; Tree-Based Models: Decision Trees, Random Forest, Boosting; Support Vector Machines; Unsupervised Learning: Principal Component Analysis, Clustering Methods.

#### Course Objectives

By the end of the course, students will have learned:
• the fundamentals of Statistical Learning, regression and classification
• linear and nonlinear regressions including splines
• Generalised Additive Models for both regression and classification problems
• regularisation techniques including Ridge regression and the Lasso
• tree-based methods for regression and classification
• Support Vector Machines, which are widely used in the data science and machine learning communities
• the difference between supervised and unsupervised learning methods

#### Course Learning Outcomes

Students who pass the course satisfactorily will be able to:
• present data together with its descriptive analysis
• distinguish between regression and classification problems
• apply regression or classification algorithms to solve related problems
• code their own algorithms for specific applications in Statistical and Machine Learning
• understand the fundamentals of Support Vector Machines and apply them to specific problems
• distinguish between supervised and unsupervised learning methods in related applications

#### Tentative (Weekly) Outline

1. Brief introduction to Statistical Learning: a) Regression versus Classification
2. Linear Regression: a) simple and multiple Linear Regression
3. Classification: a) Logistic Regression b) Discriminant Analysis (Linear and Quadratic)
4. Resampling Methods: a) Cross-Validation b) the Bootstrap
5. Regularisation: a) Subset Selection b) Ridge Regression c) the Lasso d) Principal Components Regression e) Partial Least Squares Regression
6. Nonlinear Models: a) Polynomial and Splines b) Generalised Additive Models
7. Tree-Based Models: a) Decision Trees b) Random Forest c) Boosting
8. Support Vector Machines
9. Unsupervised Learning: a) Principal Component Analysis b) Clustering Methods

#### Course Textbook(s)

• Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, An Introduction to Statistical Learning - with Applications in R, 1st ed., Springer, 2013 (Corrected at 8th printing 2017)

#### Supplementary Materials and Resources

• Books:
  • Trevor Hastie, Robert Tibshirani, Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed., Springer, 2009 (Corrected at 12th printing 2017)
  • Kevin P. Murphy, Machine Learning: A Probabilistic Perspective, The MIT Press, 2012
  • Peter Harrington, Machine Learning in Action, Manning Publications Co., 2012
  • Charu C. Aggarwal, Neural Networks and Deep Learning: A Textbook, Springer, 2018
  • G. Jay Kerns, Introduction to Probability and Statistics Using R, 1st ed., 2015
  • Robert V. Hogg, Elliot A. Tanis, Dale Zimmerman, Probability and Statistical Inference, 9th ed., 2015
  • Larry Wasserman, All of Statistics - A Concise Course in Statistical Inference, 2004
  • W. N. Venables, D. M. Smith, and the R Core Team, An Introduction to R - Notes on R: A Programming Environment for Data Analysis and Graphics, Version 3.4.2 (2017-09-28)
• Resources:
  • The R Project for Statistical Computing: https://www.r-project.org/
  • Python: https://www.python.org/
  • RStudio: https://www.rstudio.com/
  • Anaconda: https://www.anaconda.com/