In this book club we go beyond classical statistical modelling methods such as linear regression. As computing power has increased over the last 20 years, many new, computationally intensive “statistical learning” methods have been developed.

In particular, the last decade has seen a significant expansion in the number of possible approaches. This book club will provide a very applied overview of modern methods such as Generalized Additive Models, Decision Trees, Boosting, Bagging and Support Vector Machines, as well as more classical linear approaches such as Logistic Regression, Linear Discriminant Analysis, K-Means Clustering and Nearest Neighbors.

We will also introduce code and possibly discuss some of the exercises in the book.

The current plan is to meet **bi-weekly** starting on **Thursday, Nov. 6th 2014** in **ATC B11**. We will try to cover half a chapter per session (except for chapter 2) and possibly add sessions on biological data analysis problems.

**Please register here to receive the announcements** (Internal access only)

We will discuss An Introduction to Statistical Learning with Applications in R, which is the more accessible little brother of a modern classic in the machine learning literature, Elements of Statistical Learning, and can be downloaded for free. A first set of lectures and slides is here.

Lecture slides and videos from a recent course taught by two of the authors are also available, as are solutions to the exercises on GitHub.

Date | Time | Room | Chapter | Topics |
---|---|---|---|---|
06.11.14 | 17:00 | B11 | Chapter 2 | Intro to Stat. Learning, Prediction Accuracy vs Model Interpretability, Bias-Variance Trade-Off |
02.12.14 | 17:00 | A23 | Chapter 3 I | Linear Regression up to 3.3.1 “Qualitative Predictors” |
11.12.14 | 17:00 | A23 | Chapter 3 II | Rest of Chapter 3 |
08.01.15 | 17:00 | B11 | Chapter 3 III | R lab on regression from Chapter 3; link to material |
22.01.15 | 17:00 | B11 | Chapter 4 I | Logistic Regression, Linear Discriminant Analysis (up to 4.4.4) |
05.02.15 | 16:00 | A23 | Chapter 4 II | Logistic regression, microarray classification examples |
19.02.15 | 16:00 | B11 | Chapter 5 I | Cross Validation and Bootstrap; see Tim Hesterberg’s nice review on resampling |
05.03.15 | 16:00 | B11 | Chapter 5 II | Lab on CV, feature selection using CV and its caveats; link to material, paper on selection bias |
26.03.15 | 16:00 | A23 | Chapter 6 I | Regularization: subset regression, ridge and lasso penalties; paper on p-values for the lasso |
16.04.15 | 16:00 | A23 | Chapter 6 II | Lasso and feature selection: application to QTL mapping |
07.05.15 | 16:00 | A23 | Chapter 7 I | Splines and local regression |
21.05.15 | 16:00 | B11 | Chapter 7 II | Non-linear / smoothing methods: lab with examples from RNA-Seq and Hi-C |
09.02.16 | 16:15 | B11 | Chapter 8 / 9 | Introduction to supervised learning with trees, random forests and SVMs. For additional material, see the caret package and accompanying book, as well as a recent evaluation of classifiers and, of course, D. J. Hand’s classic paper on the illusion of progress in classifier technology. |
18.02.16 | 16:00 | B11 | Chapter 8 / 9 | Case studies on classification of metagenomics data, with Georg Zeller; see e.g. this paper. |