This is the home page for BAIT 509 at The University of British Columbia, displaying the 2019 iteration of the course. The syllabus can be found at sauder_syllabus.pdf, but anything listed on this website will take precedence.
Github repository underpinning this website: vincenzocoia/BAIT509
By the end of the course, students are expected to be able to:
- Explain what ML is, in the context of errors and model functions.
- Understand and implement the machine learning paradigm in both R and Python for a variety of ML methods.
- Identify a data table based on a machine learning problem.
- Understand the types of error, and how these influence model choice and model goodness.
- Build and justify an ML model.
- Understand how ML fits into the greater scope of solving a business problem.
At your service!
| Name | Position |
|---|---|
| Vincenzo Coia | Instructor |
| Hossameldin Mohammed | TA |
| Emily Mistick | TA |
| Arjun Baghela | TA |
Details about class meetings will appear here as they become available. Readings are optional, but should be useful.
| # | Topic | Recommended Readings |
|---|---|---|
| cm01; worksheet (.R) | Intro to the course, tools, and ML | ISLR Section 2.1 |
| cm02; worksheet (.html / .Rmd) | Irreducible and Reducible Error | ISLR Section 2.2 (you can stop in 2.2.3 once you get to the "The Bayes Classifier" subsection). |
| cm03; model fitting in python (.html / .ipynb); model fitting in R (.html / .Rmd) | Local methods | ISLR's "K-Nearest Neighbors" section (in Section 2.2.3) on page 39; and Section 7.6 ("Local Regression"). |
| cm04; cross-validation example (.R) | Model Selection | ISLR Section 5.1; we'll be touching on 6.1, 6.2, and 6.3 from ISLR, but only briefly. |
| cm05; CART example (.R) | Classification and Regression Trees | ISLR Section 8.1 |
| cm06; model function example (.R) | Refining business questions | This blog post by datapine does a good job motivating the problem of asking good questions. This blog post by altexsoft does a good job outlining the use of supervised learning in business. |
| cm07; random forest example (.R) | Ensembles | ISLR Section 8.2 |
| cm08; worksheet (.R) | Beyond the mean and mode | |
| cm09; worksheet (a continuation of the previous class's) | SVM | Sections 9.1, 9.2, and 9.4 in ISLR. The details aren't all that important. Section 9.3 is quite advanced, but I'll discuss the main idea behind it in class. |
| cm10; SVM and cross-validation worksheet (.ipynb) | SVM continuation; wrap-up; alternatives to accuracy | Alternative measures, and ROC |
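The cm03 and cm04 topics above (local methods and model selection) pair naturally. As a minimal sketch of the idea, not a course file: the dataset, the candidate values of k, and all names below are illustrative assumptions, using scikit-learn's built-in iris data.

```python
# Illustrative sketch: choosing k for k-NN via 5-fold cross-validation.
# All dataset and parameter choices here are assumptions for illustration.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Estimate out-of-sample accuracy for a few candidate values of k.
scores = {}
for k in [1, 5, 15]:
    knn = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(knn, X, y, cv=5).mean()

# Keep the k with the best cross-validated accuracy.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

The same workflow appears in the course's R examples; only the tooling differs.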
Want to talk about the course outside of lecture? Let's talk during these dedicated times.
| Teaching Member | When | Where |
|---|---|---|
| Arjun | Tuesdays (Jan 15 - Feb 5) 13:00-14:00 | ESB |
| Vincenzo | Wednesdays (Jan 16 - Feb 6) 10:30-11:30 | ESB 3174 |
| Emily | Wednesdays (Jan 16 - Feb 6) | ESB |
| Hossam | Friday, January 11, 15:00-16:00 | ESB 1045 |
| Hossam | Friday, January 18, 16:00-17:00 | ESB 1045 |
| Hossam | Friday, January 25, 15:00-16:00 | ESB 3174 |
| Hossam | Friday, February 1, 15:00-16:00 | ESB 1045 |
Links to assessments will be made available when they are ready. The deadlines listed here are the official ones, and take precedence over the ones listed in the Sauder syllabus.
| Assessment | Due | Weight |
|---|---|---|
| Assignment 1 (.ipynb) | January | 20% |
| Assignment 2 | January 26 at 18:00 | 20% |
| Assignment 3 | February 2 at 18:00 | 20% |
| Final Project | February 8 at 23:59 | 30% |
| Participation | January 31 at 18:00 | 10% |
Please submit everything to UBC Canvas.
- An Introduction to Statistical Learning with R (aka ISLR): a very well-written book covering a lot of concepts in supervised learning.