Practical home assignments of the base course "Machine Learning, part 1" (New Economic School, 2023). Topics per assignment:
- Intro to `pandas` and `matplotlib` (used `plotly` instead)
- EDA and linear regression models `Ridge` and `Lasso` on the New York City Taxi Trip Duration problem data, with a grid search over alpha
- Self-built gradient descent and its variations: (batch) SGD, Momentum, AdaGrad, RMSProp. Tested on the data from the second assignment
- Binary classification problem, usage of `LogisticRegression` and `SVC`. Calibration curves, interpretation of scores. Different ways of encoding categorical variables: `OrdinalEncoder`, `OneHotEncoder`, `TargetEncoder`. Different ways of feature selection (after OHE): embedded methods, filter methods, wrapper methods
- Decision trees (classification), application of `DecisionTreeClassifier`. Self-constructed tree classifier, Gini index. Effects of max depth, min samples in leaf, and min samples in split
- Self-built gradient boosting. Hyperparameter search, usage of the `Optuna` package. Usage of `CatBoostClassifier`, feature importance
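
As a rough illustration of the Gini index mentioned in the decision tree assignment, here is a minimal sketch of how the impurity of a node and of a candidate split can be computed. It is not the code from the assignment: the function names `gini` and `split_gini` are hypothetical, and the toy labels are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention assumed here: an empty node is pure
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted Gini impurity of a split into two non-empty-in-total children."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


# Toy labels (illustrative only): a balanced binary node.
parent = [0, 0, 1, 1]
print(gini(parent))                # 0.5
print(split_gini([0, 0], [1, 1]))  # 0.0, the split separates the classes
```

A tree builder would typically evaluate `split_gini` for each candidate threshold and keep the split with the lowest value, subject to constraints such as max depth and the minimum number of samples in a leaf.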