Numerical Optimization | From-Scratch Implementation | R
This project implements a gradient descent optimization algorithm from scratch in R, applied to credit balance prediction on a high-multicollinearity dataset. The objective is algorithm design rather than prediction: building a custom iterative solver and validating its mathematical integrity against standard pre-packaged regularized regression.
The implementation draws directly on numerical optimization principles (AM 230) and econometric theory (Econ 217, 294A), combining the gradient calculus of MSE minimization with matrix-form linear regression to build a solver that converges to the same solution as glmnet without using it.
Model: Linear specification Y = Xβ, solved iteratively via gradient descent rather than OLS closed-form.
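Loss function:
$$L(\beta) = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - x_i^\top \beta\right)^2 = \frac{1}{n} \lVert y - X\beta \rVert^2$$
Gradient (derived analytically):
$$\nabla L(\beta) = -\frac{2}{n} X^\top (y - X\beta)$$
Update rule:
$$\beta^{(t+1)} = \beta^{(t)} - \alpha \, \nabla L\left(\beta^{(t)}\right)$$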
with step size α = 0.01. Both the loss function and gradient are implemented as standalone modular functions with matrix input validation.
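
As a rough illustration of how the loss, gradient, and update rule could fit together, here is a minimal sketch in base R. The names `mse_loss` and `grad_mse_loss` follow the file names in this repository, but their signatures are assumed; `validate_inputs`, `grad_descent`, `tol`, `max_iter`, the step-norm stopping rule, and the synthetic data are all hypothetical and not taken from the project code.

```r
# Hypothetical validation helper: X must be a numeric matrix conformable
# with the response y and the coefficient vector beta.
validate_inputs <- function(X, y, beta) {
  stopifnot(is.matrix(X), is.numeric(X), is.numeric(y), is.numeric(beta))
  stopifnot(nrow(X) == length(y), ncol(X) == length(beta))
}

# MSE loss: L(beta) = (1/n) * sum((y - X beta)^2)
mse_loss <- function(X, y, beta) {
  validate_inputs(X, y, beta)
  mean((y - as.vector(X %*% beta))^2)
}

# Gradient of the MSE loss: -(2/n) * t(X) %*% (y - X beta)
grad_mse_loss <- function(X, y, beta) {
  validate_inputs(X, y, beta)
  residual <- y - as.vector(X %*% beta)
  as.vector(-(2 / nrow(X)) * crossprod(X, residual))
}

# Fixed-step gradient descent. The stopping rule (step norm below tol)
# is an assumption; the source does not state which criterion it uses.
grad_descent <- function(X, y, alpha = 0.01, tol = 1e-8, max_iter = 100000) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    step <- alpha * grad_mse_loss(X, y, beta)
    beta <- beta - step
    if (sqrt(sum(step^2)) < tol) {
      return(list(beta = beta, iterations = iter, converged = TRUE))
    }
  }
  list(beta = beta, iterations = max_iter, converged = FALSE)
}

# Usage on synthetic, standardized data (not the credit dataset).
set.seed(1)
n <- 200
X <- cbind(1, scale(matrix(rnorm(n * 2), n, 2)))
y <- as.vector(X %*% c(3, 2, -1) + rnorm(n, sd = 0.1))
fit <- grad_descent(X, y)
```

Standardizing the predictors in the usage example is a choice made for this sketch: with a fixed step of 0.01, columns on very different scales can make the iteration diverge or converge slowly.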
| Metric | Value |
|---|---|
| Convergence | 4,906 iterations |
| Final MSE | 9,406.32 |
| Accuracy (1 − WAPE) | 92.46% |
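
The accuracy figure in the table is one minus the weighted absolute percentage error. The source does not spell out the formula, so the sketch below assumes the common definition WAPE = Σ|y − ŷ| / Σ|y|. The helper name `wape` and the toy numbers are hypothetical.

```r
# Hypothetical helper; assumes WAPE = sum(|y - y_hat|) / sum(|y|).
wape <- function(y, y_hat) {
  stopifnot(length(y) == length(y_hat), sum(abs(y)) > 0)
  sum(abs(y - y_hat)) / sum(abs(y))
}

# Toy values for illustration only (not results from the credit dataset).
y     <- c(100, 200, 300, 400)
y_hat <- c(110, 190, 330, 380)

# Absolute errors sum to 70 and |y| sums to 1000, so WAPE = 0.07.
accuracy <- 1 - wape(y, y_hat)  # 0.93
```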
The custom solver converges to coefficients consistent with closed-form OLS on the structured credit dataset, validating the correctness of the gradient derivation and update implementation.
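
One way such a check could look is sketched below in base R: run fixed-step gradient descent on correlated synthetic predictors and compare the result with the closed-form solution of the normal equations. The data, the fixed iteration count, and all variable names are assumptions made for illustration; this is not the project's benchmarking code.

```r
set.seed(42)
n  <- 300
x1 <- rnorm(n)
x2 <- 0.9 * x1 + sqrt(1 - 0.9^2) * rnorm(n)  # strongly correlated with x1
X  <- cbind(Intercept = 1, scale(cbind(x1, x2)))
y  <- as.vector(X %*% c(5, 2, -3) + rnorm(n))

# Closed-form OLS via the normal equations: (X'X) beta = X'y
beta_ols <- as.vector(solve(crossprod(X), crossprod(X, y)))

# Fixed-step gradient descent on the MSE loss (fixed iteration budget
# chosen for this sketch rather than a convergence test).
alpha <- 0.01
beta  <- rep(0, ncol(X))
for (i in 1:20000) {
  grad <- as.vector(-(2 / n) * crossprod(X, y - X %*% beta))
  beta <- beta - alpha * grad
}

# Largest coefficient-wise gap between the two solutions
max_abs_diff <- max(abs(beta - beta_ols))
```

A small `max_abs_diff` suggests the gradient and the update are implemented consistently, since both methods minimize the same convex loss.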
Credit Balance dataset (n = 400), designed for high multicollinearity: the variables are closely correlated with one another. This makes it well-suited for testing optimizer precision and convergence stability rather than feature selection.
Features: Income, Limit, Cards, Age, Education, Own, Student, Married, Region, Credit Score
Outcome: Balance (credit card balance)
Because the predictors are so closely correlated, the challenge is navigating the loss surface precisely, not finding the right features.
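
To make the link between correlated predictors and the shape of the loss surface concrete, the sketch below looks at the Hessian of the MSE loss, (2/n)XᵀX, on synthetic nearly collinear data. For this quadratic loss, fixed-step gradient descent is stable only when α < 2/λ_max, and the ratio λ_max/λ_min indicates how elongated the loss contours are. The data and variable names are assumptions, not values from the credit dataset.

```r
set.seed(7)
n  <- 400
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)   # nearly collinear with x1
X  <- scale(cbind(x1, x2))

# Hessian of L(beta) = (1/n) * ||y - X beta||^2 is constant in beta.
H  <- (2 / n) * crossprod(X)
ev <- eigen(H, symmetric = TRUE)$values

cond_number <- max(ev) / min(ev)  # large value: long, narrow valley
alpha_max   <- 2 / max(ev)        # stability bound on the step size
```

A large condition number means progress along the flat direction is slow even when the step size is safely below `alpha_max`, which is one reason iteration counts can run into the thousands.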
├── methods/
│ ├── mse_loss.R # MSE loss function L(β) = (1/n)Σ(y - Xβ)²
│ └── grad_mse_loss.R # Gradient ∇L(β) = -(2/n)Xᵀ(y - Xβ)
├── analysis/
│ └── grad_descent.Rmd # Full implementation, convergence analysis, benchmarking
├── output/
│ └── grad_descent.pdf # Rendered results and convergence plots
└── README.md
R — tidyverse, ggplot2, dplyr
Core concepts: multivariable calculus, linear algebra, numerical optimization