Commit 05d28b2

Refactor linear regression weight calculation method
- Updated the weight calculation in the `LinearRegression` class to use the analytical solution with the normal equation instead of the Moore-Penrose pseudoinverse.
- Modified the exercise documentation to clarify the implications of using the pseudoinverse and the conditions under which it may fail, enhancing the educational context for users.
1 parent 977ebab commit 05d28b2

2 files changed (22 additions, 3 deletions)

src/codes/05-machine_learning/exercise_05.py (2 additions, 1 deletion)

```diff
@@ -7,7 +7,8 @@ def __init__(self):
         self.weights = None
 
     def fit(self, X, y):
-        self.weights = np.linalg.pinv(X) @ y
+        # self.weights = np.linalg.pinv(X) @ y
+        self.weights = np.linalg.inv(X.T @ X) @ X.T @ y
 
     def predict(self, X):
         return X @ self.weights
```
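The refactored class can be exercised in isolation. Below is a minimal, self-contained sketch of the class as it stands after this commit; the synthetic data, the seed, and the variable names (`w_true`, `rng`) are illustrative assumptions, not part of the repository:

```python
import numpy as np

class LinearRegression:
    """Sketch of the class after this commit (normal-equation solution)."""

    def __init__(self):
        self.weights = None

    def fit(self, X, y):
        # Analytical solution via the normal equation; raises
        # numpy.linalg.LinAlgError when X.T @ X is singular.
        self.weights = np.linalg.inv(X.T @ X) @ X.T @ y

    def predict(self, X):
        return X @ self.weights

# Sanity check on noiseless synthetic data: the fit should
# recover the known weights exactly (up to floating-point error).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

model = LinearRegression()
model.fit(X, y)
print(np.allclose(model.weights, w_true))  # → True
```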

src/psets/04.md (20 additions, 2 deletions)
```diff
@@ -8,13 +8,31 @@ In the lecture, we performed linear regression to predict the fluorescence inten
 
 **(a) Linear Regression with Morgan Fingerprints**
 
-Use your object-oriented implementation of linear regression to perform linear regression on the dataset using Morgan fingerprints as features. To reduce the number of features, you can use `pandas` to drop all columns that have a standard deviation of 0, as they are not informative. Compute the mean absolute error (MAE) and plot the predicted vs. actual fluorescence intensity.
+Use your object-oriented implementation of linear regression to perform linear regression on the dataset using Morgan fingerprints as features. What do you observe?
+
+**(b) Moore-Penrose Pseudoinverse**
+
+You will see that `numpy` will probably throw an error of the form `numpy.linalg.LinAlgError: Singular matrix`, indicating that the matrix $\bm{X}^T \bm{X}$ is not invertible. This means that the columns of $\bm{X}$ are linearly dependent, i.e., you can express one column as a linear combination of the other columns.
+
+To solve this problem, we can use the Moore-Penrose pseudoinverse $\bm{X}^+$, which you got to know in the lecture on SVD. Show that the analytical solution for the weights of linear regression
+
+$$
+\vec{w} = (\bm{X}^T \bm{X})^{-1} \bm{X}^T \bm{y}
+$$
+
+is equivalent to:
+
+$$
+\vec{w} = \bm{X}^+ \bm{y}
+$$
+
+Then use `np.linalg.pinv` to compute the pseudoinverse and use it to compute the weights for linear regression. Compute the MAE and plot the predicted vs. actual fluorescence intensity.
 
 How severe do you estimate the degree of overfitting to be? Consider the number of weights in relation to the number of data points.
 
 **(b) Ridge Regression**
 
-Now perform ridge regression using the same dataset. Remember that ridge regression is linear regression with a regularization term added to the loss function:
+To reduce the risk of overfitting, we can use ridge regression. Remember that ridge regression is linear regression with a regularization term added to the loss function:
 
 $$
 \mathcal{L} = \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{f}(\vec{x}_i))^2 + \frac{\lambda}{2} \|\vec{w}\|^2
```
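The failure mode the exercise describes, and both remedies, can be demonstrated on synthetic data. This is a sketch under assumptions: the data, the seed, and the variable names are invented for illustration, and the ridge solution uses the standard closed form $(\bm{X}^T \bm{X} + \lambda \bm{I})^{-1} \bm{X}^T \bm{y}$ rather than anything from the repository:

```python
import numpy as np

rng = np.random.default_rng(1)

# Full-rank design matrix: normal equation and pseudoinverse agree.
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y
w_pinv = np.linalg.pinv(X) @ y
print(np.allclose(w_normal, w_pinv))  # → True

# Duplicating a column makes the columns linearly dependent,
# so X.T @ X is singular and the normal equation fails.
X_sing = np.column_stack([X, X[:, 0]])
singular = False
try:
    np.linalg.inv(X_sing.T @ X_sing)
except np.linalg.LinAlgError:
    singular = True
print(singular)  # → True

# The pseudoinverse still returns the minimum-norm solution ...
w_sing = np.linalg.pinv(X_sing) @ y

# ... and ridge's regularizer makes X.T @ X + lambda*I invertible again.
lam = 1e-2
w_ridge = np.linalg.inv(X_sing.T @ X_sing + lam * np.eye(4)) @ X_sing.T @ y
print(w_sing.shape, w_ridge.shape)
```

The duplicated column produces exactly repeated rows in the Gram matrix, which is why `np.linalg.inv` reliably raises rather than merely being ill-conditioned.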
