From 39083bd8438dc4de6b643c496018c84ca38af8a8 Mon Sep 17 00:00:00 2001 From: Lennart Wittkuhn Date: Tue, 8 Dec 2020 10:41:21 +0100 Subject: [PATCH 1/4] add .gitignore and start ignoring .ipynb_checkpoints --- .gitignore | 1 + 1 file changed, 1 insertion(+) create mode 100644 .gitignore diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..763513e --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +.ipynb_checkpoints From 3b8630de75e0bbaac7ecab6c0a81a6c7b1607d00 Mon Sep 17 00:00:00 2001 From: Lennart Wittkuhn Date: Tue, 8 Dec 2020 11:09:13 +0100 Subject: [PATCH 2/4] fix a few minor typos --- 0-Perceptron-Gradient-Descent.ipynb | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/0-Perceptron-Gradient-Descent.ipynb b/0-Perceptron-Gradient-Descent.ipynb index d0bef90..6d4821d 100644 --- a/0-Perceptron-Gradient-Descent.ipynb +++ b/0-Perceptron-Gradient-Descent.ipynb @@ -86,7 +86,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "`data`is a dictionary with the following elements: " + "`data` is a dictionary with the following elements: " ] }, { @@ -319,7 +319,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# 2. Distinguishing setosa and versicolor by petal length and witdh" + "# 2. Distinguishing setosa and versicolor by petal length and width" ] }, { @@ -445,7 +445,7 @@ "source": [ "The perceptron algorithm is conceptually inspired by biological neurons.\n", "\n", - "A biological neuron receives signals of variable manitude through its dendrites. These input signals are then accumulated in the cell body; If the accumulated signal exceeds a certain threshold, the neuron outputs a signal through its axon:" + "A biological neuron receives signals of variable magnitude through its dendrites. 
These input signals are then accumulated in the cell body. If the accumulated signal exceeds a certain threshold, the neuron outputs a signal through its axon:"
    ]
   },
   {
@@ -532,7 +532,7 @@
    "source": [
     "The classical percpetron algorithm uses a step activation function, which outputs a value of $1$ if the weighted sum is bigger than $0$ and $-1$ otherwise.\n",
     "\n",
-    "Here, we will be using a sigmoid activation function instead. The sigmoid scales the weighted sum to a value between 0 and 1, indicating the probability that the data instance belongs to class 1 (ie., versicolor irises)."
+    "Here, we will be using a sigmoid activation function instead. The sigmoid scales the weighted sum to a value between 0 and 1, indicating the probability that the data instance belongs to class 1 (i.e., versicolor irises)."
    ]
   },
   {
@@ -714,8 +714,8 @@
     "    \"\"\"the cross-entropy loss:\n",
     "    \n",
     "    Args:\n",
-    "        y (array): labels for each insatance (0 or 1)\n",
-    "        y_pred (array): predicted probabilty that\n",
+    "        y (array): labels for each instance (0 or 1)\n",
+    "        y_pred (array): predicted probability that\n",
     "            each instance belongs to class 1\n",
     "    \"\"\"\n",
     "    loss = -(y * np.log(y_pred + zerotol) + (1 - y) * np.log(1 - y_pred + zerotol))\n",
@@ -842,7 +842,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# 5. (Stochastic) gradinet descent"
+    "# 5. (Stochastic) gradient descent"
    ]
   },
   {
@@ -1005,7 +1005,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To give better insight into the process, we will perform each gradient descent step manually, by the use of the `predict`, `loss`, and `update_weights` functions of our `perceptron` (This process is otherwise also wrapped in the `train` functionn).\n",
+    "To give better insight into the process, we will perform each gradient descent step manually, using the `predict`, `loss`, and `update_weights` functions of our `perceptron` (this process is otherwise also wrapped in the `train` function).\n",
     "\n",
     "Note that we do not use the full dataset to update our weights at each iteration, but instead draw a random sample. We use this random sample to compute an estimate of the gradient. This procedure is called [*stochastic* gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) and is common in the machine learning literature for large datasets (for which it would be otherwise too costly to compute the gradient over the entire dataset at each iteration)."
    ]
   },
   {
@@ -1112,7 +1112,7 @@
    "source": [
     "Interesting; It looks like we choose a learning rate that was too large: The first gradient descent step took us all the way across the valley!\n",
     "\n",
-    "Luckily, we then from there traveled safely down to a minimum."
+    "Luckily, from there we then traveled safely down to a minimum."
    ]
   },
   {

From 3f6cc1c44d72d058bb1f10a1d9bdbc4ad049bcf8 Mon Sep 17 00:00:00 2001
From: Lennart Wittkuhn
Date: Tue, 8 Dec 2020 11:10:03 +0100
Subject: [PATCH 3/4] add instructions on how to run jupyter notebooks locally

---
 README.md | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a1d57d1..40c4d2f 100644
--- a/README.md
+++ b/README.md
@@ -3,4 +3,36 @@
 Run this code with Jupyter Binder:
 https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD
 
-[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD)
\ No newline at end of file
+[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD)
+
+## Usage
+
+### Run the Jupyter notebooks locally
+
+1. Clone this repo
+2. Install the required packages listed in [`requirements.txt`](requirements.txt), ideally in a virtual environment, e.g.:
+
+```bash
+$ mkvirtualenv deep-learning-basics -p python3 -r requirements.txt
+```
+
+This command uses [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/index.html) to create a virtual environment called `deep-learning-basics` with the locally installed version of Python 3 (here, Python 3.8.6, see below) and installs all required packages listed in [`requirements.txt`](requirements.txt).
+
+```bash
+$ python3 --version
+Python 3.8.6
+```
+
+3. To make the Jupyter kernel use the virtual environment, run the kernel self-install from inside it:
+
+```bash
+$ python -m ipykernel install --user --name=deep-learning-basics
+```
+
+4. To start the Jupyter interface, run
+
+```bash
+$ jupyter notebook
+```
+
+5. Finally, switch the kernel (named `deep-learning-basics` here) in the Jupyter user interface.

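[Editor's note] For readers skimming the series: the cross-entropy expression visible in patch 2's docstring hunk can be exercised in isolation. The sketch below is illustrative only and is not code from the notebooks — the `sigmoid` helper and the toy arrays are assumptions; only the `loss` line and the `zerotol` guard against `log(0)` come from the diff above:

```python
import numpy as np

def sigmoid(z):
    # squash a weighted sum into a (0, 1) probability
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_loss(y, y_pred, zerotol=1e-10):
    """Mean cross-entropy; zerotol guards against taking log(0)."""
    # same expression as in the patched docstring's code cell
    loss = -(y * np.log(y_pred + zerotol) + (1 - y) * np.log(1 - y_pred + zerotol))
    return loss.mean()

# confident correct predictions yield a small loss, confident wrong ones a large loss
y = np.array([0.0, 1.0])
y_pred = sigmoid(np.array([-3.0, 3.0]))  # roughly [0.05, 0.95]
print(cross_entropy_loss(y, y_pred))
```

The `zerotol` offset matters in practice: without it, a prediction of exactly 0 or 1 on the wrong class produces `log(0)` and an infinite loss.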
From 165f1403a381f8af1afe0bbce6447ac675f9d4bd Mon Sep 17 00:00:00 2001
From: Lennart Wittkuhn
Date: Tue, 8 Dec 2020 11:44:30 +0100
Subject: [PATCH 4/4] fix minor typos

---
 1-Neural-Networks-Backpropagation.ipynb | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/1-Neural-Networks-Backpropagation.ipynb b/1-Neural-Networks-Backpropagation.ipynb
index 559c29c..d1df37d 100644
--- a/1-Neural-Networks-Backpropagation.ipynb
+++ b/1-Neural-Networks-Backpropagation.ipynb
@@ -119,7 +119,7 @@
    "source": [
     "Couldn't we just combine multiple perceptrons to solve this classification problem?\n",
     "\n",
-    "We could train one perceptron to distinguish the red point cloud in the lower left from all others and another perceptron to distinguish the red point cloud in the top right from all others. Subsequently, we could train a third perceptron based on the predictions of the first two: if either of the first two predicts that a data point belongs to their target class (so if either predicts $y=1$), the thrid perceptron would also predict $y=1$. "
+    "We could train one perceptron to distinguish the red point cloud in the lower left from all others and another perceptron to distinguish the red point cloud in the top right from all others. Subsequently, we could train a third perceptron based on the predictions of the first two: if either of the first two predicts that a data point belongs to their target class (so if either predicts $y=1$), the third perceptron would also predict $y=1$. "
    ]
   },
   {
@@ -1673,7 +1673,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "By the use of our handy `plot_training_stats` function, we can get a quick overview of the training statistics as well as the networks perdictive performance in the training and test data:"
+    "Using our handy `plot_training_stats` function, we can get a quick overview of the training statistics as well as the network's predictive performance in the training and test data:"
    ]
   },
   {
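[Editor's note] The notebook prose touched by this series describes stochastic gradient descent for a sigmoid perceptron: predict, compute the cross-entropy loss, and update the weights from the gradient of a random sample. A minimal end-to-end sketch of that idea follows — assumptions throughout (this is not the notebooks' `perceptron` class; the data, learning rate, and batch size are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, w):
    # weighted sum of the inputs, passed through the sigmoid activation
    return sigmoid(X @ w)

def sgd_step(X, y, w, lr=0.1, batch_size=4):
    # draw a random sample to get a cheap estimate of the gradient
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    # gradient of the mean cross-entropy loss w.r.t. w for a sigmoid unit
    grad = Xb.T @ (predict(Xb, w) - yb) / batch_size
    return w - lr * grad

# toy data: a bias column plus one feature; class 1 iff the feature is positive
X = np.column_stack([np.ones(20), rng.normal(size=20)])
y = (X[:, 1] > 0).astype(float)

w = np.zeros(2)
for _ in range(500):
    w = sgd_step(X, y, w)
```

Each call to `sgd_step` mirrors one manual iteration of the notebook's predict → loss → update loop; with a small learning rate the weights descend gradually, whereas a much larger rate can overshoot the valley, as the patched markdown remarks.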