1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
.ipynb_checkpoints
18 changes: 9 additions & 9 deletions 0-Perceptron-Gradient-Descent.ipynb
Expand Up @@ -86,7 +86,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"`data`is a dictionary with the following elements: "
"`data` is a dictionary with the following elements: "
]
},
{
Expand Down Expand Up @@ -319,7 +319,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Distinguishing setosa and versicolor by petal length and witdh"
"# 2. Distinguishing setosa and versicolor by petal length and width"
]
},
{
Expand Down Expand Up @@ -445,7 +445,7 @@
"source": [
"The perceptron algorithm is conceptually inspired by biological neurons.\n",
"\n",
"A biological neuron receives signals of variable manitude through its dendrites. These input signals are then accumulated in the cell body; If the accumulated signal exceeds a certain threshold, the neuron outputs a signal through its axon:"
"A biological neuron receives signals of variable magnitude through its dendrites. These input signals are then accumulated in the cell body; If the accumulated signal exceeds a certain threshold, the neuron outputs a signal through its axon:"
]
},
{
Expand Down Expand Up @@ -532,7 +532,7 @@
"source": [
"The classical perceptron algorithm uses a step activation function, which outputs a value of $1$ if the weighted sum is bigger than $0$ and $-1$ otherwise.\n",
"\n",
"Here, we will be using a sigmoid activation function instead. The sigmoid scales the weighted sum to a value between 0 and 1, indicating the probability that the data instance belongs to class 1 (ie., versicolor irises)."
"Here, we will be using a sigmoid activation function instead. The sigmoid scales the weighted sum to a value between 0 and 1, indicating the probability that the data instance belongs to class 1 (i.e., versicolor irises)."
]
},
{
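As an aside, the two activation functions discussed in this cell can be sketched in isolation as follows (a minimal illustration, not code from the notebook itself):

```python
import numpy as np

def step_activation(z):
    """Classical perceptron activation: 1 if the weighted sum z > 0, else -1."""
    return np.where(z > 0, 1, -1)

def sigmoid(z):
    """Scale the weighted sum z to (0, 1), interpretable as P(class 1)."""
    return 1.0 / (1.0 + np.exp(-z))
```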
Expand Down Expand Up @@ -714,8 +714,8 @@
" \"\"\"the cross-entropy loss:\n",
" \n",
" Args:\n",
" y (array): labels for each insatance (0 or 1)\n",
" y_pred (array): predicted probabilty that\n",
" y (array): labels for each instance (0 or 1)\n",
" y_pred (array): predicted probability that\n",
" each instance belongs to class 1\n",
" \"\"\"\n",
" loss = -(y * np.log(y_pred + zerotol) + (1 - y) * np.log(1 - y_pred + zerotol))\n",
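For reference, the loss computed on this line can be sketched as a standalone function (a hedged sketch; `zerotol` is a small constant guarding against `log(0)`, as in the notebook):

```python
import numpy as np

def cross_entropy(y, y_pred, zerotol=1e-10):
    """Per-instance binary cross-entropy loss.

    y:      true labels (0 or 1)
    y_pred: predicted probability that each instance belongs to class 1
    """
    return -(y * np.log(y_pred + zerotol)
             + (1 - y) * np.log(1 - y_pred + zerotol))
```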
Expand Down Expand Up @@ -842,7 +842,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. (Stochastic) gradinet descent"
"# 5. (Stochastic) gradient descent"
]
},
{
Expand Down Expand Up @@ -1005,7 +1005,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To give better insight into the process, we will perform each gradient descent step manually, by the use of the `predict`, `loss`, and `update_weights` functions of our `perceptron` (This process is otherwise also wrapped in the `train` functionn).\n",
"To give better insight into the process, we will perform each gradient descent step manually, by the use of the `predict`, `loss`, and `update_weights` functions of our `perceptron` (This process is otherwise also wrapped in the `train` function).\n",
"\n",
"Note that we do not use the full dataset to update our weights at each iteration, but instead draw a random sample. We use this random sample to compute an estimate of the gradient. This procedure is called [*stochastic* gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) and is common in the machine learning literature for large datasets (for which it would be otherwise too costly to compute the gradient over the entire dataset at each iteration)."
]
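To make the mini-batch idea concrete, a single stochastic gradient descent update for a sigmoid unit with cross-entropy loss could look roughly like this (an illustrative sketch under assumed array shapes, not the notebook's own `update_weights` implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_step(w, X, y, lr=0.5, batch_size=10):
    """One SGD update: estimate the gradient on a random mini-batch."""
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    y_pred = 1.0 / (1.0 + np.exp(-Xb @ w))    # sigmoid predictions
    grad = Xb.T @ (y_pred - yb) / batch_size  # gradient of mean cross-entropy
    return w - lr * grad
```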
Expand Down Expand Up @@ -1112,7 +1112,7 @@
"source": [
"Interesting; It looks like we chose a learning rate that was too large: The first gradient descent step took us all the way across the valley!\n",
"\n",
"Luckily, we then from there traveled safely down to a minimum."
"Luckily, from there we then traveled safely down to a minimum."
]
},
{
Expand Down
4 changes: 2 additions & 2 deletions 1-Neural-Networks-Backpropagation.ipynb
Expand Up @@ -119,7 +119,7 @@
"source": [
"Couldn't we just combine multiple perceptrons to solve this classification problem?\n",
"\n",
"We could train one perceptron to distinguish the red point cloud in the lower left from all others and another perceptron to distinguish the red point cloud in the top right from all others. Subsequently, we could train a third perceptron based on the predictions of the first two: if either of the first two predicts that a data point belongs to their target class (so if either predicts $y=1$), the thrid perceptron would also predict $y=1$. "
"We could train one perceptron to distinguish the red point cloud in the lower left from all others and another perceptron to distinguish the red point cloud in the top right from all others. Subsequently, we could train a third perceptron based on the predictions of the first two: if either of the first two predicts that a data point belongs to their target class (so if either predicts $y=1$), the third perceptron would also predict $y=1$. "
]
},
{
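The third perceptron described above only has to compute a logical OR of the first two predictions, which a single unit with hand-picked weights can do; a minimal sketch (assuming 0/1 outputs from the first two perceptrons):

```python
def or_perceptron(p1, p2):
    """Weights (1, 1) and bias -0.5 implement OR on 0/1 inputs."""
    weighted_sum = 1.0 * p1 + 1.0 * p2 - 0.5
    return 1 if weighted_sum > 0 else 0
```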
Expand Down Expand Up @@ -1673,7 +1673,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"By the use of our handy `plot_training_stats` function, we can get a quick overview of the training statistics as well as the networks perdictive performance in the training and test data:"
"By the use of our handy `plot_training_stats` function, we can get a quick overview of the training statistics as well as the network's predictive performance in the training and test data:"
]
},
{
Expand Down
34 changes: 33 additions & 1 deletion README.md
Expand Up @@ -3,4 +3,36 @@
Run this code with Jupyter Binder:
https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD)

## Usage

### Run the Jupyter notebooks locally

1. Clone this repo
2. Install the required packages listed in [`requirements.txt`](requirements.txt), ideally in a virtual environment, e.g.:

```bash
$ mkvirtualenv deep-learning-basics -p python3 -r requirements.txt
```

This command uses [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/index.html) to create a virtual environment called `deep-learning-basics` with the locally installed version of Python 3 (here, Python 3.8.6, see below) and installs all required packages listed in [`requirements.txt`](requirements.txt).

```bash
$ python3 --version
Python 3.8.6
```

3. To run the Jupyter kernel inside the virtual environment, run the kernel self-install from within it:

```bash
$ python -m ipykernel install --user --name=deep-learning-basics
```

4. To start the Jupyter interface, run

```bash
$ jupyter notebook
```

5. Finally, switch the kernel (named `deep-learning-basics` here) in the Jupyter user interface.