1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
.ipynb_checkpoints
18 changes: 9 additions & 9 deletions 0-Perceptron-Gradient-Descent.ipynb
Expand Up @@ -86,7 +86,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"`data`is a dictionary with the following elements: "
"`data` is a dictionary with the following elements: "
]
},
{
Expand Down Expand Up @@ -319,7 +319,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Distinguishing setosa and versicolor by petal length and witdh"
"# 2. Distinguishing setosa and versicolor by petal length and width"
]
},
{
Expand Down Expand Up @@ -445,7 +445,7 @@
"source": [
"The perceptron algorithm is conceptually inspired by biological neurons.\n",
"\n",
"A biological neuron receives signals of variable manitude through its dendrites. These input signals are then accumulated in the cell body; If the accumulated signal exceeds a certain threshold, the neuron outputs a signal through its axon:"
"A biological neuron receives signals of variable magnitude through its dendrites. These input signals are then accumulated in the cell body; If the accumulated signal exceeds a certain threshold, the neuron outputs a signal through its axon:"
]
},
{
Expand Down Expand Up @@ -532,7 +532,7 @@
"source": [
"The classical perceptron algorithm uses a step activation function, which outputs a value of $1$ if the weighted sum is bigger than $0$ and $-1$ otherwise.\n",
"\n",
"Here, we will be using a sigmoid activation function instead. The sigmoid scales the weighted sum to a value between 0 and 1, indicating the probability that the data instance belongs to class 1 (ie., versicolor irises)."
"Here, we will be using a sigmoid activation function instead. The sigmoid scales the weighted sum to a value between 0 and 1, indicating the probability that the data instance belongs to class 1 (i.e., versicolor irises)."
]
},
{
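As an aside, the two activation functions discussed in this cell can be sketched in isolation as follows (a minimal illustration, not code from the notebook itself):

```python
import numpy as np

def step_activation(z):
    """Classical perceptron activation: 1 if the weighted sum z > 0, else -1."""
    return np.where(z > 0, 1, -1)

def sigmoid(z):
    """Scale the weighted sum z to (0, 1), interpretable as P(class 1)."""
    return 1.0 / (1.0 + np.exp(-z))
```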
Expand Down Expand Up @@ -714,8 +714,8 @@
" \"\"\"the cross-entropy loss:\n",
" \n",
" Args:\n",
" y (array): labels for each insatance (0 or 1)\n",
" y_pred (array): predicted probabilty that\n",
" y (array): labels for each instance (0 or 1)\n",
" y_pred (array): predicted probability that\n",
" each instance belongs to class 1\n",
" \"\"\"\n",
" loss = -(y * np.log(y_pred + zerotol) + (1 - y) * np.log(1 - y_pred + zerotol))\n",
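For reference, the loss computed on this line can be sketched as a standalone function (a hedged sketch; `zerotol` is a small constant guarding against `log(0)`, as in the notebook):

```python
import numpy as np

def cross_entropy(y, y_pred, zerotol=1e-10):
    """Per-instance binary cross-entropy loss.

    y:      true labels (0 or 1)
    y_pred: predicted probability that each instance belongs to class 1
    """
    return -(y * np.log(y_pred + zerotol)
             + (1 - y) * np.log(1 - y_pred + zerotol))
```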
Expand Down Expand Up @@ -842,7 +842,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. (Stochastic) gradinet descent"
"# 5. (Stochastic) gradient descent"
]
},
{
Expand Down Expand Up @@ -1005,7 +1005,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"To give better insight into the process, we will perform each gradient descent step manually, by the use of the `predict`, `loss`, and `update_weights` functions of our `perceptron` (This process is otherwise also wrapped in the `train` functionn).\n",
"To give better insight into the process, we will perform each gradient descent step manually, by the use of the `predict`, `loss`, and `update_weights` functions of our `perceptron` (This process is otherwise also wrapped in the `train` function).\n",
"\n",
"Note that we do not use the full dataset to update our weights at each iteration, but instead draw a random sample. We use this random sample to compute an estimate of the gradient. This procedure is called [*stochastic* gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) and is common in the machine learning literature for large datasets (for which it would be otherwise too costly to compute the gradient over the entire dataset at each iteration)."
]
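To make the mini-batch idea concrete, a single stochastic gradient descent update for a sigmoid unit with cross-entropy loss could look roughly like this (an illustrative sketch under assumed array shapes, not the notebook's own `update_weights` implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_step(w, X, y, lr=0.5, batch_size=10):
    """One SGD update: estimate the gradient on a random mini-batch."""
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    y_pred = 1.0 / (1.0 + np.exp(-Xb @ w))    # sigmoid predictions
    grad = Xb.T @ (y_pred - yb) / batch_size  # gradient of mean cross-entropy
    return w - lr * grad
```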
Expand Down Expand Up @@ -1112,7 +1112,7 @@
"source": [
"Interesting; It looks like we chose a learning rate that was too large: The first gradient descent step took us all the way across the valley!\n",
"\n",
"Luckily, we then from there traveled safely down to a minimum."
"Luckily, from there we then traveled safely down to a minimum."
]
},
{
Expand Down
4 changes: 2 additions & 2 deletions 1-Neural-Networks-Backpropagation.ipynb
Expand Up @@ -119,7 +119,7 @@
"source": [
"Couldn't we just combine multiple perceptrons to solve this classification problem?\n",
"\n",
"We could train one perceptron to distinguish the red point cloud in the lower left from all others and another perceptron to distinguish the red point cloud in the top right from all others. Subsequently, we could train a third perceptron based on the predictions of the first two: if either of the first two predicts that a data point belongs to their target class (so if either predicts $y=1$), the thrid perceptron would also predict $y=1$. "
"We could train one perceptron to distinguish the red point cloud in the lower left from all others and another perceptron to distinguish the red point cloud in the top right from all others. Subsequently, we could train a third perceptron based on the predictions of the first two: if either of the first two predicts that a data point belongs to their target class (so if either predicts $y=1$), the third perceptron would also predict $y=1$. "
]
},
{
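The third perceptron described above only has to compute a logical OR of the first two predictions, which a single unit with hand-picked weights can do; a minimal sketch (assuming 0/1 outputs from the first two perceptrons):

```python
def or_perceptron(p1, p2):
    """Weights (1, 1) and bias -0.5 implement OR on 0/1 inputs."""
    weighted_sum = 1.0 * p1 + 1.0 * p2 - 0.5
    return 1 if weighted_sum > 0 else 0
```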
Expand Down Expand Up @@ -1673,7 +1673,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"By the use of our handy `plot_training_stats` function, we can get a quick overview of the training statistics as well as the networks perdictive performance in the training and test data:"
"By the use of our handy `plot_training_stats` function, we can get a quick overview of the training statistics as well as the network's predictive performance in the training and test data:"
]
},
{
Expand Down
34 changes: 33 additions & 1 deletion README.md
Expand Up @@ -3,4 +3,36 @@
Run this code with Jupyter Binder:
https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/athms/deep-learning-basics/HEAD)

## Usage

### Run the Jupyter notebooks locally

1. Clone this repo
2. Install the required packages listed in [`requirements.txt`](requirements.txt), ideally in a virtual environment, e.g.:

```bash
$ mkvirtualenv deep-learning-basics -p python3 -r requirements.txt
```

This command uses [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/index.html) to create a virtual environment called `deep-learning-basics` with the locally installed version of Python 3 (here, Python 3.8.6, see below) and installs all required packages listed in [`requirements.txt`](requirements.txt).

```bash
$ python3 --version
Python 3.8.6
```

3. To run the Jupyter kernel inside the virtual environment, run the kernel self-install from within it:

```bash
$ python -m ipykernel install --user --name=deep-learning-basics
```

4. To start the Jupyter interface, run

```bash
$ jupyter notebook
```

5. Finally, switch the kernel (named `deep-learning-basics` here) in the Jupyter user interface.