|
148 | 148 | None, |
149 | 149 | 'setting-up-the-back-propagation-algorithm-part-3'), |
150 | 150 | ('Updating the gradients', 2, None, 'updating-the-gradients'), |
151 | | - ('Example of the calculation of the gradient',
152 | | - 2, |
153 | | - None, |
154 | | - 'example-to-the-calculation-of-gradient'), |
155 | 151 | ('Fine-tuning neural network hyperparameters', |
156 | 152 | 2, |
157 | 153 | None, |
|
301 | 297 | <!-- navigation toc: --> <li><a href="#setting-up-the-back-propagation-algorithm-part-2" style="font-size: 80%;"><b>Setting up the back propagation algorithm, part 2</b></a></li> |
302 | 298 | <!-- navigation toc: --> <li><a href="#setting-up-the-back-propagation-algorithm-part-3" style="font-size: 80%;"><b>Setting up the back propagation algorithm, part 3</b></a></li> |
303 | 299 | <!-- navigation toc: --> <li><a href="#updating-the-gradients" style="font-size: 80%;"><b>Updating the gradients</b></a></li> |
304 | | - <!-- navigation toc: --> <li><a href="#example-to-the-calculation-of-gradient" style="font-size: 80%;"><b>Example of the calculation of the gradient</b></a></li> |
305 | 300 | <!-- navigation toc: --> <li><a href="#fine-tuning-neural-network-hyperparameters" style="font-size: 80%;"><b>Fine-tuning neural network hyperparameters</b></a></li> |
306 | 301 | <!-- navigation toc: --> <li><a href="#hidden-layers" style="font-size: 80%;"><b>Hidden layers</b></a></li> |
307 | 302 | <!-- navigation toc: --> <li><a href="#which-activation-function-should-i-use" style="font-size: 80%;"><b>Which activation function should I use?</b></a></li> |
@@ -942,27 +937,6 @@ <h2 id="updating-the-gradients" class="anchor">Updating the gradients </h2> |
942 | 937 | $$ |
943 | 938 |
|
944 | 939 |
|
945 | | -<!-- !split --> |
946 | | -<h2 id="example-to-the-calculation-of-gradient" class="anchor">Example of the calculation of the gradient </h2>
947 | | - |
948 | | -<p>Consider a simple NN in which the inputs \( \boldsymbol{x} \), the weights
949 | | -\( \boldsymbol{W} \), the biases \( \boldsymbol{b} \) and the outputs
950 | | -\( \boldsymbol{\tilde{y}}=f(\boldsymbol{x};\boldsymbol{\Theta}) \) are all scalars, and that we
951 | | -have only two layers, that is, the output layer is labeled by \( L=2 \).
952 | | -</p>
953 | | - |
954 | | -<p>Our output is then (no boldfaced symbols since all quantities are scalars)</p> |
955 | | -$$ |
956 | | -\tilde{y}=f(x;\Theta)=\sigma_{L=2}(w_2\sigma_1(w_1x+b_1)+b_2).
957 | | -$$ |
958 | | - |
959 | | -<p>For the back-propagation algorithm we will need various partial derivatives. One of these is</p> |
960 | | -$$ |
961 | | -\frac{\partial f(x;\Theta)}{\partial w_1}=
962 | | -\sigma_2'(w_2\sigma_1(w_1x+b_1)+b_2)\, w_2\, \sigma_1'(w_1x+b_1)\, x.
963 | | -$$
964 | | -
965 | | - |
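As a quick sanity check on the chain-rule expression in the removed section above, the following sketch (not part of the lecture source) compares the analytic derivative with a central finite difference. Sigmoid activations and the sample parameter values are assumptions; the notes leave sigma_1 and sigma_2 unspecified.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w1, b1, w2, b2):
    # y_tilde = sigma_2(w_2 sigma_1(w_1 x + b_1) + b_2), all scalars
    a1 = sigmoid(w1 * x + b1)        # hidden layer, L = 1
    return sigmoid(w2 * a1 + b2)     # output layer, L = 2

def df_dw1(x, w1, b1, w2, b2):
    # chain rule: sigma_2'(z_2) * w_2 * sigma_1'(z_1) * x,
    # using sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))
    a1 = sigmoid(w1 * x + b1)
    y = sigmoid(w2 * a1 + b2)
    return y * (1 - y) * w2 * a1 * (1 - a1) * x

x, w1, b1, w2, b2 = 0.7, 0.3, -0.1, 1.5, 0.2   # hypothetical values
eps = 1e-6
fd = (forward(x, w1 + eps, b1, w2, b2)
      - forward(x, w1 - eps, b1, w2, b2)) / (2 * eps)
print(df_dw1(x, w1, b1, w2, b2), fd)  # the two numbers should agree closely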
966 | 940 | <!-- !split --> |
967 | 941 | <h2 id="fine-tuning-neural-network-hyperparameters" class="anchor">Fine-tuning neural network hyperparameters </h2> |
968 | 942 |
|
|