Conversation
…out reinitializing them, and I call the backward function in the second cell twice, then the newly calculated gradients are added on top of the previous gradients
|
That's a feature rather than a bug. Most modern deep learning frameworks do this on purpose: in a training loop you may want to accumulate the gradients of several minibatches before actually updating the parameters, because you cannot always fit all the minibatches into a single step. If you do want to reset between calls, you would leave the leaf nodes' grads alone and only set the grads of the non-leaf nodes to 0.
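For illustration, here is a minimal sketch of that accumulation pattern in PyTorch (the model, the random data, and the accumulate_steps value are all made up for this example, not taken from the videos): gradients from several small minibatches are added into .grad by repeated backward calls, then one optimizer step is taken and the grads are zeroed explicitly.

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    accumulate_steps = 4  # pretend the full batch does not fit in memory at once

    optimizer.zero_grad()
    for step in range(accumulate_steps):
        x = torch.randn(8, 10)                           # one small minibatch of fake data
        y = torch.randn(8, 1)
        loss = loss_fn(model(x), y) / accumulate_steps   # scale so the summed grads average out
        loss.backward()                                  # grads are ADDED into .grad on each call
    optimizer.step()                                     # one parameter update for all 4 minibatches
    optimizer.zero_grad()                                # explicitly reset grads for the next round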
|
Thank you for your explanation. I had clearly misunderstood.
|
Hello Sir, I noticed a small issue in the code while watching your videos; I hope I have provided a good solution for it.
Problem: If, in a Google Colab notebook, I have Value objects initialized and, without reinitializing them, I call the backward function in a different cell twice, then the newly calculated gradients are added on top of the previous gradients.
So I am just setting the grads to 0 inside build_topo, so that any node that has not yet been visited in the current call gets its grad reset to 0.0 (see the sketch below).
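A minimal sketch of the proposed fix, assuming the usual micrograd-style Value class from the videos (the exact class in the notebook may differ slightly): every node reached by build_topo has its grad reset to 0.0 before the backward pass runs, so calling backward twice produces the same gradients both times instead of accumulating them.

    class Value:
        def __init__(self, data, _children=()):
            self.data = data
            self.grad = 0.0
            self._prev = set(_children)
            self._backward = lambda: None

        def __add__(self, other):
            other = other if isinstance(other, Value) else Value(other)
            out = Value(self.data + other.data, (self, other))
            def _backward():
                self.grad += out.grad
                other.grad += out.grad
            out._backward = _backward
            return out

        def __mul__(self, other):
            other = other if isinstance(other, Value) else Value(other)
            out = Value(self.data * other.data, (self, other))
            def _backward():
                self.grad += other.data * out.grad
                other.grad += self.data * out.grad
            out._backward = _backward
            return out

        def backward(self):
            topo, visited = [], set()

            def build_topo(v):
                if v not in visited:
                    visited.add(v)
                    v.grad = 0.0          # proposed fix: reset stale grads while building the topo order
                    for child in v._prev:
                        build_topo(child)
                    topo.append(v)

            build_topo(self)
            self.grad = 1.0               # seed the output node only after the reset
            for v in reversed(topo):
                v._backward()

    # Usage: with the reset in place, repeated backward calls give identical grads.
    a, b = Value(2.0), Value(3.0)
    c = a * b + a
    c.backward()
    c.backward()               # second call resets grads first, so nothing doubles
    print(a.grad, b.grad)      # 4.0 2.0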