Exercise 7.8 question #9

@sharov-am

Hi.

Why do you assume that $\theta'(s^{(l)}_j)=1$ when computing $\delta^{(2)}$ and $\delta^{(1)}$? Only the final layer has an identity output; all previous layers use the $\tanh$ transformation. So, given the new value of $\delta^{(3)}$, we should calculate $\delta^{(2)}$ and $\delta^{(1)}$ as in Example 7.1, namely $\delta^{(i)} = \theta'(s^{(i)}) \otimes \left[W^{(i+1)}\delta^{(i+1)}\right]$ for $i=1,2$, where $\theta(s^{(i)}) = \tanh(s^{(i)})$.
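To make the point concrete, here is a minimal sketch of the backward pass I have in mind, assuming a squared error $e = (x^{(L)} - y)^2$, an identity output layer, and $\tanh$ hidden layers, so that $\theta'(s) = 1 - \tanh^2(s)$ in every hidden layer. The function names `forward` and `backward`, the 0-indexed `weights` list, and the numeric values below are my own illustrative choices, not taken from the solution.

```python
import math

def forward(weights, x):
    """Forward pass. weights[l-1] holds W^(l) as a (d^(l-1)+1) x d^(l) nested list.
    Returns signals ss[l] and layer outputs xs[l] (hidden outputs carry a leading bias 1)."""
    xs = [[1.0] + list(x)]
    ss = [None]  # no signal for the input layer
    L = len(weights)
    for l, W in enumerate(weights, start=1):
        prev = xs[-1]
        # s^(l)_j = sum_i W^(l)_{ij} * x^(l-1)_i
        s = [sum(W[i][j] * prev[i] for i in range(len(prev)))
             for j in range(len(W[0]))]
        ss.append(s)
        if l < L:
            xs.append([1.0] + [math.tanh(v) for v in s])  # tanh hidden layer
        else:
            xs.append(list(s))  # identity output layer, no bias node
    return ss, xs

def backward(weights, ss, xs, y):
    """Sensitivities delta^(l) for l = L, ..., 1 (assumed squared error)."""
    L = len(weights)
    deltas = [None] * (L + 1)
    # Output layer is the identity, so theta' = 1 only here.
    deltas[L] = [2.0 * (xs[L][j] - y[j]) for j in range(len(y))]
    for l in range(L - 1, 0, -1):
        W_next = weights[l]  # this is W^(l+1) because the list is 0-indexed
        d_next = deltas[l + 1]
        deltas[l] = []
        for i in range(1, len(W_next)):  # skip row 0, the bias row
            back = sum(W_next[i][j] * d_next[j] for j in range(len(d_next)))
            theta_prime = 1.0 - math.tanh(ss[l][i - 1]) ** 2  # tanh', not 1
            deltas[l].append(theta_prime * back)
    return deltas

# Illustrative (hypothetical) 1-2-1-1 network and a single data point.
weights = [
    [[0.1, 0.2], [0.3, 0.4]],   # W^(1)
    [[0.2], [1.0], [-3.0]],     # W^(2)
    [[1.0], [2.0]],             # W^(3)
]
x_in, y_out = [2.0], [1.0]
ss, xs = forward(weights, x_in)
deltas = backward(weights, ss, xs, y_out)
print("delta^(3) =", deltas[3])
print("delta^(2) =", deltas[2])
print("delta^(1) =", deltas[1])
```

With these sensitivities the gradient is $\partial e / \partial W^{(l)}_{ij} = x^{(l-1)}_i \delta^{(l)}_j$, which can be compared against a finite-difference estimate. Replacing `theta_prime` with `1.0` in the hidden layers would make that comparison fail unless every hidden signal happened to be zero.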
