Hello,
I may have misunderstood the code, but I think that in the `param_init_gru_cond` function
(https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L390)
the variable `dim` must equal `dim_nonlin`, and likewise `nin` must equal `nin_nonlin`.
The reason is that the matrices `W` and `Wx` have dimensions `(nin, 2*dim)` and `(nin_nonlin, dim_nonlin)` respectively (https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L339).
However, both `W` and `Wx` are multiplied with `state_below_` (https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L429), which implies `nin == nin_nonlin`.
Similarly, at https://github.com/nyu-dl/dl4mt-tutorial/blob/master/session3/nmt.py#L445, `r1` (of size `dim`) is multiplied elementwise with `tensor.dot(h_, Ux)` (of size `dim_nonlin`), which implies `dim == dim_nonlin`. Is my understanding correct? If so, is there a reason for having separate `dim_nonlin` and `nin_nonlin` parameters?
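
To make the shape argument concrete, here is a minimal sketch in plain Python that only tracks shapes and does not run the tutorial code. The helper name `find_shape_conflicts` and the batch size `n_samples` are hypothetical, and I am assuming that `state_below_` has width `nin` and that the product of `r1` and `tensor.dot(h_, Ux)` is elementwise with no broadcasting.

```python
def find_shape_conflicts(nin, dim, nin_nonlin, dim_nonlin, n_samples=2):
    """Return a list of shape conflicts implied by the sizes quoted above.

    Hypothetical helper: it models shapes as tuples only and assumes
    state_below_ has width nin (not taken from the tutorial code itself).
    """
    conflicts = []

    # Shapes as described in the question.
    state_below = (n_samples, nin)
    W = (nin, 2 * dim)
    Wx = (nin_nonlin, dim_nonlin)

    # Both W and Wx are multiplied with the same input, so the inner
    # dimensions of each matrix product must agree.
    for name, weight in (("W", W), ("Wx", Wx)):
        if state_below[1] != weight[0]:
            conflicts.append(
                f"dot(state_below_, {name}): {state_below} x {weight}"
            )

    # r1 has size dim, dot(h_, Ux) has size dim_nonlin; an elementwise
    # product (assumed: no broadcasting) needs identical shapes.
    r1 = (n_samples, dim)
    h_dot_Ux = (n_samples, dim_nonlin)
    if r1 != h_dot_Ux:
        conflicts.append(f"r1 * dot(h_, Ux): {r1} vs {h_dot_Ux}")

    return conflicts


if __name__ == "__main__":
    # Matching sizes: no conflicts are reported.
    print(find_shape_conflicts(nin=4, dim=5, nin_nonlin=4, dim_nonlin=5))
    # Differing sizes: both the input product and the gate product conflict.
    for line in find_shape_conflicts(nin=4, dim=5, nin_nonlin=6, dim_nonlin=7):
        print(line)
```

Under these assumptions, any choice with `nin != nin_nonlin` or `dim != dim_nonlin` reports a conflict, which is why the separate parameters look redundant to me.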
Thank you.