In my hw2 implementation of needle.nn.Dropout, I generated the Bernoulli samples with np.random.binomial, because the assignment does not require using randb from the init module (which uses numpy rand). This passes the dropout test but fails the final Train Mnist test.
In short, I implemented dropout as:
total = math.prod([dim for dim in x.shape])
bernoulli_samples = np.random.binomial(n=1, p=1 - self.p, size=total).astype(x.dtype)
bernoulli_samples = Tensor(bernoulli_samples, requires_grad=False).reshape(x.shape)
return (bernoulli_samples * x) / (1.0 - self.p)
This passes the dropout unit test but fails the final Train Mnist test.
However, if I implement it with the init module:
bernoulli_samples = init.randb(*x.shape, p=(1 - self.p))
return (bernoulli_samples * x) / (1.0 - self.p)
then it passes both tests.
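As far as I can tell, the root cause is that, for the same random state, np.random.binomial and the np.random.rand-based randb behave in opposite ways when the probability is 0.5, so the two versions produce different dropout masks.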
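The difference between the two samplers can be checked outside of needle with a small NumPy-only sketch. The helper names mask_via_rand and mask_via_binomial are made up for illustration, and the `<=` threshold comparison is an assumption about how randb turns uniform samples into a mask; only the fact that randb relies on numpy rand comes from the post above.

```python
import numpy as np


def mask_via_rand(shape, keep_prob, seed):
    # Hypothetical stand-in for init.randb: threshold uniform samples.
    # The exact comparison used by randb is an assumption here.
    np.random.seed(seed)
    return (np.random.rand(*shape) <= keep_prob).astype("float32")


def mask_via_binomial(shape, keep_prob, seed):
    # Same seed, but the mask is drawn with np.random.binomial instead.
    np.random.seed(seed)
    return np.random.binomial(n=1, p=keep_prob, size=shape).astype("float32")


shape, keep_prob, seed = (4, 5), 0.5, 0
rand_mask = mask_via_rand(shape, keep_prob, seed)
binomial_mask = mask_via_binomial(shape, keep_prob, seed)

# Fraction of positions where the two masks agree under the same seed.
agreement = float((rand_mask == binomial_mask).mean())

print("rand-based mask:\n", rand_mask)
print("binomial mask:\n", binomial_mask)
print("fraction of matching entries:", agreement)
```

If the printed fraction is well below 1.0, any test that compares against values produced with the rand-based mask will see different numbers from the binomial version, even though both masks are valid Bernoulli samples.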
I hope the hw2 documentation for the dropout requirement can specify that the init module should be used; otherwise this can take a lot of time to debug.