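Hello, I think your dilated convolution class, as implemented, is not causal because of the zero padding: if the padding is applied to both ends of the sequence, each output can depend on inputs from future time steps. The last comment here proposes a solution that might work better: https://github.com/pytorch/pytorch/issues/1333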
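
To illustrate what I mean, here is a minimal pure-Python sketch (not your class, and not the code from the linked issue). It compares padding split across both sides with padding on the left only. All function names are made up for the example, and I am assuming a kernel of size `k` with dilation `d`, so the causal variant pads `(k - 1) * d` zeros on the left. In PyTorch the equivalent would presumably be something like `torch.nn.functional.pad(x, ((k - 1) * d, 0))` followed by a `Conv1d` with `padding=0`, but I have not tested that against your code.

```python
def dilated_conv1d(x, w, dilation, pad_left, pad_right):
    """Cross-correlate x with kernel w at the given dilation, after zero padding."""
    padded = [0.0] * pad_left + list(x) + [0.0] * pad_right
    span = (len(w) - 1) * dilation  # distance between the first and last kernel tap
    return [
        sum(w[k] * padded[t + k * dilation] for k in range(len(w)))
        for t in range(len(padded) - span)
    ]


def causal_conv1d(x, w, dilation):
    # Left-only padding: output[t] sees x[t], x[t - d], x[t - 2d], ... and nothing later.
    return dilated_conv1d(x, w, dilation, (len(w) - 1) * dilation, 0)


def symmetric_conv1d(x, w, dilation):
    # Padding split across both sides (assumes (k - 1) * d is even).
    half = (len(w) - 1) * dilation // 2
    return dilated_conv1d(x, w, dilation, half, half)


def first_changed_output(conv, w, dilation, length=8, t_change=5):
    """Perturb x[t_change] and return the earliest output index that changes."""
    base = [0.0] * length
    bumped = list(base)
    bumped[t_change] = 1.0
    a, b = conv(base, w, dilation), conv(bumped, w, dilation)
    return next(i for i, (u, v) in enumerate(zip(a, b)) if u != v)


w = [1.0, 1.0, 1.0]
print(first_changed_output(causal_conv1d, w, dilation=2))     # → 5
print(first_changed_output(symmetric_conv1d, w, dilation=2))  # → 3
```

With left-only padding, changing the input at step 5 first affects the output at step 5. With symmetric padding it already affects the output at step 3, which is the leak from the future I am describing.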