How did you fix the loss always being NaN when training SV and MV with ESAM? Also, were you actually able to reproduce the results of the ESAM paper?