Hi!
The method BLASEngine<cpu,float>::gemm in the MSHADOW_STAND_ALONE branch
(dot_engine-inl.h) constructs the dst tensor with the wrong shape.
The current code reads:
Tensor<cpu, 2, float> lhs((float*)B, Shape2(transpose_left ? k : n, transpose_left ? n : k)); // NOLINT(*)
Tensor<cpu, 2, float> rhs((float*)A, Shape2(transpose_right ? m : k, transpose_right ? k : m)); // NOLINT(*)
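Tensor<cpu, 2, float> dst(C, Shape2(m, n)); // <-- wrong shape
After the optional transposes, lhs is n x k and rhs is k x m, so their product is n x m. The line should be: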
Tensor<cpu, 2, float> dst(C, Shape2(n, m));
This error occurs when executing the 'dot' and 'batch_dot' operators in JavaScript.
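
To double-check the shape argument, here is a small sketch that does not use mshadow at all. It only mirrors the Shape2 expressions from the snippet above and works out the shape of the product for every transpose combination. Shape2D, LhsShape, RhsShape and DotShape are hypothetical names I made up for this illustration, and the sketch assumes the stand-alone path computes dst = op(lhs) * op(rhs), where op() is the optional transpose.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for a 2D shape: {rows, cols}. Not mshadow's Shape2.
using Shape2D = std::array<int, 2>;

// Stored shape of lhs, copied from the Shape2 expression in gemm.
inline Shape2D LhsShape(bool transpose_left, int n, int k) {
  return transpose_left ? Shape2D{k, n} : Shape2D{n, k};
}

// Stored shape of rhs, copied from the Shape2 expression in gemm.
inline Shape2D RhsShape(bool transpose_right, int m, int k) {
  return transpose_right ? Shape2D{m, k} : Shape2D{k, m};
}

// Shape of op(lhs) * op(rhs), where op() applies the optional transpose.
inline Shape2D DotShape(Shape2D lhs, bool transpose_left,
                        Shape2D rhs, bool transpose_right) {
  int rows = transpose_left ? lhs[1] : lhs[0];
  int inner_lhs = transpose_left ? lhs[0] : lhs[1];
  int inner_rhs = transpose_right ? rhs[1] : rhs[0];
  int cols = transpose_right ? rhs[0] : rhs[1];
  assert(inner_lhs == inner_rhs);  // inner dimensions must agree (both are k)
  return Shape2D{rows, cols};
}
```

For any m, n, k and any combination of the two transpose flags, DotShape(LhsShape(...), ..., RhsShape(...), ...) comes out as {n, m}, which is why dst needs Shape2(n, m). With Shape2(m, n) the shapes only agree by coincidence when m == n.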