Stochastic gradient descent should be more efficient than full-batch gradient descent and, because its updates are noisy, less likely to get stuck in local minima
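
As a rough illustration of the efficiency half of this claim, here is a minimal sketch that fits a one-parameter linear model with both full-batch gradient descent and stochastic gradient descent, giving each the same number of passes over the data. Everything in it is an assumption made for illustration: the synthetic data, the true slope of 3.0, the learning rate, and the function names (make_data, batch_gd, sgd) are all hypothetical. The problem is convex, so it has no local minima and the sketch says nothing about that half of the claim.

```python
import random


def make_data(n=200, seed=0):
    """Synthetic data y = 3x + noise (the slope 3.0 is an illustrative assumption)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def batch_gd(xs, ys, lr=0.1, epochs=5):
    """Full-batch gradient descent: one update per pass over the data."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the MSE averaged over every sample.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def sgd(xs, ys, lr=0.1, epochs=5, seed=1):
    """Stochastic gradient descent: one update per sample, in shuffled order."""
    rng = random.Random(seed)
    w = 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            # Gradient of the squared error on a single sample.
            grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
            w -= lr * grad
    return w


xs, ys = make_data()
w_batch = batch_gd(xs, ys)
w_sgd = sgd(xs, ys)
print("batch GD:", w_batch, mse(w_batch, xs, ys))
print("SGD:     ", w_sgd, mse(w_sgd, xs, ys))
```

Both runs read each sample five times, but the batch version makes 5 parameter updates while the stochastic version makes 1000, which is why SGD ends up much closer to the true slope here. With a larger learning rate or more epochs the batch version would catch up, so this is a comparison at a fixed data budget, not a general result.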