At each SGD iteration, sample the stochastic gradient multiple times (10-50) and compute the variance across those samples.
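
One way this could look in practice is sketched below, assuming a least-squares objective and NumPy. The function names (`minibatch_grad`, `sgd_with_grad_variance`), the batch size, the learning rate, and the choice to summarize variance as the sum of per-coordinate variances are all illustrative assumptions, not something specified above.

```python
import numpy as np

def minibatch_grad(w, X, y, batch_size, rng):
    """Gradient of the mean squared error on one random minibatch (hypothetical helper)."""
    idx = rng.choice(len(X), size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    residual = Xb @ w - yb
    return 2.0 * Xb.T @ residual / batch_size

def sgd_with_grad_variance(X, y, n_iters=100, n_samples=20,
                           batch_size=32, lr=0.01, seed=0):
    """Run plain SGD and record the gradient variance at every iteration.

    n_samples is the number of gradient draws per iteration (the 10-50 range).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    variances = []
    for _ in range(n_iters):
        # Draw several independent minibatch gradients at the same parameters w.
        grads = np.stack([minibatch_grad(w, X, y, batch_size, rng)
                          for _ in range(n_samples)])
        # Assumption: summarize as the sum of per-coordinate sample variances,
        # i.e. the trace of the sample covariance of the gradient.
        variances.append(grads.var(axis=0, ddof=1).sum())
        # Take the actual SGD step with one of the sampled gradients.
        w -= lr * grads[0]
    return w, np.array(variances)
```

Calling `sgd_with_grad_variance(X, y)` returns the final weights and an array with one variance value per iteration, which can then be plotted against the iteration index. Note that this multiplies the gradient cost per iteration by `n_samples`, so it is better suited to diagnostics than to production training.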