Using a deterministic policy in environments like LunarLander?

Hi and thank you for such a genius algorithm.
I wonder whether SAC is guaranteed to converge when the mean (mu) of the Gaussian policy is used as the deterministic action in environments like LunarLander. I see that some trials fail to converge, especially on LunarLander and Humanoid-v3.
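
To make clear what I mean by "using mu", here is a minimal sketch, assuming a tanh-squashed Gaussian policy. The names `select_action`, `mu`, and `log_std` are hypothetical and only for illustration; they are not taken from the actual SAC code, and the numbers are made up.

```python
import numpy as np

def select_action(mu, log_std, deterministic, rng):
    """Pick an action from a tanh-squashed Gaussian policy.

    mu, log_std: hypothetical policy-network outputs (assumed names).
    deterministic=True uses the mean mu; otherwise a sample is drawn.
    """
    if deterministic:
        pre_tanh = mu  # deterministic action: just the Gaussian mean
    else:
        noise = rng.standard_normal(mu.shape)
        pre_tanh = mu + np.exp(log_std) * noise  # stochastic sample
    return np.tanh(pre_tanh)  # squash into (-1, 1)

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])       # made-up values for illustration
log_std = np.array([-0.5, 0.0])  # made-up values for illustration

a_eval = select_action(mu, log_std, deterministic=True, rng=rng)
a_train = select_action(mu, log_std, deterministic=False, rng=rng)
```

In this sketch `a_eval` is what I would use for the deterministic policy, while `a_train` is the usual stochastic action.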