SAC_Off_policy_and_Offline

An implementation of the Soft Actor-Critic (SAC) algorithm for both off-policy and offline reinforcement learning.
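
For orientation, the two settings can be sketched as follows. This is a minimal illustration and is not taken from this repository's code: the names `soft_q_target`, `ReplayBuffer`, and `frozen`, as well as the default `gamma` and `alpha` values, are assumptions made for the example. It shows the standard SAC critic target, which is the same in both settings, and the one thing that typically differs: an off-policy agent keeps adding new transitions to its buffer, while an offline agent samples from a fixed dataset.

```python
import random


def soft_q_target(reward, done, next_q1, next_q2, next_log_prob,
                  gamma=0.99, alpha=0.2):
    """Standard SAC critic target:
    r + gamma * (1 - done) * (min(Q1, Q2) - alpha * log pi(a'|s')).
    gamma and alpha defaults are illustrative assumptions."""
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    return reward + gamma * (1.0 - done) * soft_value


class ReplayBuffer:
    """Hypothetical buffer used only for this sketch.

    frozen=False: off-policy setting, new transitions may be added.
    frozen=True:  offline setting, the dataset is fixed up front.
    """

    def __init__(self, transitions=None, frozen=False):
        self.data = list(transitions or [])
        self.frozen = frozen

    def add(self, transition):
        if self.frozen:
            raise RuntimeError("offline buffer is fixed; cannot add transitions")
        self.data.append(transition)

    def sample(self, batch_size, rng=random):
        # Uniform sampling with replacement.
        return [rng.choice(self.data) for _ in range(batch_size)]


# Off-policy: the buffer grows as the agent interacts with the environment.
online_buffer = ReplayBuffer()
online_buffer.add(("s0", "a0", 1.0, "s1", 0.0))

# Offline: the buffer is loaded once from a pre-collected dataset.
offline_buffer = ReplayBuffer([("s0", "a0", 1.0, "s1", 0.0)], frozen=True)
batch = offline_buffer.sample(4)
```

In both cases the sampled batch would feed the same critic and actor updates; only the source of the data changes.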