MARL environment and actor-critic models for multi-agent cooperation using deep reinforcement learning