Hybrid Transformer based Multi-agent Reinforcement Learning for Multiple Unmanned Aerial Vehicle Coordination in Air Corridors

Modeling

Air Corridor, Cylinder and Torus

Air_corridor.jpg
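
As a rough illustration of the two corridor shapes, the following sketch tests whether a point lies inside a straight cylinder or inside a partial torus. It is not taken from this repository: the function names, the choice of axes (cylinder along the z-axis, torus lying in the xy-plane and centered at the origin), and all parameters are assumptions made for the example.

```python
import math


def in_cylinder(p, radius, length):
    """Return True if p = (x, y, z) is inside a cylinder along the z-axis.

    Assumed layout: the cylinder axis runs from z = 0 to z = length.
    """
    x, y, z = p
    return 0.0 <= z <= length and math.hypot(x, y) <= radius


def in_torus_segment(p, major_radius, minor_radius, begin_rad, end_rad):
    """Return True if p is inside a partial torus in the xy-plane.

    Assumed layout: the torus centerline is a circle of radius major_radius
    around the origin, swept from begin_rad to end_rad. Both angles are
    expected in (-pi, pi] with begin_rad <= end_rad.
    """
    x, y, z = p
    angle = math.atan2(y, x)
    if not (begin_rad <= angle <= end_rad):
        return False
    # Distance from the point to the nearest point on the centerline circle.
    distance = math.hypot(math.hypot(x, y) - major_radius, z)
    return distance <= minor_radius
```

A corridor sequence such as cylinder-torus-torus-cylinder could then be checked segment by segment, after transforming the point into each segment's local frame.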

Animation

cttc, 1-transfer

4 air corridors, cylinder-torus-torus-cylinder, 12 UAVs, 4-static, and 3-mobile

cttc_12.gif

cttcttcttc, 3-transfer

10 air corridors, cylinder-torus-torus-cylinder-torus-torus-cylinder-torus-torus-cylinder, 12 UAVs, 4-static, and 3-mobile

cttcttcttc_12.gif

RL Training

Network Structure

  • The embedding network normalizes the input values and standardizes the input dimensions.
  • The transformer processes information from dynamic neighbors using encoders and decoders.
  • The actor-critic network outputs the estimated state value and a stochastic action in spherical coordinates.

TransRL.jpg
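
To make the spherical-coordinate action concrete, here is a minimal sketch of sampling a stochastic action and converting it to a Cartesian vector. It assumes a Gaussian policy and the physics convention (radius `r`, polar angle `theta` from the z-axis, azimuth `phi` in the xy-plane); the repository may use a different distribution or angle convention, and both function names are hypothetical.

```python
import math
import random


def sample_spherical_action(mean, std, rng):
    """Draw (r, theta, phi) from independent Gaussians.

    Assumption: the actor provides a per-component mean and standard
    deviation. Clipping or squashing of the samples is omitted here.
    """
    return tuple(rng.gauss(m, s) for m, s in zip(mean, std))


def spherical_to_cartesian(r, theta, phi):
    """Convert (r, theta, phi) to (x, y, z), physics convention (assumed)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


rng = random.Random(0)  # fixed seed so the example is repeatable
action = sample_spherical_action((1.0, math.pi / 2, 0.0), (0.1, 0.1, 0.1), rng)
vector = spherical_to_cartesian(*action)
```

Whatever angles are sampled, the length of `vector` equals the absolute value of the sampled radius, so the radius component alone controls the magnitude of the action.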

Training File

Train one set of parameters: main.py

Train a batch, parameter grid search: batched_grid_search.sh
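
For readers unfamiliar with grid search, the idea can be sketched as enumerating every combination of candidate values, one training run per combination. The snippet below is only an illustration: the parameter names and values are hypothetical, and it does not reproduce what batched_grid_search.sh actually does or how it invokes main.py.

```python
from itertools import product


def expand_grid(grid):
    """Return one dict per combination of the candidate values in grid."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


# Hypothetical search space; not the parameters used in this repository.
search_space = {
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [32, 64],
}

for params in expand_grid(search_space):
    print(params)  # each dict would configure one training run
```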

Models (actor and critic) are saved every 0.25 million steps. Training progress is reported in the terminal log and visualized with TensorBoard.
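
The save interval translates into a simple step check. The sketch below lists the steps at which a checkpoint would be written for a given training length. The helper names are hypothetical, and the assumption that the first save happens at step 250,000 rather than at step 0 is not confirmed by the source.

```python
SAVE_INTERVAL = 250_000  # 0.25 million steps, the interval stated above


def should_save(step, interval=SAVE_INTERVAL):
    """Return True at positive multiples of the save interval (assumed rule)."""
    return step > 0 and step % interval == 0


def checkpoint_steps(total_steps, interval=SAVE_INTERVAL):
    """List every step up to total_steps at which a checkpoint is due."""
    return list(range(interval, total_steps + 1, interval))


print(checkpoint_steps(1_000_000))
```

Under these assumptions a run of one million steps would produce four saved actor/critic pairs.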

Test File

Serial, generate animation: D3MOVE_test_single_core.py

Parallel, generate data for figures: D3MOVE_test_parallel.py