Hybrid Transformer based Multi-agent Reinforcement Learning for Multiple Unmanned Aerial Vehicle Coordination in Air Corridors

Modeling

Air Corridor, Cylinder and Torus

Animation

cttc, one-transfer

4 air corridors, cylinder-torus-torus-cylinder, 12 UAVs, 4-static, and 3-mobile

cttcttcttc, 3-transfer

10 air corridors, cylinder-torus-torus-cylinder-torus-torus-cylinder-torus-torus-cylinder, 12 UAVs, 4-static, and 3-mobile
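The short scenario codes above map one letter to one corridor segment. A minimal sketch of that mapping follows; the helper names `expand_layout` and `count_transfers` are hypothetical and not part of this repository, and counting a "transfer" as one `tt` pair is an assumption inferred from the two examples (`cttc` is one-transfer, `cttcttcttc` is 3-transfer).

```python
from typing import List

# Letter-to-segment mapping taken from the scenario descriptions above.
SEGMENT_NAMES = {"c": "cylinder", "t": "torus"}


def expand_layout(code: str) -> List[str]:
    """Expand a layout code such as 'cttc' into a list of segment names."""
    return [SEGMENT_NAMES[letter] for letter in code]


def count_transfers(code: str) -> int:
    """Count torus-torus pairs (assumed here to be what a 'transfer' means)."""
    return code.count("tt")


if __name__ == "__main__":
    for code in ("cttc", "cttcttcttc"):
        segments = expand_layout(code)
        print(code, len(segments), "-".join(segments), count_transfers(code))
```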

RL Training

Network Structure

The embedding network normalizes the input values and standardizes the input dimensions.
The transformer processes the information of a dynamically changing set of neighbors using encoders and decoders.
The actor-critic network outputs the estimated state value and a stochastic action in spherical coordinates.
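To illustrate what a stochastic action in spherical coordinates can look like, here is a minimal sketch that samples a (radius, polar angle, azimuth) triple from a Gaussian and converts it to a Cartesian vector. Everything in it is an assumption for illustration: the function names, the Gaussian policy, and the angle convention (polar angle measured from the z-axis) are not taken from this repository.

```python
import math
import random
from typing import Sequence, Tuple


def spherical_to_cartesian(r: float, theta: float, phi: float) -> Tuple[float, float, float]:
    """Convert (r, theta, phi) to (x, y, z).

    Assumed convention: theta is the polar angle from the +z axis,
    phi is the azimuth in the x-y plane measured from the +x axis.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


def sample_action(mean: Sequence[float], std: Sequence[float], rng: random.Random) -> Tuple[float, float, float]:
    """Draw one spherical action from independent Gaussians (hypothetical policy head)."""
    r, theta, phi = (rng.gauss(m, s) for m, s in zip(mean, std))
    return abs(r), theta, phi  # keep the radius non-negative


if __name__ == "__main__":
    rng = random.Random(0)
    action = sample_action(mean=(1.0, math.pi / 2, 0.0), std=(0.1, 0.1, 0.1), rng=rng)
    print("spherical:", action)
    print("cartesian:", spherical_to_cartesian(*action))
```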

Training File

Train one set of parameters: main.py

Train a batch, parameter grid search: batched_grid_search.sh

Models (actor and critic) are saved every 0.25 million steps. Training progress is reported in the terminal log and in TensorBoard.
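As a rough sketch of the save cadence described above, the snippet below lists the steps at which a checkpoint would be written for a given training length. The names `SAVE_INTERVAL`, `should_save`, and `checkpoint_steps` are hypothetical; the actual saving logic in main.py is not shown here and may differ.

```python
from typing import List

# 0.25 million environment steps between checkpoints, as stated above.
SAVE_INTERVAL = 250_000


def should_save(step: int) -> bool:
    """Return True when a checkpoint is due at this step (assumed rule)."""
    return step > 0 and step % SAVE_INTERVAL == 0


def checkpoint_steps(total_steps: int) -> List[int]:
    """List every step up to total_steps at which a checkpoint is due."""
    return list(range(SAVE_INTERVAL, total_steps + 1, SAVE_INTERVAL))


if __name__ == "__main__":
    # A hypothetical one-million-step run would produce four checkpoints.
    print(checkpoint_steps(1_000_000))
```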

Test File

Serial, generate animation: D3MOVE_test_single_core.py

Parallel, generate data for figures: D3MOVE_test_parallel.py