PoolFlip: A Multi-Agent Reinforcement Learning Security Environment for Cyber Defense

This repository hosts the implementation of PoolFlip, a multi-agent Gymnasium/PettingZoo environment inspired by FlipIt for studying attacker–defender dynamics in cyber defense.

Paper: link coming soon.

Abstract

Cyber defense requires automating defensive decision-making under stealthy, deceptive, and continuously evolving adversarial strategies. The FlipIt game provides a foundational framework for modeling interactions between a defender and an advanced adversary that compromises a system without being immediately detected. In FlipIt, the attacker and defender compete to control a shared resource by performing a Flip action and paying a cost. However, existing FlipIt frameworks rely on a small number of heuristics or specialized learning techniques, which can lead to brittleness and an inability to adapt to new attacks. To address these limitations, we introduce PoolFlip, a multi-agent gym environment that extends the FlipIt game to allow efficient learning for attackers and defenders. Furthermore, we propose Flip-PSRO, a multi-agent reinforcement learning (MARL) approach that leverages population-based training to train defender agents equipped to generalize against a range of unknown, potentially adaptive opponents. Our empirical results suggest that Flip-PSRO defenders are $2\times$ more effective than baselines at generalizing to a heuristic attack not seen during training. In addition, our newly designed ownership-based utility functions ensure that Flip-PSRO defenders maintain a high level of control while optimizing performance.

Install

We use conda (see env.yaml) and pip to install the dependencies. The following creates the poolflip environment; which packages you need depends on your use case (see Environment only and Full stack below):

conda env create -f env.yaml

Then,

conda activate poolflip

Environment only

To install only the packages required for the PoolFlipEnv (usable as shown in the Quick start below):

pip install .
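
You can sanity-check the minimal install by importing the environment class used in the Quick start below:

python -c "from poolflip import PoolFlipEnv"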

Full stack

To install the full training and evaluation stack:

pip install .[full]

Quick start

A minimal example is provided in minimal.py:

from omegaconf import OmegaConf 

from poolflip import PoolFlipEnv # The Multi Agent environment
import poolflip.agents as agents

# Configuration of the environment
cfg = OmegaConf.create(
        {
            "num_resources": 1,
            "num_players": 2,
            "max_num_steps": 10,  # Number of steps in an episode
            "num_global_actions": 1,  # Number of global actions (1 for Sleep in the Sleep | Check | Flip setting)
            "actions_to_costs":
            {
                0: 0.0,  # Sleep: cost 0.0
                1: 2.0,  # Flip: cost 2.0
                2: 1.0,  # Check: cost 1.0
            },
        }
    )

possible_agents = {
    "defender": agents.SleepAgent(),
    "attacker": agents.PeriodicCheckAgent(phase=4, delay=1),
}

env = PoolFlipEnv(possible_agents=possible_agents, configuration=cfg)

obs, infos = env.reset(seed=42)

for step in range(cfg.max_num_steps):
    actions = {
        agent: env.action_space(agent).sample() for agent in env.agents
    }
    next_obs, rewards, terminations, truncations, infos = env.step(actions)
    if all(terminations.values()) or all(truncations.values()):
        break

env.close()
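
The loop above samples random actions for both players. To compare how the two sides fare, the per-agent reward dicts returned by env.step can be accumulated into episode returns. A minimal sketch reusing the objects defined above (it re-creates the environment, since the one above has already been closed):

# Tally per-agent returns over one random-action episode
env = PoolFlipEnv(possible_agents=possible_agents, configuration=cfg)
obs, infos = env.reset(seed=42)
returns = {agent: 0.0 for agent in env.agents}

for step in range(cfg.max_num_steps):
    actions = {
        agent: env.action_space(agent).sample() for agent in env.agents
    }
    next_obs, rewards, terminations, truncations, infos = env.step(actions)
    for agent, reward in rewards.items():
        returns[agent] += reward
    if all(terminations.values()) or all(truncations.values()):
        break

print(returns)  # e.g. {"defender": ..., "attacker": ...}
env.close()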

Training and Evaluation

For the following, please install the full requirements:

pip install .[full]

We use MLflow for experiment tracking and Ray for experiment parallelization.

.
├── config
│   ├── agents      <- Agent configurations
│   └── envs        <- Environment configurations
├── poolflip        <- The Environment (Usable with the minimal install)
└── flip_psro       <- Flip-PSRO training, which requires the full install with MLflow.

Export your MLflow tracking URI:

export MLFLOW_TRACKING_URI=<MLFLOW_URI>
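
If you do not already have a tracking server running, MLflow (included in the full install) can serve a local one, which listens on http://127.0.0.1:5000 by default:

mlflow ui
export MLFLOW_TRACKING_URI=http://127.0.0.1:5000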

Then start your Ray cluster via:

ray start --head
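
Once your experiments are finished, the local cluster can be shut down with:

ray stop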

Training Agents (train.py)

python train.py --defender ppo_no_deterministic_eval --attacker periodic_4 --env episodes=1k_steps=100_players=2_resources=1

This will train a PPO defender against a Periodic(4) attacker in a single-resource environment with 100 steps per episode, for 1,000 episodes.

The run ID generated by MLflow can then be used in the evaluation step to load the weights of the trained agent.
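
If you prefer to fetch the run ID programmatically instead of copying it from the MLflow UI, mlflow.search_runs can be used. A minimal sketch, assuming MLFLOW_TRACKING_URI is exported as above; the experiment name below is a placeholder, substitute the one your training run logged to:

import mlflow

# "Default" is a placeholder; use the experiment name shown in your tracking server.
runs = mlflow.search_runs(experiment_names=["Default"], order_by=["start_time DESC"])
print(runs.loc[0, "run_id"])  # most recent run ID, usable as <RUN_ID> below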

Evaluating Agents (eval.py)

Assuming the previous training run was registered under the run ID <RUN_ID>:

python eval.py --defender ppo_no_deterministic_eval --attacker periodic_4 --env episodes=100_steps=100_players=2_resources=1 --defender_run_id <RUN_ID>

This will evaluate the trained PPO defender against a Periodic(4) attacker in a single-resource environment with 100 steps per episode, over 100 episodes.

The following would evaluate the same PPO agent against a Burst(8,6) opponent.

python eval.py --defender ppo_no_deterministic_eval --attacker burst_8_6 --env episodes=100_steps=100_players=2_resources=1 --defender_run_id <RUN_ID>

Running the Flip-PSROs (flip_psro.py)

The flip_psro.py script combines the two steps above. For instance, the following will run Flip-PSRO with uniform opponent selection for a randomly initialized PPO agent against a pool of opponents consisting of Periodic(4), Awakening(0.05), Burst(8,3), PeriodicCheck(4), and PAC(4) opponents. A conceptual sketch of the opponent-selection loop is given after the command.

python flip_psro.py --defender ppo_no_deterministic_eval --attacker periodic_4,awakening_05,burst_8_3,periodic_check_4,periodic_aggressive_check_4 --env episodes=100_steps=100_players=2_resources=1_expensive_check
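
For intuition, uniform opponent selection amounts to repeatedly drawing an opponent from the pool and running a training round against it. The sketch below illustrates this with the train.py CLI documented above; the iteration count is arbitrary and, unlike the real flip_psro.py, it does not carry the defender's weights across iterations or handle evaluation and MLflow bookkeeping:

import random
import subprocess

# Opponent pool from the command above
opponents = [
    "periodic_4",
    "awakening_05",
    "burst_8_3",
    "periodic_check_4",
    "periodic_aggressive_check_4",
]

for iteration in range(5):  # illustrative number of PSRO iterations
    attacker = random.choice(opponents)  # uniform opponent selection
    subprocess.run(
        [
            "python", "train.py",
            "--defender", "ppo_no_deterministic_eval",
            "--attacker", attacker,
            "--env", "episodes=100_steps=100_players=2_resources=1_expensive_check",
        ],
        check=True,
    )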

Testsuite

pytest 

License

This repository is released under the MIT License.

It makes use of MLflow and Ray, both of which are licensed under the Apache License, Version 2.0.

See https://www.apache.org/licenses/LICENSE-2.0 for details.

Citation

Proceedings link and BibTeX will be posted once available.

Acknowledgments

We would like to express our gratitude to the authors of the works cited in our paper who open-sourced their codebases, methodologies, and datasets, which served as a foundation for our work.
