Code for the paper "A Deep Reinforcement Learning Framework for Multi-Period Facility Location in Decentralized Pharmaceutical Supply Chain Networks".
OpenDeepPharmaSC/
├── src/opdeeppharmasc/ # Minimal package (model, problem, utilities)
├── data/
│ ├── demand/ # Demand matrices used in resilience studies
│ ├── partd/ # Medicare Part D evaluation sets (6–10 years)
│ └── eval/ # Cached MILP/RL evaluation results used for plots/tables
├── models/ # Trained RL policies: N=25,50,100 (best_model.pt + args.json)
├── scripts/ # Reproduction entry-points
├── artifacts/
│ ├── figures/ # Generated figures + supporting CSVs
│ ├── tables/ # Generated LaTeX/CSV tables
│ └── layouts/ # Period-by-period layout snapshots for atorvastatin
└── requirements.txt # Core Python dependencies
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt \
    --extra-index-url https://download.pytorch.org/whl/cpu

Note: the commands above install the CPU wheel of PyTorch. For GPUs, install the matching CUDA build instead, for example:
python -m pip install torch --index-url https://download.pytorch.org/whl/cu128
Make sure the package is on `PYTHONPATH` when running the scripts. From the repository root this happens automatically because each script prepends `src/` to `sys.path`.
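As a minimal sketch (the exact bootstrap lines in the repository's scripts may differ, and the helper name here is illustrative), the prepend amounts to:

```python
# Minimal sketch of the sys.path bootstrap each script performs; the exact
# lines in the repository's scripts may differ.
import sys
from pathlib import Path

def prepend_src(repo_root: str) -> str:
    """Prepend <repo_root>/src to sys.path so `import opdeeppharmasc` resolves."""
    src = str(Path(repo_root) / "src")
    if src not in sys.path:
        sys.path.insert(0, src)  # put the package ahead of any installed copy
    return src
```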
All scripts default to reading cached data from data/ and writing outputs to artifacts/.
- Generate synthetic training datasets (optional)

  python scripts/generate_synthetic_data.py --dataset_size 2000 --graph_size 50 \
      --pk 10 11 12 13 14 15 16 17 18 19 --distribution log-normal \
      --data-dir data/synthetic

  This mirrors the on-the-fly instance sampler used during training and can be customised via `--help`.

-
Train a policy from scratch
  python scripts/train_policy.py \
      --graph_size 50 \
      --pk 10 11 12 13 14 15 16 17 18 19 \
      --baseline critic \
      --n_epochs 400 \
      --output_dir runs/original_critic \
      --no_tensorboard

  Mirrors the actor–critic configuration used for RL(50). Expect a multi-hour run; checkpoints appear under `runs/PharmaSC_<graph_size>/<run_name>/`.
  - Resume with `--resume path/to/epoch-XX.pt --epoch_start YY`; avoid `latest_stable_checkpoint.pt` because it does not encode the epoch number.
  - Reduce `--n_epochs` or `--epoch_size` for quick smoke tests.
-
Tables (RL settings, MILP vs RL summary)
python scripts/build_tables.py
  Produces `artifacts/tables/table_rl_settings.{csv,tex}` and `table_results_summary.{csv,tex}`.

-
RL vs MILP comparison figures (6–10 year horizons)
python scripts/build_results_comparison.py
  Generates `artifacts/figures/comparison_plot{6..10}.png`, replicating Figure 4 in the manuscript (cost scatter + runtime boxplots).

-
Atorvastatin layout grid (RL vs CPLEX)
python scripts/build_layout_grid.py
  Uses the prepared single-period snapshots under `artifacts/layouts/` to compose `artifacts/figures/layout_compare_atorvastatin.{png,pdf}`.

-
Resilience analysis (Monte Carlo + facility importance maps)
python scripts/build_resilience_analysis.py --device cpu
  - Runs 10,000 Monte Carlo scenarios per stress level with the N=50 policy.
  - Outputs `resilience_analysis.png` and the facility-importance maps/CSVs in `artifacts/figures/`.
  - Expect ~20 minutes on CPU; use `--scenarios <num>` or `--device cuda` (for a GPU run) to adjust runtime.
-
Full pipeline (runs every step in order)
python scripts/build_all.py
This simply executes the four commands above. Be aware that it inherits the heavy resilience run; pass a smaller scenario count to `build_resilience_analysis.py` (via `--scenarios`) if you are smoke-testing.
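For reference, a build_all-style driver reduces to running the four scripts in order. The sketch below is illustrative rather than a copy of `scripts/build_all.py`, and the `--scenarios 100` override shown is the smoke-test variant, not the default:

```python
# Illustrative driver: run each reproduction script in order, aborting on the
# first failure. scripts/build_all.py may differ in details.
import subprocess
import sys

STEPS = [
    [sys.executable, "scripts/build_tables.py"],
    [sys.executable, "scripts/build_results_comparison.py"],
    [sys.executable, "scripts/build_layout_grid.py"],
    # Smoke-test variant: shrink the Monte Carlo budget of the heavy step.
    [sys.executable, "scripts/build_resilience_analysis.py", "--scenarios", "100"],
]

def run_all(steps=STEPS):
    for cmd in steps:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```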
- Part D evaluation sets: synthetic demand trajectories (`data/partd/increasing_sequences_*_years_dataset.pkl`).
- Part D raw CSVs + converters: original Medicare Part D CSVs live in `data/partd/raw`. Regenerate the cleaned demand CSV/JSON with `python scripts/part_D_data.py` (writes to `data/partd/processed_drug_data.csv` and `data/partd/output_sequences_us50_dc/`), then turn those JSONs into PKL datasets with `python scripts/real_data_from_json.py --json-dir data/partd/output_sequences_us50_dc --out-dir data/partd`.
- Evaluation caches: per-instance MILP/RL/heuristic metrics (`data/eval/results/*.csv`) used to recreate the tables and scatter plots without re-solving MILPs.
- Resilience assets: demand matrix for atorvastatin and the CONUS GeoJSON stored locally; all CSV exports for the paper are regenerated into `artifacts/figures/`.
- Policies: best checkpoints for N=25/50/100 along with training hyperparameters (`args.json`).
- Synthetic generator: `scripts/generate_synthetic_data.py` / `opdeeppharmasc.data.generation` exposes the sampler used during RL training.
- The lightweight `opdeeppharmasc.utils.eval_solvers` module (under `src/`) keeps helper routines for running new RL inferences or, if Pyomo + commercial solvers are available, for regenerating MILP baselines from scratch. Configure your own solver credentials in `GUROBI_WLS_OPTIONS` before use.
- `scripts/build_resilience_analysis.py` accepts `--importance-k` and `--topk` to customise the facility-importance overlays.
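The synthetic instance sampler mentioned above can be approximated as follows. The function name, signature, and exact parameterisation are illustrative assumptions, not the repository's actual API; only the log-normal demand and the multi-period horizons (cf. `--pk`) come from this README:

```python
# Hypothetical sketch of a log-normal multi-period instance sampler in the
# spirit of opdeeppharmasc.data.generation; the real API may differ.
import numpy as np

def sample_instance(graph_size=50, periods=range(10, 20), mu=0.0, sigma=1.0, seed=None):
    """Return node coordinates and a (T, N) log-normal demand matrix, where the
    horizon T is drawn from `periods` (cf. --pk) and N = graph_size."""
    rng = np.random.default_rng(seed)
    coords = rng.uniform(0.0, 1.0, size=(graph_size, 2))  # candidate site locations
    horizon = int(rng.choice(list(periods)))              # one of the --pk horizons
    demand = rng.lognormal(mean=mu, sigma=sigma, size=(horizon, graph_size))
    return coords, demand

coords, demand = sample_instance(seed=0)
```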
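If you regenerate MILP baselines yourself, solver credentials are supplied through `GUROBI_WLS_OPTIONS`. The key names below follow Gurobi's standard Web License Service convention; treat the exact shape as an assumption and check `opdeeppharmasc.utils.eval_solvers` for the format it actually expects:

```python
# Hypothetical credential mapping using standard Gurobi WLS parameter names
# (WLSACCESSID, WLSSECRET, LICENSEID); verify against eval_solvers before use.
GUROBI_WLS_OPTIONS = {
    "WLSACCESSID": "00000000-0000-0000-0000-000000000000",  # placeholder
    "WLSSECRET": "00000000-0000-0000-0000-000000000000",    # placeholder
    "LICENSEID": 0,                                         # placeholder
}
```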