mobility-applicability-bound

License: MIT · Python 3.12+ · Status: outline

Manuscript (outline): Why mode-choice models trained in city A fail in city B: a structural lower bound on transferability. Rohan Fossé and Gaël Pallares, CESI LINEACT, 2026. In preparation for submission to Transportation Research Part B: Methodological (TR-B).

This repository develops the mobility-side empirical instantiation of the structural lower bound proved in the sibling repository materials-applicability-bound (MLST submission, Zenodo DOI 10.5281/zenodo.20355996). Where the materials paper proves the theorem and validates it on the 8-task MatBench panel, this paper applies the same theorem to the leave-commune-out transferability of mode-choice models across 34,858 French communes.

The result is the first regressor-free structural lower bound on the transferability gap in transport mode-choice modelling. Existing transferability heuristics in the transport literature, namely predictive RMSE on held-out cities, Hosmer–Lemeshow goodness-of-fit and percent correctly predicted (PCP), are all empirical and regressor-dependent, and they are silent on whether the gap can be closed by a better model or is structurally bounded below.

The puzzle

Operational mobility forecasting routinely faces the question: will a mode-choice model fitted on city A also work on city B? The transport literature has documented the empirical failure of cross-city transferability since at least Atherton & Ben-Akiva (1976). The central methodological proposals (joint estimation, post-stratification, Bayesian updating, transfer scaling) are all empirical heuristics that leave one essential question unanswered:

Given a target city B that I have not sampled, can I bound below the irreducible prediction error of any model trained on the cities I have sampled, using only the structural properties of the inter-commune similarity graph and the mode-share signal — without training a single model?

Theorem 1 of [Fossé & Pallares 2026, MLST, in preparation] says yes: the structural lower bound is $\Gamma(G, \mathbf{y}) = R^2_{\mathrm{spec}}(\mathcal{S}_{\mathrm{comp}}, \mathbf{y}) \cdot \mathrm{Var}(\mathbf{y})$, where $G$ is the commune-similarity graph (built from IMD-4 + INSEE socio-economic features), $\mathbf{y}$ is the mode-share vector across communes, and $R^2_{\mathrm{spec}}$ is the projection $R^2$ of $\mathbf{y}$ on the compositional subspace $\mathcal{S}_{\mathrm{comp}}$. The bound is regressor-independent, computable in seconds from the graph and the signal alone, and admits a three-way correspondence with the in-sample OLS $R^2$, the bandlimited Fourier energy of $\mathbf{y}$ on the graph Laplacian eigenbasis, and the $K$-iteration Weisfeiler–Lehman expressivity ceiling on message-passing GNN models of mode choice.
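
As an illustration of the formula above, the sketch below computes a projection $R^2$, the product $R^2_{\mathrm{spec}} \cdot \mathrm{Var}(\mathbf{y})$, and the bandlimited Fourier energy fraction on a graph Laplacian eigenbasis. It is not the repository's implementation. The function names are hypothetical, and the choice of $\mathcal{S}_{\mathrm{comp}}$ as the span of the `k` lowest non-constant Laplacian eigenvectors is an assumption made only for this example, since the construction of $\mathcal{S}_{\mathrm{comp}}$ is not given here.

```python
import numpy as np


def projection_r2(basis: np.ndarray, y: np.ndarray) -> float:
    """R^2 of the least-squares projection of centred y onto span(basis)."""
    y_c = y - y.mean()
    # Centre the columns so the constant direction is not counted as explained variance.
    b_c = basis - basis.mean(axis=0)
    coef, *_ = np.linalg.lstsq(b_c, y_c, rcond=None)
    resid = y_c - b_c @ coef
    total = float(y_c @ y_c)
    return 0.0 if total == 0.0 else 1.0 - float(resid @ resid) / total


def structural_bound(basis: np.ndarray, y: np.ndarray) -> float:
    """Gamma = R^2_spec * Var(y), with span(basis) standing in for S_comp (assumption)."""
    return projection_r2(basis, y) * float(np.var(y))


def bandlimited_energy_fraction(adjacency: np.ndarray, y: np.ndarray, k: int) -> float:
    """Share of centred-signal energy carried by the k lowest non-constant Laplacian modes.

    Assumes a connected, undirected graph, so that only the first eigenvector is constant.
    """
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    _, eigvecs = np.linalg.eigh(laplacian)  # ascending eigenvalues, orthonormal columns
    y_c = y - y.mean()
    coeffs = eigvecs[:, 1:k + 1].T @ y_c    # graph Fourier coefficients, constant mode skipped
    return float(coeffs @ coeffs) / float(y_c @ y_c)
```

Under that assumption, passing the same `k` eigenvectors to `projection_r2` returns the same value as `bandlimited_energy_fraction`, which is the OLS and Fourier correspondence in its simplest form: projecting onto an orthonormal basis and summing squared Fourier coefficients are the same computation.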

This paper develops the empirical instantiation of that theorem on French commune-level mobility data, demonstrates the operating regimes empirically, and connects the bound to the penality-analysis cycling-poverty diagnostic: communes in the do-not-deploy regime are provably non-predictable under structural conditions, which provides independent justification for the Plan Vélo priority listing.

Headline result targets (to be measured)

| Statistic | Target value | Status |
|---|---|---|
| Spearman ρ(R²_spec, ΔR²_LCO) across French regions (n ≈ 13) | $\geq +0.7$, exact $p \leq 0.05$ | pending |
| Compositional Information Gain on the commune graph | $\geq 30\times$ above topology-matched null | pending |
| Encoder discrimination: IMD-4 vs satellite-tile CLIP embeddings | $\Delta\epsilon_K \leq -0.1$ | pending |
| Cycling-poverty desert overlap: do-not-deploy regime ∩ §penality diagnostic | $\geq 80\%$ | pending |
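
The first target is a rank correlation with a permutation-based p-value. A minimal sketch of how such a statistic could be computed is given below, assuming two arrays of per-region values (one R²_spec and one ΔR²_LCO per region). The function names are hypothetical and no measured values are implied. With n ≈ 13, full enumeration of the 13! permutations is impractical by brute force, so this sketch uses a Monte Carlo approximation of the exact one-sided permutation test.

```python
import numpy as np


def ranks(values) -> np.ndarray:
    """1-based ranks, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranked = np.empty(len(values), dtype=float)
    ranked[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tie = values == v
        ranked[tie] = ranked[tie].mean()
    return ranked


def spearman_rho(x, y) -> float:
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])


def permutation_p_value(x, y, n_resamples: int = 10_000, seed: int = 0) -> float:
    """One-sided Monte Carlo permutation p-value for the alternative rho > 0."""
    rng = np.random.default_rng(seed)
    rx, ry = ranks(x), ranks(y)
    observed = np.corrcoef(rx, ry)[0, 1]
    hits = 0
    for _ in range(n_resamples):
        permuted = np.corrcoef(rx, rng.permutation(ry))[0, 1]
        if permuted >= observed - 1e-12:
            hits += 1
    # The +1 correction counts the observed arrangement, so p is never exactly zero.
    return (hits + 1) / (n_resamples + 1)
```

The smallest p-value this procedure can report is 1 / (n_resamples + 1), so the resample count should be chosen with the 0.05 threshold in mind.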

Why TR-B

Transportation Research Part B: Methodological is the journal for theoretical contributions in transport. The fit is precise:

  • TR-B values rigorous mathematical statements (RUM theory, discrete choice, network science).
  • Transferability has been a TR-B problem since the 1970s (Atherton, Ben-Akiva, Train, Koppelman, …).
  • Our contribution is structurally complementary to existing transferability heuristics (Sikder & Pinjari 2013, Bowman & Bradley 2017): a regressor-free lower bound, computable a priori.

What's in here

mobility-applicability-bound/
├── paper.tex                     # Main manuscript (iopjournal class for drafting; switch to elsarticle for TR-B submission)
├── paper_si.tex                  # Supplementary Information
├── cover_letter.md               # TR-B cover letter draft
├── iopjournal.cls + orcid.pdf    # IOP class assets (drafting only)
├── .zenodo.json + CITATION.cff   # Citation metadata
├── references/references.bib     # Bibliography (includes pre-cited Fossé & Pallares 2026 MLST)
├── experiments/
│   ├── d01_mode_choice_pilot.py    # Pilot on a small commune subset
│   ├── d02_imd4_load.py            # Load IMD-4 panel from imd-national-catalogue
│   ├── d03_emp_mode_share.py       # INSEE EMP mode-share extraction
│   ├── d04_commune_graph.py        # Build commune-similarity graph from IMD-4 + INSEE
│   ├── d05_rspec_per_region.py     # R²_spec across 13 French regions
│   ├── d06_lco_predictive.py       # Empirical ΔR²_LCO under leave-commune-out CV
│   ├── d07_polymorphism_eve.py     # Eve's law decomposition (mode-share variance)
│   ├── d08_falsifiability.py       # Shuffled-IMD4 null + CIG
│   ├── d09_encoder_discrim.py      # IMD-4 vs satellite-CLIP oracle test
│   ├── d10_penality_overlap.py     # Do-not-deploy ∩ cycling-poverty deserts overlap
│   └── _plot_style.py
├── figures/
├── outputs/
└── drafts/
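
The leave-commune-out comparison listed under d06 can be illustrated with a generic leave-one-group-out loop. The sketch below is not the repository's script: it assumes a plain OLS regressor, a feature matrix `X`, a mode-share vector `y` and an array `groups` of held-out unit labels, and it assumes that the gap is defined as in-sample R² minus pooled out-of-sample R². All names are hypothetical.

```python
import numpy as np


def ols_fit(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Least-squares coefficients with an intercept column prepended."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def ols_predict(coef: np.ndarray, X: np.ndarray) -> np.ndarray:
    return np.column_stack([np.ones(len(X)), X]) @ coef


def r2_score(y: np.ndarray, y_hat: np.ndarray) -> float:
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot


def leave_group_out_gap(X: np.ndarray, y: np.ndarray, groups: np.ndarray) -> float:
    """In-sample R^2 minus pooled held-out R^2 (assumed definition of the gap)."""
    in_sample = r2_score(y, ols_predict(ols_fit(X, y), X))
    held_out = np.empty(len(y), dtype=float)
    for g in np.unique(groups):
        test = groups == g
        # Fit on every other group, then predict the held-out one.
        coef = ols_fit(X[~test], y[~test])
        held_out[test] = ols_predict(coef, X[test])
    return in_sample - r2_score(y, held_out)
```

For OLS this gap is non-negative, because held-out residuals are never smaller in norm than the corresponding in-sample residuals; other regressors would need the same loop with their own fit and predict steps.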

Reproducing the paper

# Drafting compile (iopjournal class)
pdflatex paper.tex && bibtex paper && pdflatex paper.tex && pdflatex paper.tex

# Pilot
python3.12 experiments/d01_mode_choice_pilot.py

Data sources

  • INSEE EMP 2018–2019 (mobility survey, Licence Ouverte 2.0): mode shares per commune.
  • IMD-4 catalogue (sibling repo imd-national-catalogue): 13-dim cycling-environment indicator on 34,858 French communes.
  • INSEE Filosofi (Licence Ouverte 2.0): socio-economic features (median income, poverty rate, age structure).
  • Cerema infrastructure inventory: transit + road network density per commune.
  • OpenStreetMap (ODbL): satellite-tile features for the encoder-discrimination oracle test (via SatMAE / CLIP-based embeddings).
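
How these sources are combined into the commune-similarity graph is not specified here. One common construction, shown below as a sketch under that caveat, is a symmetric k-nearest-neighbour graph on standardised features. The function name, the value of `k` and the feature matrix layout (one row per commune, one column per indicator) are assumptions for illustration only.

```python
import numpy as np


def knn_similarity_graph(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Symmetric 0/1 adjacency: i and j are linked if either is among the other's k nearest.

    Assumes every feature column has non-zero variance.
    """
    # Standardise so that no single indicator dominates the Euclidean distance.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    sq_dist = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)          # a commune is not its own neighbour
    nearest = np.argsort(sq_dist, axis=1)[:, :k]
    n = len(features)
    adjacency = np.zeros((n, n))
    adjacency[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adjacency, adjacency.T)  # symmetrise
```

The dense pairwise-distance array makes this version suitable only for small subsets such as a pilot; at the scale of 34,858 communes a spatial index and a sparse adjacency matrix would be needed instead.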

How to cite

A machine-readable citation is provided in CITATION.cff. Plain BibTeX:

@unpublished{FossePallares2026mobilityApplicabilityBound,
  author = {Foss\'e, Rohan and Pallares, Ga\"el},
  title  = {Why mode-choice models trained in city A fail in city B:
            a structural lower bound on transferability},
  note   = {Manuscript in preparation for {Transportation Research Part B: Methodological},
            CESI LINEACT, 2026.
            \url{https://github.com/cycling-data-lab/mobility-applicability-bound}},
  year   = {2026}
}

License

MIT.

Contact

  • Rohan Fossé, rfosse@cesi.fr
  • Gaël Pallares
