Analysis library and figure-reproduction code for the Drosophila ball-pushing paradigm developed in the Ramdya Lab at EPFL. It computes behavioural metrics from SLEAP-tracked recordings of flies interacting with a ball in a corridor, and contains the scripts that generate every panel in this repository's companion paper.
Paper citation: TODO — paste BioRxiv / journal DOI when available (Durrieu et al., 2026, "Object manipulation and affordance learning in Drosophila").
Dataset: TODO — paste Harvard Dataverse DOI / URL once published. The dataverse hosts both the raw HDF5 SLEAP tracks and the pooled per-fly summary feathers used by the figure scripts. You can reproduce the paper figures from the feathers alone (no SLEAP re-processing required).
ballpushing_utils/
├── src/ballpushing_utils/ # The Python library (pip-installable).
│ ├── plotting/ # Shared figure helpers (rcParams, sig bars,
│ │ # cm-based axis sizing, paired boxplots).
│ ├── stats/ # Permutation test, bootstrap CI, Cohen's d.
│ ├── ballpushing_metrics.py # Per-fly metric definitions (see below).
│ ├── dataset.py # Dataset loader / pooler.
│ ├── experiment.py, fly.py # Domain objects (Experiment > Fly).
│ ├── fly_trackingdata.py # SLEAP track wrapper.
│ ├── paths.py # Data/figure path helpers (env-var driven).
│ └── ...
├── src/Screen_analysis/ # Brain-region screen analysis pipeline.
├── figures/ # Paper figure scripts (Fig. 1 – Fig. 3 +
│ ├── Fig1-setup/ # ED Fig. 6). One script per panel; each
│ ├── Fig2-Affordance/ # reads a feather, runs stats, writes a
│ ├── Fig3-Screen/ # PDF + a stats CSV.
│ └── EDFigure6-Dendrogram/
├── plots/ # Exploratory + supplementary plots
│ ├── Ballpushing_PR/ # (feeding state, wildtype push rate,
│ ├── F1_tracks/ # F1 paradigm, ball scents, etc.).
│ └── Supplementary_exps/ # Supplementary-figure scripts.
├── experiments_yaml/ # YAML descriptors of every experiment
│ # batch (genotype, replicate dates, ...).
├── notebooks/ # Jupyter walkthroughs (Fly/Experiment/
│ # Dataset tour + diagnostics demo).
├── tools/ # CLI / dashboard entry points
│ # (e.g. tools/diagnostics_dashboard.py).
├── tests/ # pytest suite.
├── run_all_figures.py # Run every script under figures/.
├── pyproject.toml # Package + dev-tool config.
└── .env.example # Template for local data/figure paths.
A complete description of every per-fly metric (interaction events,
significant pushes, "aha moment", chamber/corridor metrics, leg
visibility, etc.) lives in
src/ballpushing_utils/README_Ballpushing_metrics.md.
Requires Python ≥ 3.10.
git clone https://github.com/<TODO-org>/ballpushing_utils.git
cd ballpushing_utils
# Create an environment (conda or venv — either is fine).
python -m venv .venv
source .venv/bin/activate
# Install the package in editable mode.
pip install -e .
# Optional extras:
pip install -e ".[interactive]" # bokeh / panel / shiny dashboards
pip install -e ".[video]" # moviepy / pygame for video overlays
pip install -e ".[dev]" # pytest, black, ruff
pip install -e ".[all]"         # everything above combined

ballpushing_utils depends on utils_behavior, the
lab's general-purpose behavioural-analysis utilities. It is declared as a
PyPI dependency in pyproject.toml; if your environment cannot resolve
it from PyPI, install it from source first.
All scripts resolve dataset paths relative to BALLPUSHING_DATA_ROOT
and write outputs under BALLPUSHING_FIGURES_ROOT. Set them however
you prefer — .env, shell export, or your launcher of choice.
Copy the template and edit:

cp .env.example .env
$EDITOR .env

# .env
BALLPUSHING_DATA_ROOT=/path/to/dataverse/download
BALLPUSHING_FIGURES_ROOT=/path/where/figures/should/land

To pick up .env automatically inside Python:

from ballpushing_utils import load_dotenv
load_dotenv()  # reads ./.env

If unset, BALLPUSHING_DATA_ROOT defaults to the EPFL lab share
(/mnt/upramdya_data/MD) and BALLPUSHING_FIGURES_ROOT defaults to
<data root>/Affordance_Figures.
The dataverse archive mirrors the layout the scripts expect, so after
unzipping the bundle into BALLPUSHING_DATA_ROOT you can run any figure
script unmodified.
# Once (after editing .env)
export $(grep -v '^#' .env | xargs)
python figures/Fig2-Affordance/fig2_magnetblock_first_major_push_time.py
# -> writes $BALLPUSHING_FIGURES_ROOT/Figure2/<script-stem>/*.pdf
# + a *_stats.csv with the published p-value alongside it.

Each script accepts --test to run on a 200-row subsample for a quick
smoke test.
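The `--test` convention is easy to replicate in new panel scripts. A minimal sketch (the flag name comes from this README; `maybe_subsample`, the seed, and everything else are illustrative, not the scripts' actual internals):

```python
import argparse

import pandas as pd


def parse_args(argv=None):
    # --test switches the script onto a quick smoke-test subsample.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--test", action="store_true",
        help="run on a 200-row subsample for a quick smoke test",
    )
    return parser.parse_args(argv)


def maybe_subsample(df, test_mode, n=200, seed=42):
    # Deterministic subsample so repeated smoke-test runs are comparable.
    if test_mode and len(df) > n:
        return df.sample(n=n, random_state=seed).reset_index(drop=True)
    return df
```

A script would then call `df = maybe_subsample(df, parse_args().test)` right after loading its feather.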
python run_all_figures.py

This auto-discovers all *.py under figures/, runs each in its own
subprocess, and prints a green/red pass-fail summary. Figures land under
BALLPUSHING_FIGURES_ROOT.
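The discover-and-run loop can be approximated in a few lines of stdlib Python. A hedged sketch (function names are illustrative, not `run_all_figures.py`'s actual internals):

```python
import subprocess
import sys
from pathlib import Path


def discover_figure_scripts(figures_root):
    # Every *.py under figures/ is treated as a standalone panel script.
    return sorted(Path(figures_root).rglob("*.py"))


def run_scripts(scripts):
    # Run each script in its own subprocess and record pass/fail,
    # so one crashing panel cannot take down the whole batch.
    results = {}
    for script in scripts:
        proc = subprocess.run([sys.executable, str(script)],
                              capture_output=True, text=True)
        results[script.name] = proc.returncode == 0
    return results


def summarize(results):
    # Plain-text analogue of the green/red pass-fail summary.
    return "\n".join(
        f"{'PASS' if ok else 'FAIL'}  {name}"
        for name, ok in sorted(results.items())
    )
```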
For a guided tour of how Fly, Experiment, and Dataset bind
tracking data, metadata, and config together — with runnable cells
against a real fly folder — start with
notebooks/ballpushing_utils_walkthrough.ipynb.
Two companion notebooks drill into the two things most users want
next:
- notebooks/ballpushing_metrics_reference.ipynb is a live, per-metric
  reference paired with src/ballpushing_utils/README_Ballpushing_metrics.md —
  every metric in fly.event_summaries is printed with its current value and
  a one-line description.
- notebooks/dataset_types_guide.ipynb tours every dataset_type you can
  request (summary, coordinates, fly_positions, event_metrics,
  F1_coordinates, F1_checkpoints, contact_data, Skeleton_contacts,
  standardized_contacts, transformed, transposed, behavior_umap) with
  subsections for the preconditions (F1 experiment type, skeleton tracks,
  Learning paradigm, …).
A quick taster:
from ballpushing_utils import Experiment
# Point at a folder containing one experiment (multiple arenas of flies).
exp = Experiment("/path/to/experiment_directory")
for fly in exp.flies:
    # Each metric family is a cached dict on the Fly object — touching the
    # property triggers the computation the first time and caches it.
    summaries = fly.event_summaries  # ball-pushing summary metrics
    print(fly.metadata.name, summaries.get("first_major_event_time"))

Other metric families exposed on Fly: event_metrics (per-event tables),
f1_metrics (F1-paradigm only), learning_metrics, and the underlying
tracking_data (a FlyTrackingData). See
src/ballpushing_utils/README_Ballpushing_metrics.md
for the full metric reference.
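The touch-once caching described above is the standard `functools.cached_property` pattern. A minimal sketch of the idea (this `FlySketch` stand-in and its metric names are illustrative, not the library's actual class):

```python
from functools import cached_property


class FlySketch:
    """Stand-in illustrating lazily computed, cached metric families."""

    def __init__(self, raw_event_times):
        self.raw_event_times = raw_event_times
        self.n_computations = 0  # only here to demonstrate caching

    @cached_property
    def event_summaries(self):
        # Computed on first access, then stored on the instance; later
        # accesses return the cached dict without recomputing.
        self.n_computations += 1
        return {
            "n_events": len(self.raw_event_times),
            "first_event_time": min(self.raw_event_times, default=None),
        }
```

Accessing `fly.event_summaries` repeatedly triggers the computation only once, which is what makes touching many metric families on many flies cheap after the first pass.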
A worked example for a single panel using the shared plotting/stats helpers:
import pandas as pd
import matplotlib.pyplot as plt
from ballpushing_utils import dataset, figure_output_dir
from ballpushing_utils.plotting import (
    paired_boxplot_with_significance, resize_axes_cm, set_illustrator_style,
)
from ballpushing_utils.stats import permutation_test
set_illustrator_style()
df = pd.read_feather(dataset("MagnetBlock/.../pooled_summary.feather"))
control = df.loc[df.Magnet == "n", "first_major_event_time"].to_numpy()
test = df.loc[df.Magnet == "y", "first_major_event_time"].to_numpy()
perm = permutation_test(control, test, statistic="median", n_permutations=10_000)
fig, ax = plt.subplots()
paired_boxplot_with_significance(ax, [control, test], p_value=perm.p_value)
resize_axes_cm(fig, ax, width_cm=1.75, height_cm=2.25)
fig.savefig(figure_output_dir("MyFig", __file__) / "panel.pdf", dpi=300)

The permutation test seeds the legacy NumPy RandomState(42) so the
p-values it returns match the published values bit-for-bit.
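That reproducibility contract boils down to seeding a legacy `RandomState` before any shuffling. A self-contained sketch of a median-difference permutation test under that convention (the helper name and defaults are illustrative; the library's `permutation_test` is the canonical implementation):

```python
import numpy as np


def permutation_p_value(a, b, n_permutations=10_000, seed=42):
    # Legacy RandomState(seed) fixes the shuffle sequence, so repeated
    # runs return bit-for-bit identical p-values.
    rng = np.random.RandomState(seed)
    a, b = np.asarray(a, float), np.asarray(b, float)
    observed = abs(np.median(a) - np.median(b))
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(pooled)
        diff = abs(np.median(perm[:len(a)]) - np.median(perm[len(a):]))
        if diff >= observed:
            count += 1
    return count / n_permutations
```

Swapping `RandomState` for the newer `default_rng` would still be valid statistically, but would change the shuffle order and hence the exact p-values.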
When a recording looks wrong (events mis-classified, metrics out of
range, NaNs appearing) start with the diagnostics layer. The builders
under ballpushing_utils.diagnostics return plain DataFrames +
matplotlib.Figures, so they're equally at home in a script, a
notebook, or a dashboard:
- notebooks/diagnostics_demo.ipynb walks through build_event_timeline and
  build_metric_report against a stub fly — runs offline so it doubles as a
  smoke test for new installs.
- python tools/diagnostics_dashboard.py <fly_path> serves an interactive
  Panel app with the event table, the Gantt-style timeline (thresholds
  tweakable via sliders), and the metric-range report.
- write_report(...) materialises any report into a per-run folder with
  summary.md, per-section CSVs, and plots/*.png.
The hermetic invariants of these builders are locked down in
tests/unit/diagnostics/, which is what CI runs on every push.
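The "plain DataFrames + Figures" convention is what lets one builder serve scripts, notebooks, and dashboards alike. A hedged sketch of the shape (the function name and columns are illustrative, not the actual `diagnostics` API):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so this runs in scripts and CI alike

import matplotlib.pyplot as plt
import pandas as pd


def build_event_timeline_sketch(events):
    """Return (DataFrame, Figure), mirroring the builders' contract."""
    df = pd.DataFrame(events, columns=["start", "end", "label"])
    df["duration"] = df["end"] - df["start"]

    fig, ax = plt.subplots()
    # Gantt-style bars: one row per event, positioned at its start time.
    ax.barh(df.index, df["duration"], left=df["start"])
    ax.set_yticks(df.index, df["label"])
    ax.set_xlabel("time (s)")
    return df, fig
```

Because the return values are ordinary pandas/matplotlib objects, a notebook can display them inline, a script can save them, and a Panel app can wrap them in widgets without any adapter code.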
| Paper figure | Script(s) | Reads |
|---|---|---|
| Fig. 1 — setup & wild-type baseline | figures/Fig1-setup/plot_wildtype_trajectories.py<br>figures/Fig1-setup/plot_simulation_trajectories.py<br>figures/Fig1-setup/learning_trials_duration.py<br>figures/Fig1-setup/compute_distribution_stats.py | wild-type trajectory + summary feathers |
| Fig. 2 — affordance (MagnetBlock + F1) | figures/Fig2-Affordance/fig2_magnetblock_first_major_push_time.py<br>figures/Fig2-Affordance/fig2_magnetblock_first_major_push_index.py<br>figures/Fig2-Affordance/plot_magnetblock_trajectories.py<br>figures/Fig2-Affordance/fig2_f1_control_conditions.py<br>figures/Fig2-Affordance/fig2_f1_heatmaps_pretraining.py | MagnetBlock pooled summary; F1 pre-training datasets |
| Fig. 3 — neural silencing screen | figures/Fig3-Screen/fig3_screen_heatmap.py<br>figures/Fig3-Screen/fig3_f1_tnt.py | TNT screen + F1-TNT pooled summaries |
| ED Fig. 6 — behavioural dendrogram | figures/EDFigure6-Dendrogram/edfigure6_dendrogram.py | wild-type metric matrix |
Supplementary panels (feeding-state, ball scents, ball types, dark
olfaction, learning mutants, broad TNT screen, etc.) live under
plots/Supplementary_exps/ and plots/Ballpushing_PR/. They follow the
same script.py → PDF + stats.csv convention as the figure scripts.
The dataverse exposes the per-fly summary feathers used by every figure script, so most users will never need this step. If you do want to re-process raw tracks, the pipeline is:
1. Drop SLEAP .h5 exports under $BALLPUSHING_DATA_ROOT/<experiment>/.
2. Describe the experiment batch in a YAML file under experiments_yaml/
   (genotypes, replicate dates, conditions). See any of the existing files
   for the schema.
3. Build the dataset: python src/dataset_builder.py <yaml> produces per-fly
   metric tables.
4. Pool feathers: python src/pool_feather_files.py concatenates
   per-experiment feathers into the pooled_summary.feather files the figure
   scripts read.
5. Run the figure scripts as usual.
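The feather-pooling step is conceptually a tagged concatenation. A minimal pandas sketch of the idea, not the actual `pool_feather_files.py` logic (`pool_summaries` and the experiment names are illustrative; with real data each frame would come from `pd.read_feather(path)`):

```python
import pandas as pd


def pool_summaries(per_experiment_frames):
    # Concatenate per-experiment summary tables into one pooled table,
    # keeping track of which experiment each fly came from so downstream
    # scripts can group or filter by batch.
    tagged = [
        df.assign(experiment=name)
        for name, df in per_experiment_frames.items()
    ]
    return pd.concat(tagged, ignore_index=True)
```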
pip install -e ".[dev]"
pytest tests/unit # hermetic suite — no SLEAP data required
pytest tests/integration  # integration suite — needs BALLPUSHING_DATA_ROOT

tests/unit/ runs against stub flies/experiments and is what CI
executes on every push
(see .github/workflows/tests.yml).
It covers the diagnostics builders
(ballpushing_utils.diagnostics.{event_timeline,metric_report,report})
and the reproducibility contracts of the permutation test
(ballpushing_utils.stats.permutation_test, both the legacy
RandomState / median path and the screen-panel
default_rng / mean / plus_one path).
tests/integration/ is currently mid-triage — see
tests/integration/REVIEW.md for the
per-file plan.
Configuration lives in pyproject.toml under [tool.pytest.ini_options].
The project uses Black, Ruff, and pytest, all configured in
pyproject.toml. Recommended workflow:
black src tests figures
ruff check src tests figures
pytest

Hardcoded data paths in any new script will fail review — always go
through ballpushing_utils.paths.dataset(...) and
ballpushing_utils.paths.figure_output_dir(...).
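The rule exists because the paths helpers centralise env-var resolution in one place. A stdlib sketch of the idea (the defaults are the ones stated in this README; the function bodies are illustrative, not the actual contents of paths.py):

```python
import os
from pathlib import Path


def data_root():
    # Falls back to the EPFL lab share when BALLPUSHING_DATA_ROOT is unset.
    return Path(os.environ.get("BALLPUSHING_DATA_ROOT", "/mnt/upramdya_data/MD"))


def figures_root():
    # Default output location: <data root>/Affordance_Figures.
    default = data_root() / "Affordance_Figures"
    return Path(os.environ.get("BALLPUSHING_FIGURES_ROOT", default))


def dataset_path(relative):
    # Scripts resolve every dataset path relative to the data root, so a
    # single env var repoints the whole pipeline at a new download.
    return data_root() / relative
```

Routing every path through helpers like these is what lets the same figure script run unmodified on the lab share, a laptop with the dataverse download, and CI.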
Source code: MIT. © 2024–2026 Neuroengineering Laboratory @EPFL — Ramdya
Lab. See LICENSE.
Please cite the paper above when using the library, the metrics, or the dataset in your work.