reader is a workbench for experimental data, where the unit of work is an experimental directory: you put raw inputs there, keep notebooks next to them, and write (outputs/) in the same place. reader includes a plugin-based pipeline runner (see docs/pipeline.md and docs/plugins.md), but more broadly these workspaces are where you can iterate on one experiment with a mix of:
- lightweight utilities (
ls,run, …) - repeatable steps (via
config.yaml+ CLI) - exploratory work (via marimo notebooks)
- Repo layout
- Installation
- Quickstart
- CLI workbench commands
- Running notebooks
- Maintaining dependencies
- Upgrading dependencies
reader is basically two things: a place for experiments, and a library/CLI that helps you analyze them.
reader/
├─ experiments/ # experiment workbench (configs + data + results)
│ └─ exp_001/
│ ├─ config.yaml # pipeline spec (steps: ingest→merge→transform→plot)
│ ├─ raw_data/ # instrument exports (or raw/)
│ ├─ notebooks/ # optional: marimo notebooks for this experiment
│ └─ outputs/ # generated: artifacts/, plots/, manifest.json, reader.log
│
└─ src/
└─ reader/ # optional: library for running config.yaml-driven workflows across experiments
├─ core/ # shared commands: run/explain/validate/ls/artifacts
├─ io/ # implement an instrument parser once (raw → tidy), reuse it across experiments
├─ plugins/ # thin adapters that expose io/lib operations to config.yaml steps
│ ├─ ingest/ # raw → tidy artifacts (canonical tables)
│ ├─ merge/ # tidy + metadata/table joins
│ ├─ transform/ # reusable cleanups + derived columns/channels
│ ├─ plot/ # optional shared plotting steps
│ └─ validator/ # optional schema gates / normalizers (coercion, shape checks)
├─ lib/ # reusable domain helpers (imported by plugins + notebooks)
└─ tests/This repo is managed with uv:
pyproject.tomldeclares dependencies (runtime + optional extras).uv.lockis the fully pinned dependency graph..venv/is the project virtual environment.
Two key commands:
uv syncinstalls everything from the lockfile into.venv.uv run <cmd>runs commands inside the project environment without requiringsource .venv/bin/activate.
-
Install uv. Below is for macOS/Linux (for other OSs see here).
curl -LsSf https://astral.sh/uv/install.sh | sh # ensure your uv bin dir is on PATH
-
Clone repo
git clone https://github.com/e-south/reader.git cd reader -
Create/sync the environment from the committed lockfile:
uv sync --locked
-
Sanity checks:
uv run python -c "import reader, pandas, pyarrow; print('ok')" -
Dev tooling is opt-in via a dependency group.
uv sync --locked --group dev --group notebooks uv run ruff --version uv run ruff check . uv run pytest -q
Option A: no .venv activation — use uv run
uv run reader --help
uv run reader lsOption B: traditional — activate .venv
source .venv/bin/activate
reader --help
reader ls
deactivatemkdir -p experiments/my_experiment/{raw_data,notebooks,outputs}Drop experimental data into raw_data/.
If the experiment has a config.yaml:
uv run reader explain experiments/my_experiment/config.yaml
uv run reader run experiments/my_experiment/config.yamlThe CLI is a set of helpers for the workspace.
reader ls --root experiments
reader plugins
reader steps <CONFIG or INDEX>
reader artifacts <CONFIG or INDEX>Common workflow helpers:
reader ls [--root DIR]— list experiments (finds**/config.yaml).reader explain CONFIG|DIR|INDEX— show the plan (what would run).reader validate CONFIG|DIR|INDEX— validate the config + plugin configs.reader run CONFIG|DIR|INDEX [--step STEP] [--resume-from ID] [--until ID]— execute (sliceable).reader run-step STEP --config CONFIG|DIR|INDEX— run exactly one step using existing artifacts.reader artifacts CONFIG|DIR|INDEX— list the latest artifact locations (manifest-backed).reader plugins— show discovered plugins (built-ins + entry points).
Note: ls only lists experiments that have a config.yaml. Notebooks can exist without configs,
but then they won’t be discoverable via ls.
There are two practical modes:
uv sync --locked --group notebooks
uv run marimo edit notebook.pyThis runs marimo inside your project environment, so it can import reader and anything in uv.lock.
Marimo can manage per-notebook sandbox environments using inline metadata. This is great for sharable notebooks.
-
Create/edit a sandbox notebook (marimo installed temporarily via uvx).
uvx marimo edit --sandbox notebook.py
-
Run a sandbox notebook as a script.
uv run notebook.py
-
Make the sandbox notebook use your local
readerrepo in editable mode.From the repo root:
uv add --script path/to/notebook.py . --editableThis writes inline metadata into the notebook so its sandbox can install reader from your local checkout in editable mode.
-
Add/remove sandbox dependencies (only affects the notebook file)
uv add --script notebook.py numpy uv remove --script notebook.py numpy
Note: You can also run claude code/codex in the terminal and ask it to edit a marimo notebook on your behalf. Make sure that you run your notebook with the watch flag turned on, like
marimo edit --watch notebook.py, to see updates appear live whenever agents makes a change.
If you want to change dependencies, prefer uv add / uv remove:
-
Add a runtime dependency:
uv add <package>
-
Add to a dependency group:
uv add --group dev <package>
-
Remove:
uv remove <package>
Then commit pyproject.toml + uv.lock.
If you edit pyproject.toml by hand, regenerate uv.lock:
uv lockNew users should then run:
uv sync --lockeddnadesign is pulled from GitHub and pinned to a specific commit in uv.lock. If dnadesign/main changes, a plain uv sync will keep using the locked commit until you explicitly upgrade it (see Astral Docs).
To pull the latest dnadesign and refresh your environment:
uv sync --upgrade-package dnadesignThis will update uv.lock** (bumping the pinned commit SHA for dnadesign) and then re-sync .venv to match. Commit the updated uv.lock so everyone gets the same version.
@e-south