numereng is a local-first workspace for Numerai model development. One repo, one CLI, and a read-only dashboard for training, experiments, ensembles, and submissions.
Built for Numerai participants who want an end-to-end local workflow for iterating on tournament models — experiment-centric, not just a CLI wrapper. numereng is pre-1.0 and community-built; the CLI is stable enough to build on but will continue to evolve.
numereng builds on the official Numerai Python packages numerapi and numerai-tools.
numereng is designed for agentic development. The CLI, typed Python API, and a library of agent skills give Claude Code, Codex, and similar agents a stable surface to drive the full experiment lifecycle end-to-end.
- Features
- Prerequisites
- Quick Start
- What You'll See
- I Want To…
- Your First Model
- Workspace Layout
- Python API
- Agent Skills
- Docs
- Contributing
- Project Notes
- Support, Security, License
- Local-first workspace. All runtime state lives under .numereng/ in your repo. No global daemon, no cloud account required.
- Experiment-centric. Track configs, champions, reports, and full history under one experiment root.
- Hyperparameter optimization. Optuna-backed search over one config and its search space.
- Ensembles. Blend scored runs into one ranked prediction set.
- Feature neutralization. Apply feature-neutralization to any set of predictions.
- Serving and hosted uploads. Freeze production packages and push pickles to Numerai-hosted models.
- Remote and cloud training. SSH-driven remote workstations, EC2, and Modal for when local compute runs out.
- Read-only dashboard. just viz gives you a mission-control UI over the current checkout.
- Custom models. Drop model wrappers into src/numereng/features/models/custom_models/ and they are auto-discovered by the training pipeline.
- Agent-extensible. Drop your own skills into .codex/skills/ so Claude Code, Codex, or similar agents can be pointed at workflows you author.
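To make the "Ensembles" bullet above concrete, here is a minimal, standard-library sketch of one common way to blend several runs into a single ranked prediction set: rank each run, average the ranks, then rank the result again. This is an illustration of the general idea only. The README does not say how numereng blends runs, and the function names `rank_pct` and `blend` are hypothetical.

```python
def rank_pct(values):
    """Map values to percentile ranks in (0, 1]; ties share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover every element tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / len(values)
        i = j + 1
    return ranks


def blend(runs, weights=None):
    """Weighted average of per-run percentile ranks, re-ranked at the end.

    `runs` is a list of equal-length prediction lists, one per run
    (hypothetical input shape, not numereng's on-disk format).
    """
    weights = weights or [1.0] * len(runs)
    total = sum(weights)
    ranked = [rank_pct(run) for run in runs]
    mixed = [
        sum(w * r[i] for w, r in zip(weights, ranked)) / total
        for i in range(len(runs[0]))
    ]
    return rank_pct(mixed)
```

Ranking before averaging keeps a run with a wider raw prediction range from dominating the blend, which is why rank-based blending is a frequent default for tournament-style predictions.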
- Python 3.12+
- uv package manager
- git
- Node.js 20+ (required for just viz, the dashboard)
- Numerai API credentials: NUMERAI_PUBLIC_ID and NUMERAI_SECRET_KEY exported in your shell. Required for dataset, round, and submission operations. See Installation.
- (Optional) Docker, only if building hosted Numerai pickle packages locally.
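As a quick preflight for the credentials requirement above, a small standard-library script can confirm that both variables are visible to the current shell. This is a sketch, not part of numereng; the helper name `missing_credentials` is hypothetical, and it only checks that the variables are non-empty, not that Numerai accepts them.

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = ("NUMERAI_PUBLIC_ID", "NUMERAI_SECRET_KEY")


def missing_credentials(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing Numerai credentials:", ", ".join(missing))
    else:
        print("Numerai credentials are set.")
```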
Clone the repo, install deps, initialize the local store, and launch the dashboard:
git clone https://github.com/dshap474/numereng.git
cd numereng
uv sync
uv run numereng store init
just viz
- Dashboard UI: http://127.0.0.1:5173
- Backend API: http://127.0.0.1:8502
Contributors should use uv sync --extra dev, which adds test and lint tooling plus additional model backends. See CONTRIBUTING.md.
The dashboard is read-only: it surfaces experiments, runs, notes, and the embedded Numerai docs reader over your local workspace. See Dashboard & Monitor.
| Task | Command |
|---|---|
| Train one standalone model | uv run numereng run train --config <config-path> |
| Train inside a tracked experiment | uv run numereng experiment train ... |
| Compare configs in one experiment | uv run numereng experiment report --id <experiment-id> |
| Hyperparameter search (Optuna) | uv run numereng hpo create ... |
| Autonomous agent research loop | uv run numereng research init / run ... |
| Blend runs into an ensemble | uv run numereng ensemble build --run-ids ... |
| Feature-neutralize predictions | uv run numereng neutralize apply ... |
| Package a production model | uv run numereng serve package create ... |
| Upload a hosted Numerai pickle | uv run numereng serve pickle upload ... |
| Submit a round | uv run numereng run submit ... |
| Train on a remote machine over SSH | uv run numereng remote experiment launch ... |
| Train on EC2 / Modal | uv run numereng cloud ... |
| Monitor live state | just viz or uv run numereng monitor snapshot |
| Sync official Numerai docs locally | uv run numereng docs sync numerai |
| Scrape the Numerai forum | uv run numereng numerai forum scrape |
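If you script the commands in the table above from Python (for example inside an agent harness), a thin wrapper around `subprocess` is usually enough. The sketch below only assembles the `uv run numereng ...` prefix shown in the table; the helper names `numereng_cmd` and `run_numereng` are hypothetical and not part of numereng, and the subcommand arguments are passed through unchanged.

```python
import shlex
import subprocess


def numereng_cmd(*args):
    """Build the argv list for a numereng CLI call run through uv."""
    return ["uv", "run", "numereng", *args]


def run_numereng(*args, dry_run=False):
    """Run a numereng CLI command and return its stdout.

    With dry_run=True, return the shell-quoted command string instead of
    executing it, which is handy for logging or review.
    """
    cmd = numereng_cmd(*args)
    if dry_run:
        return shlex.join(cmd)
    # check=True raises CalledProcessError on a non-zero exit code.
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

For instance, `run_numereng("experiment", "report", "--id", "2026-04-19_baseline", dry_run=True)` returns the command string without touching the workspace.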
Before you start, populate the required Numerai datasets under .numereng/datasets/<data-version>/ — see Numerai Operations.
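To check what is already on disk before training, you can list the contents of .numereng/datasets/ directly. The sketch below is a standard-library helper, not a numereng command. It assumes every immediate subdirectory is a data version, which may not hold exactly, since the same root can also store baselines and downsampled variants; the name `dataset_versions` is hypothetical.

```python
from pathlib import Path


def dataset_versions(workspace="."):
    """Map each subdirectory of .numereng/datasets/ to its sorted file names.

    Returns an empty dict when the datasets root does not exist yet.
    """
    root = Path(workspace) / ".numereng" / "datasets"
    if not root.is_dir():
        return {}
    return {
        version_dir.name: sorted(p.name for p in version_dir.iterdir() if p.is_file())
        for version_dir in sorted(root.iterdir())
        if version_dir.is_dir()
    }


if __name__ == "__main__":
    for version, files in dataset_versions().items():
        print(f"{version}: {len(files)} file(s)")
```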
Create an experiment, train one config, inspect, and submit:
uv run numereng experiment create \
--id 2026-04-19_baseline \
--name "Baseline" \
--hypothesis "LGBM on v5.2 small features"
# Author .numereng/experiments/2026-04-19_baseline/configs/r1_baseline.json
# (see the full walkthrough for the config schema)
uv run numereng experiment train \
--id 2026-04-19_baseline \
--config .numereng/experiments/2026-04-19_baseline/configs/r1_baseline.json
uv run numereng experiment report --id 2026-04-19_baseline
uv run numereng run submit --model-name <model-name> --run-id <run-id>
The experiment report starts with the high-level readout: current winner, resolved ambiguity, plateau rules, and the score-vs-scale chart for the runs in the experiment.
From there, Run Ops turns the same experiment into a dense comparison table. It keeps every run side by side, so you can sort by live metrics, inspect feature choices, and spot candidates worth opening.
Opening a run gives you the per-run drilldown: performance cards, cumulative diagnostics, artifacts, and lifecycle details in one place before you decide whether to keep iterating or submit.
The full walkthrough — including the config schema, scoring, and submission — is in docs/numereng/getting-started/first-model.md.
The repo checkout is the workspace. Runtime state is local and gitignored:
.numereng/
├── experiments/ # manifests, configs, reports, round-scored workflows
├── runs/ # run artifacts and scored outputs
├── datasets/ # Numerai datasets, baselines, downsampled variants
├── notes/ # research memory
├── cache/ # runtime caches (incl. pulled cloud archives)
├── tmp/ # managed scratch
├── remote_ops/ # remote orchestration state
└── numereng.db # SQLite store index
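Because numereng.db is a plain SQLite file, you can inspect it with Python's built-in `sqlite3` module. The sketch below opens the file read-only and lists its tables. It makes no assumption about numereng's schema, which is not documented here; the helper name `list_tables` is hypothetical and the default path simply follows the layout above.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path=".numereng/numereng.db"):
    """Return the table names in a SQLite file, opened in read-only mode."""
    # mode=ro makes SQLite refuse writes and fail if the file is missing,
    # so this cannot create or modify the store.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Opening read-only is a deliberate choice for ad hoc inspection, so the store index is never changed by accident.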
Extension and authoring roots:
- src/numereng/features/models/custom_models/: drop in a custom model wrapper (auto-discovered)
- src/numereng/features/agentic_research/PROGRAM.md: tracked base agentic research program
- src/numereng/features/agentic_research/custom_programs/: local custom research programs (gitignored)
- .codex/skills/: agent skills (shipped ones are tracked; add your own locally via the directory's gitignore allowlist)
For typed automation, the stable surface lives under numereng.api:
from numereng import api
from numereng.api.contracts import ExperimentReportRequest

report = api.experiment_report(
    ExperimentReportRequest(experiment_id="2026-04-19_baseline", limit=5)
)
See the Python API reference. For full local training orchestration, use numereng.api.pipeline.
numereng ships a library of user-invocable agent skills under .codex/skills/. Each is a self-contained SKILL.md you can point Claude Code, Codex, or a similar agent at to drive a specific workflow.
- experiment-design: Plan and run numereng experiments: round design, scout-to-scale decisions, plateau logic, reporting, and champion handoff.
- numereng-experiment-ops: Source of truth for the numereng experiment contract: layout, config schema, templates, run artifacts, and valid CLI entrypoints.
- implement-custom-model: Add a custom model plugin under custom_models/ using the existing wrapper and factory patterns.
- store-ops: Safely maintain the local store: drift diagnosis, run cleanup and reset, reindex, and postcondition verification.
- numerai-api-ops: Run API-only Numerai operations through numerapi plus direct GraphQL helpers.
- Installation
- First Model
- Project Layout
- Dashboard & Monitor
- Custom Models
- Serving & Model Uploads
- Architecture
- Troubleshooting Runbooks
- Agent Usage Guide
To mirror the official Numerai docs locally (~500 files including images):
uv run numereng docs sync numerai
See CONTRIBUTING.md.
numereng is distributed as a repo-clone workspace, not a PyPI package. Runtime state stays under gitignored paths (.numereng/, .env, real remote profile YAMLs). See Public Repo Boundary for the full contract and retained-corpus inventory.
- Questions and bugs: see SUPPORT.md and the repo's GitHub issues.
- Security: see SECURITY.md.
- Licensed under MIT.
Built by @dshap474. numereng is community-built and is not affiliated with, endorsed by, or supported by Numerai.





