numereng

numereng is a local-first workspace for Numerai model development. One repo, one CLI, and a read-only dashboard for training, experiments, ensembles, and submissions.

numereng is built for Numerai participants who want an end-to-end local workflow for iterating on tournament models. It is experiment-centric, not just a CLI wrapper. numereng is pre-1.0 and community-built; the CLI is stable enough to build on but will continue to evolve.

numereng builds on the official Numerai Python packages numerapi and numerai-tools.

numereng is designed for agentic development. The CLI, typed Python API, and a library of agent skills give Claude Code, Codex, and similar agents a stable surface to drive the full experiment lifecycle.

Features

Local-first workspace. All runtime state lives under .numereng/ in your repo. No global daemon, no cloud account required.
Experiment-centric. Track configs, champions, reports, and full history under one experiment root.
Hyperparameter optimization. Optuna-backed search over one config and its search space.
Ensembles. Blend scored runs into one ranked prediction set.
Feature neutralization. Apply feature neutralization to any set of predictions.
Serving and hosted uploads. Freeze production packages and push pickles to Numerai-hosted models.
Remote and cloud training. SSH-driven remote workstations, EC2, and Modal for when local compute runs out.
Read-only dashboard. `just viz` gives you a mission-control UI over the current checkout.
Custom models. Drop model wrappers into src/numereng/features/models/custom_models/ and they are auto-discovered by the training pipeline.
Agent-extensible. Drop your own skills into .codex/skills/ so Claude Code, Codex, or similar agents can be pointed at workflows you author.
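
To make the "Ensembles" feature above more concrete, here is a minimal sketch of rank-average blending, one common way to combine several prediction sets into a single ranked set. This README does not say which blending method `numereng ensemble build` uses, so treat this as an illustration of the general idea only; `rank_pct` and `blend` are hypothetical helper names, not part of numereng.

```python
def rank_pct(values):
    """Map values to percentile ranks in (0, 1]; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with order[i].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank / len(values)
        i = j + 1
    return ranks


def blend(runs, weights=None):
    """Weighted mean of per-run percentile ranks, re-ranked into one prediction set."""
    weights = weights or [1.0] * len(runs)
    total = sum(weights)
    ranked = [rank_pct(run) for run in runs]
    mixed = [
        sum(w * r[i] for w, r in zip(weights, ranked)) / total
        for i in range(len(runs[0]))
    ]
    return rank_pct(mixed)


if __name__ == "__main__":
    run_a = [0.10, 0.50, 0.90]  # hypothetical predictions from one run
    run_b = [0.20, 0.40, 0.80]  # hypothetical predictions from another run
    print(blend([run_a, run_b]))
```

Ranking each run before averaging puts runs with different output scales on equal footing, which is why the approach is popular for tournament-style predictions.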

Prerequisites

Python 3.12+
uv package manager
git
Node.js 20+ — required for just viz (dashboard)
Numerai API credentials — NUMERAI_PUBLIC_ID and NUMERAI_SECRET_KEY exported in your shell. Required for dataset, round, and submission operations. See Installation.
(Optional) Docker — only if building hosted Numerai pickle packages locally.
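
Before running dataset, round, or submission commands, it can help to confirm that both credential variables are visible to the process. The following is a small sketch using only the Python standard library; `missing_credentials` is a hypothetical helper, not part of numereng.

```python
import os

# Variable names taken from the prerequisites list above.
REQUIRED_VARS = ("NUMERAI_PUBLIC_ID", "NUMERAI_SECRET_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Numerai credentials found.")
```

The check only confirms the variables are set; it does not validate them against the Numerai API.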

Quick Start

Clone the repo, install deps, initialize the local store, and launch the dashboard:

git clone https://github.com/dshap474/numereng.git
cd numereng
uv sync
uv run numereng store init
just viz

Dashboard UI: http://127.0.0.1:5173
Backend API: http://127.0.0.1:8502
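
If the dashboard does not load, a quick way to see whether anything is listening on the two ports listed above is a plain TCP connection attempt. This is a sketch using the standard library; `is_listening` is a hypothetical helper. It only tests that the ports accept connections, not that the services behind them are healthy.

```python
import socket

# Hosts and ports as listed in the Quick Start section.
ENDPOINTS = {
    "dashboard": ("127.0.0.1", 5173),
    "backend": ("127.0.0.1", 8502),
}


def is_listening(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, (host, port) in ENDPOINTS.items():
        state = "up" if is_listening(host, port) else "down"
        print(f"{name}: {host}:{port} is {state}")
```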

Contributors should use `uv sync --extra dev`, which adds test and lint tooling plus additional model backends. See CONTRIBUTING.md.

What You'll See

The dashboard is read-only: it surfaces experiments, runs, notes, and the embedded Numerai docs reader over your local workspace. See Dashboard & Monitor.

I Want To…

Task	Command
Train one standalone model	`uv run numereng run train --config <config-path>`
Train inside a tracked experiment	`uv run numereng experiment train ...`
Compare configs in one experiment	`uv run numereng experiment report --id <experiment-id>`
Hyperparameter search (Optuna)	`uv run numereng hpo create ...`
Autonomous agent research loop	`uv run numereng research init / run ...`
Blend runs into an ensemble	`uv run numereng ensemble build --run-ids ...`
Feature-neutralize predictions	`uv run numereng neutralize apply ...`
Package a production model	`uv run numereng serve package create ...`
Upload a hosted Numerai pickle	`uv run numereng serve pickle upload ...`
Submit a round	`uv run numereng run submit ...`
Train on a remote machine over SSH	`uv run numereng remote experiment launch ...`
Train on EC2 / Modal	`uv run numereng cloud ...`
Monitor live state	`just viz` or `uv run numereng monitor snapshot`
Sync official Numerai docs locally	`uv run numereng docs sync numerai`
Scrape the Numerai forum	`uv run numereng numerai forum scrape`

Your First Model

Before you start, populate the required Numerai datasets under .numereng/datasets/<data-version>/ — see Numerai Operations.

Create an experiment, train one config, inspect, and submit:

uv run numereng experiment create \
  --id 2026-04-19_baseline \
  --name "Baseline" \
  --hypothesis "LGBM on v5.2 small features"

# Author .numereng/experiments/2026-04-19_baseline/configs/r1_baseline.json
# (see the full walkthrough for the config schema)

uv run numereng experiment train \
  --id 2026-04-19_baseline \
  --config .numereng/experiments/2026-04-19_baseline/configs/r1_baseline.json

uv run numereng experiment report --id 2026-04-19_baseline
uv run numereng run submit --model-name <model-name> --run-id <run-id>

The experiment report starts with the high-level readout: current winner, resolved ambiguity, plateau rules, and the score-vs-scale chart for the runs in the experiment.

From there, Run Ops turns the same experiment into a dense comparison table. It keeps every run side by side, so you can sort by live metrics, inspect feature choices, and spot candidates worth opening.

Opening a run gives you the per-run drilldown: performance cards, cumulative diagnostics, artifacts, and lifecycle details in one place before you decide whether to keep iterating or submit.

The full walkthrough — including the config schema, scoring, and submission — is in docs/numereng/getting-started/first-model.md.
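
The create, train, and report steps in this section can also be scripted. The sketch below only assembles the argument lists for the commands and flags shown in this section and prints them by default. `first_model_commands` and `run_all` are hypothetical helpers, and the config file must still be authored by hand between the create and train steps.

```python
import subprocess
from pathlib import Path


def first_model_commands(experiment_id, config_name, name, hypothesis):
    """Build argv lists for the create, train, and report steps shown above."""
    config = Path(".numereng/experiments") / experiment_id / "configs" / config_name
    base = ["uv", "run", "numereng", "experiment"]
    return [
        base + ["create", "--id", experiment_id, "--name", name, "--hypothesis", hypothesis],
        # The config file must exist before this step runs.
        base + ["train", "--id", experiment_id, "--config", config.as_posix()],
        base + ["report", "--id", experiment_id],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_all(
        first_model_commands(
            "2026-04-19_baseline",
            "r1_baseline.json",
            "Baseline",
            "LGBM on v5.2 small features",
        )
    )
```

Passing arguments as a list avoids shell quoting problems for values with spaces, such as the hypothesis string.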

Workspace Layout

The repo checkout is the workspace. Runtime state is local and gitignored:

.numereng/
├── experiments/   # manifests, configs, reports, round-scored workflows
├── runs/          # run artifacts and scored outputs
├── datasets/      # Numerai datasets, baselines, downsampled variants
├── notes/         # research memory
├── cache/         # runtime caches (incl. pulled cloud archives)
├── tmp/           # managed scratch
├── remote_ops/    # remote orchestration state
└── numereng.db    # SQLite store index
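
As a rough sketch, the layout above can be inspected from Python with the standard library alone. The directory and file names come from the tree above; `workspace_status` and `list_tables` are hypothetical helpers. The table names inside `numereng.db` are not documented in this README, so the sketch simply lists whatever tables exist, and it opens the database read-only so the store is never modified. Some directories may not exist until the corresponding feature is first used.

```python
import sqlite3
from contextlib import closing
from pathlib import Path

EXPECTED_DIRS = ("experiments", "runs", "datasets", "notes", "cache", "tmp", "remote_ops")
STORE_DB = "numereng.db"


def workspace_status(repo_root="."):
    """Map each expected entry under .numereng/ to whether it exists."""
    root = Path(repo_root) / ".numereng"
    status = {name: (root / name).is_dir() for name in EXPECTED_DIRS}
    status[STORE_DB] = (root / STORE_DB).is_file()
    return status


def list_tables(db_path):
    """List table names in a SQLite file, opened read-only via a file: URI."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    for entry, present in workspace_status().items():
        print(f"{entry}: {'ok' if present else 'missing'}")
```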

Extension and authoring roots:

src/numereng/features/models/custom_models/ — drop in a custom model wrapper (auto-discovered)
src/numereng/features/agentic_research/PROGRAM.md — tracked base agentic research program
src/numereng/features/agentic_research/custom_programs/ — local custom research programs (gitignored)
.codex/skills/ — agent skills (shipped ones are tracked; add your own locally via the directory's gitignore allowlist)
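
For readers unfamiliar with directory-based auto-discovery, the following sketch shows one generic way such a mechanism can work: import every module in a folder and collect the classes that expose `fit` and `predict`. This is an assumption-laden illustration, not numereng's actual discovery code; the real wrapper and factory contract is described by the implement-custom-model skill. `discover_models` and the `fit`/`predict` convention are hypothetical.

```python
import importlib.util
import inspect
from pathlib import Path


def discover_models(directory):
    """Import each non-underscore .py file in `directory` and collect model-like classes."""
    found = {}
    for path in sorted(Path(directory).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private modules
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # runs the module's top-level code
        for name, obj in inspect.getmembers(module, inspect.isclass):
            defined_here = obj.__module__ == module.__name__
            has_fit = callable(getattr(obj, "fit", None))
            has_predict = callable(getattr(obj, "predict", None))
            if defined_here and has_fit and has_predict:
                found[name] = obj
    return found
```

Because importing a module executes its top-level code, a discovery step like this should only ever be pointed at directories you trust.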

Python API

For typed automation, the stable surface lives under numereng.api:

from numereng import api
from numereng.api.contracts import ExperimentReportRequest

# Build a typed request and fetch the report for one experiment.
report = api.experiment_report(
    ExperimentReportRequest(experiment_id="2026-04-19_baseline", limit=5)
)

See the Python API reference. For full local training orchestration, use numereng.api.pipeline.

Agent Skills

numereng ships a library of user-invocable agent skills under .codex/skills/. Each is a self-contained SKILL.md you can point Claude Code, Codex, or a similar agent at to drive a specific workflow.

experiment-design — Plan and run numereng experiments: round design, scout-to-scale decisions, plateau logic, reporting, and champion handoff.
numereng-experiment-ops — Source of truth for the numereng experiment contract: layout, config schema, templates, run artifacts, and valid CLI entrypoints.
implement-custom-model — Add a custom model plugin under custom_models/ using the existing wrapper and factory patterns.
store-ops — Safely maintain the local store: drift diagnosis, run cleanup and reset, reindex, and postcondition verification.
numerai-api-ops — Run API-only Numerai operations through numerapi plus direct GraphQL helpers.

Docs

To mirror the official Numerai docs locally (~500 files including images):

uv run numereng docs sync numerai

Contributing

See CONTRIBUTING.md.

Project Notes

numereng is distributed as a repo-clone workspace, not a PyPI package. Runtime state stays under gitignored paths (.numereng/, .env, real remote profile YAMLs). See Public Repo Boundary for the full contract and retained-corpus inventory.

Support, Security, License

Questions and bugs: see SUPPORT.md and the repo's GitHub issues.
Security: see SECURITY.md.
Licensed under MIT.

Built by @dshap474. numereng is community-built and is not affiliated with, endorsed by, or supported by Numerai.
