Helix Context

Coordinate-index engine for LLM agents. Retrieves, weighs, and compresses your codebase into a context window — without a single LLM call on the retrieval path.

Proof (30 seconds)

WIP benchmark numbers — compressor disabled (default LLM-free config), N=15 query shapes, May 2026:

metric	tokens	vs standard RAG (top-5 @ 1500)
median	2,757	2.9× fewer tokens
best (focused query)	1,410	5.7×
worst (broad 12-doc)	3,755	2.1×

With the optional compressor enabled (Claude Haiku splice), the median reduction improves to ~5×. In multi-turn sessions, the session delivery register elides already-seen documents; a 37× reduction was observed on repeated retrievals within a conversation.
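
The table reports reduction factors but not the baseline token count itself. As a rough sanity check, multiplying each row back out gives an implied baseline close to 8,000 tokens per query. That figure is derived here from the table and is not a number stated by the benchmark; the helper names are illustrative.

```python
# Rows copied from the benchmark table: (label, Helix tokens, reported factor).
ROWS = [
    ("median", 2757, 2.9),
    ("best (focused query)", 1410, 5.7),
    ("worst (broad 12-doc)", 3755, 2.1),
]


def implied_baseline(tokens: int, factor: float) -> float:
    """Baseline token count implied by 'factor x fewer tokens'."""
    return tokens * factor


def reduction_factor(baseline_tokens: float, helix_tokens: float) -> float:
    """How many times smaller the Helix context is than the baseline."""
    return baseline_tokens / helix_tokens


if __name__ == "__main__":
    for label, tokens, factor in ROWS:
        baseline = implied_baseline(tokens, factor)
        print(f"{label}: implied baseline ~{baseline:,.0f} tokens")
```

Running the same arithmetic on your own reproducer output is a quick way to confirm that the ratio and the raw token counts agree.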

Reproducer: python benchmarks/bench_rag_vs_sike_tokens.py against your own genome.

Agent contract: every /context response carries know { found, confidence } (grounded — you may answer) or miss { reason, escalate_to } (not found — don't answer from genome). Stale results downgrade to miss(reason="stale"|"cold"|"superseded") via the freshness gate.
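
A minimal sketch of how an agent loop might honor this contract. The field names (`found`, `confidence`, `reason`, `escalate_to`) come from the line above; the exact nesting of the JSON, the `next_action` helper, and the `"malformed"` fallback reason are assumptions made for illustration.

```python
from typing import Any


def next_action(response: dict[str, Any]) -> dict[str, Any]:
    """Map a /context response to what the agent should do next.

    Assumed shape (hypothetical nesting): either
      {"know": {"found": ..., "confidence": ...}} or
      {"miss": {"reason": ..., "escalate_to": [...]}}.
    """
    if "know" in response:
        know = response["know"]
        return {"action": "answer", "confidence": know.get("confidence")}
    if "miss" in response:
        miss = response["miss"]
        return {
            "action": "escalate",
            "reason": miss.get("reason"),
            "escalate_to": miss.get("escalate_to", []),
        }
    # Neither key present: refuse to answer rather than guess.
    return {"action": "escalate", "reason": "malformed", "escalate_to": []}
```

The point of the design is that the "don't answer" path is explicit: a stale, cold, or superseded result reaches the agent as a `miss` with a reason, not as weaker context.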

Get started (60 seconds)

# 1. Install
pip install helix-context
python -m spacy download en_core_web_sm

# 2. Ingest your codebase
helix ingest path/to/your/project/ --recursive

# 3. Query it
helix query "how does the splice step work?"

# 4. Or start the proxy for IDE integration
helix-server   # binds to 127.0.0.1:11437

For extras matrix, BGE-M3 backfill, and tray setup: docs/SETUP.md.

Agent surfaces

Three ways to drive Helix — same retrieval primitives, same JSON shapes:

Surface	Best for	Example
CLI	Scripts, CI, cold-start agents	`helix query "..." --json`
MCP	Claude Code, Cursor, Claude Desktop	Add to `settings.json`
HTTP proxy	Continue IDE, `OPENAI_BASE_URL` redirect	`POST /context`

# CLI — no server, no daemon, subprocess-drivable
helix query    "what does the splice step do?" --json
helix packet   "edit the splice step" --task-type edit --json
helix gene get abc123 --json
helix neighbors "splice step" --k 10 --json
helix refresh-targets "edit the splice step" --json
helix status
helix diag corpus

Full CLI reference: docs/clients/cli.md. MCP tool schemas: docs/api/mcp-tools.md.
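
Because the CLI is subprocess-drivable, a script can call it and parse the result. The sketch below assumes that `--json` prints a single JSON document on stdout; `build_query_cmd` and `run_json` are hypothetical helper names, and the `runner` parameter exists only so the helper can be exercised without helix installed.

```python
import json
import subprocess
from typing import Any, Callable, Sequence


def build_query_cmd(question: str) -> list[str]:
    """argv for `helix query "<question>" --json`, as listed above."""
    return ["helix", "query", question, "--json"]


def run_json(
    argv: Sequence[str],
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Any:
    """Run a helix command and parse its stdout as JSON (assumed format)."""
    # check=True raises CalledProcessError on a non-zero exit code.
    proc = runner(list(argv), capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

Typical use would be `run_json(build_query_cmd("what does the splice step do?"))` from a CI job or a cold-start agent.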

Pipeline (2 minutes)

Seven stages per turn, all LLM-free except optional splice:

  query
    │
    ▼
┌──────────────┐
│ 0. Classify  │  rule-based: decoder mode + assembly cap
└──────┬───────┘
       ▼
┌──────────────┐
│ 1. Extract   │  heuristic keyword + entity extraction
└──────┬───────┘
       ▼
┌──────────────┐  FTS5 BM25 + BGE-M3 dense (1024-dim) + tags
│ 2. Retrieve  │  + synonym expansion + co-activation + SR
│              │  ranked via RRF or additive fusion
└──────┬───────┘
       ▼
┌──────────────┐
│ 3. Re-rank   │  CPU classifier scores (optional)
└──────┬───────┘
       ▼
┌──────────────┐
│ 4. Splice    │  Headroom Kompress (CPU) or LLM compressor
└──────┬───────┘
       ▼
┌──────────────┐  token budget + legibility headers (fired tiers,
│ 5. Assemble  │  confidence ◆/◇/⬦, compression ratio) +
│   + Stage 7  │  freshness gate (stale/cold/superseded → miss)
└──────┬───────┘  + session delivery (elide already-seen docs)
       ▼
┌──────────────┐
│ 6. Persist   │  query+response → knowledge store (background)
└──────┬───────┘
       ▼
   know { } or miss { }
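
Stage 2 ranks candidates "via RRF or additive fusion". The sketch below shows the two generic techniques side by side. It is not Helix's implementation: the function names are hypothetical, and `k = 60` is a commonly used RRF constant, not necessarily the value Helix uses.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; break ties by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def additive_fuse(score_maps: list[dict[str, float]], weights: list[float]) -> list[str]:
    """Additive fusion: weighted sum of each retriever's raw scores."""
    totals: dict[str, float] = defaultdict(float)
    for score_map, weight in zip(score_maps, weights):
        for doc_id, score in score_map.items():
            totals[doc_id] += weight * score
    return sorted(totals, key=lambda d: (-totals[d], d))
```

RRF only looks at ranks, so retrievers with very different score scales (BM25 versus cosine similarity, for example) contribute comparably. Additive fusion uses the raw scores, so its result depends on how those scores are weighted and scaled.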

know/miss contract: know means the context is grounded and the agent may answer. miss means do not answer from the genome; escalate via the escalate_to tools or refetch from refresh_targets.
Caller model class: /context accepts caller_model_class: "generic" | "small_moe" | "frontier" to select the render branch (ordering, assembly cap, decoder mode). See docs/api/context-endpoint.md §7.
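
A hedged sketch of a request builder for `POST /context`. The `caller_model_class` field and its three values come from the text above, and the address is the default bind from the quick start. The `"query"` field name, the `"generic"` default, and the helper name are assumptions; check docs/api/context-endpoint.md for the real schema.

```python
import json
import urllib.request

CALLER_MODEL_CLASSES = ("generic", "small_moe", "frontier")


def build_context_request(
    query: str,
    caller_model_class: str = "generic",
    base_url: str = "http://127.0.0.1:11437",
) -> urllib.request.Request:
    """Build (but do not send) a POST /context request."""
    if caller_model_class not in CALLER_MODEL_CLASSES:
        raise ValueError(f"unknown caller_model_class: {caller_model_class!r}")
    # "query" is an assumed field name for the user's question.
    body = json.dumps({"query": query, "caller_model_class": caller_model_class})
    return urllib.request.Request(
        f"{base_url}/context",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, `urllib.request.urlopen(build_context_request("how does the splice step work?", "frontier"))` would send it.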

Configuration (17 sections in helix.toml)

Section	Key settings
`[ribosome]`	`enabled`, `backend` (`"none"` / `"litellm"` / `"claude"` / `"deberta"`), query_expansion
`[hardware]`	Device auto-detection (CUDA → ROCm → MPS → CPU)
`[budget]`	`expression_tokens` (7k default), `max_genes_per_turn`, splice_aggressiveness, `legibility_enabled`, `session_delivery_enabled`
`[session]`	Synthetic session windows, default party_id
`[genome]`	`path` (`genomes/main/genome.db`), compact_interval, replicas
`[server]`	host, port, upstream
`[headroom]`	Optional Headroom proxy lifecycle
`[ingestion]`	`backend` (`"cpu"` / `"ollama"`), splade_enabled, entity_graph
`[context]`	Cold-tier retrieval: enabled, k, min_cosine
`[cymatics]`	Frequency-domain scoring, harmonic_links, distance_metric
`[classifier]`	Rule-based query classification thresholds
`[retrieval]`	`fusion_mode` (`"additive"` / `"rrf"`), SR, ray_trace_theta, seeded_edges
`[plr]`	Piecewise linear reranker model
`[know]`	Know/miss calibration: confidence_floor, margin_threshold
`[mem_sync]`	Auto-memory → helix sync: watch_dirs, interval
`[synonyms]`	Query expansion map (e.g., "cache" → ["redis", "ttl"])
`[abstain]`	Low-confidence abstention thresholds

Full reference: docs/config-reference.md.

Full endpoint reference

Core retrieval:

Endpoint	Purpose
`POST /context`	know/miss + expressed_context (primary)
`POST /context/packet`	Agent-safe bundle: verified / stale_risk / refresh_targets
`POST /context/refresh-plan`	Refresh targets only (reread plan)
`POST /fingerprint`	Navigation-first payload (scores, no body)
`GET /context/expand`	1-hop neighborhood from a gene_id
`POST /v1/chat/completions`	OpenAI-compatible proxy

Ingestion + maintenance:

Endpoint	Purpose
`POST /ingest`	Add content to the knowledge store
`POST /consolidate`	Rewrite stale docs from source fingerprints
`POST /admin/refresh`	Force retrieval-layer refresh
`POST /admin/vacuum`	Reclaim SQLite pages
`POST /admin/swap-db`	Hot-swap the .db file without restart

Identity + sessions:

Endpoint	Purpose
`POST /sessions/register`	Register agent participant
`GET /sessions`	List registered participants
`GET /session/{id}/manifest`	Session delivery log
`POST /hitl/emit`	Record HITL pause event

Diagnostics:

Endpoint	Purpose
`GET /stats`	Corpus metrics + compression ratio
`GET /health`	Model, doc count, calibration provenance
`GET /genes/{gene_id}`	Single document detail
`GET /debug/resonance`	Tier activation profile
`GET /metrics/tokens`	Token usage counters

Full schema: docs/api/endpoints.md.

Package structure (16 packages, post-PR #90)

Package	Purpose
`adapters/`	Cache, DAL, external retriever protocol
`backends/`	Compressor, BGE-M3 codec, DeBERTa, NLI, SEMA, SPLADE
`cli/`	`helix` CLI: query, packet, gene, neighbors, ingest, diag, config, status
`encoding/`	Chunking, fragments, legibility headers, Headroom bridge
`identity/`	CWoLa logger, session delivery, registry, provenance, claims
`pipeline/`	Tier logic, stage helpers
`retrieval/`	Expand, freshness, RRF/additive fusion, PLR, intent router, SR, seeded edges, query classifier
`scoring/`	Cymatics, know-calibration, know-decision, ray-trace, TCM
`server/`	FastAPI app factory + route modules (context, ingest, registry, admin)
`storage/`	DDL, indexes, co-activation graph
`telemetry/`	OTel metrics, histogram instrumentation
`vault/`	Obsidian vault export (diagnostic traces)
`launcher/`	System-tray supervisor
`mcp/`	MCP tool surface for Claude Code / Desktop
`integrations/`	ScoreRift bridge

Back-compat shims: genome.py, ribosome.py, server.py, replication.py, hgt.py re-export from new locations. Lexicon: docs/ROSETTA.md.

IDE + MCP integration

MCP setup (Claude Code / Cursor / Claude Desktop)

{
  "mcpServers": {
    "helix-context": {
      "command": "python",
      "args": ["-m", "helix_context.mcp_server"],
      "cwd": "/absolute/path/to/your/project",
      "env": { "HELIX_MCP_URL": "http://127.0.0.1:11437" }
    }
  }
}

Continue IDE

models:
  - name: Helix (Local)
    provider: openai
    model: gemma3:e4b
    apiBase: http://127.0.0.1:11437/v1
    apiKey: EMPTY
    roles: [chat]
    defaultCompletionOptions:
      contextLength: 128000
      maxTokens: 4096

OpenAI-compatible proxy (zero code changes)

OPENAI_BASE_URL=http://localhost:11437/v1 your-app

Knowledge store management

[genome]
path = "genomes/main/genome.db"   # relative to helix run directory

Backup (safe while running — WAL mode):

cp genomes/main/genome.db backups/genome-$(date +%Y%m%d).db
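
In WAL mode, recently committed writes can live in the separate `-wal` file until the next checkpoint, so a copy of the `.db` file alone may not include them. If you want a snapshot that does, SQLite's online backup API is one option. A minimal sketch using Python's built-in `sqlite3` module; `backup_sqlite` is a hypothetical helper, not a script shipped with Helix.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src: str, dest: str) -> None:
    """Copy a live SQLite database using the online backup API."""
    Path(dest).parent.mkdir(parents=True, exist_ok=True)
    source = sqlite3.connect(src)
    target = sqlite3.connect(dest)
    try:
        # Connection.backup copies a consistent snapshot, including pages
        # that currently exist only in the -wal file.
        source.backup(target)
    finally:
        target.close()
        source.close()
```

For example: `backup_sqlite("genomes/main/genome.db", "backups/genome-copy.db")`.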

BGE-M3 backfill (one-time, after install):

python scripts/backfill_bgem3_v2.py genomes/main/genome.db

Observability

scripts\setup-grafana-telem.ps1     # Windows
scripts/setup-grafana-telem.sh      # Linux / macOS

Dashboard: http://localhost:3000/d/helix-overview. Full surface: docs/architecture/OBSERVABILITY.md.

Gotchas

Knowledge store path is genomes/main/genome.db (not project root). Delete to start fresh.
BGE-M3 backfill is one-time post-install — embedding_dense_v2 IS NULL until you run scripts/backfill_bgem3_v2.py. Low retrieval rate without it.
Fusion mode defaults to "additive" (back-compat). Flip to "rrf" in [retrieval] after running scripts/calibrate_thresholds.py.
Session delivery (session_delivery_enabled = true) tracks delivered docs per session, elides repeats. ~40% token savings on multi-turn. Pass ignore_delivered: true in /context body for benchmarks.
know/miss contract requires the agent prompt fragment to be honored — without it, frontier models confabulate. Import helix_context.agent_prompt.full_fragment().
Naming lexicon: biology terms (gene, genome, ribosome) have canonical software equivalents (document, knowledge store, compressor). Both work in code; new code uses software terms. See docs/ROSETTA.md.
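
The session delivery gotcha above is easier to reason about with a toy model. The class below is an illustrative sketch, not Helix's session delivery code: it remembers which doc IDs each session has received and elides repeats, with an `ignore_delivered` switch mirroring the request flag. Whether the real flag still records deliveries is not stated in this README; the sketch assumes it does not.

```python
from collections import defaultdict


class SessionDeliveryRegister:
    """Remember which doc IDs each session has already received."""

    def __init__(self) -> None:
        self._delivered: dict[str, set[str]] = defaultdict(set)

    def filter(
        self,
        session_id: str,
        doc_ids: list[str],
        ignore_delivered: bool = False,
    ) -> list[str]:
        """Return the docs to send, eliding ones this session already saw."""
        if ignore_delivered:
            # Benchmark mode: send everything and leave the register untouched.
            return list(doc_ids)
        seen = self._delivered[session_id]
        fresh: list[str] = []
        for doc_id in doc_ids:
            if doc_id not in seen:
                seen.add(doc_id)
                fresh.append(doc_id)
        return fresh
```

This also shows why benchmarks need the flag: without it, a second identical query in the same session returns far fewer documents, and the token counts stop being comparable across runs.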

Testing

python -m pytest tests/ -m "not live" -v   # ~1950 tests, no external services

Documentation

Start here	Go deeper
Setup guide	Pipeline lanes
Troubleshooting	Retrieval dimensions
`/context` API	Knowledge graph
Config reference	Session registry
Agent SDK fragment	Observability
Operator runbooks	Launcher architecture

Acknowledgments

Built on: spaCy NER · Howard 2005 TCM · Stachenfeld 2017 SR · SQLite FTS5 BM25 · BGE-M3 · Kompress · Headroom

License

Apache-2.0. See NOTICE for third-party attributions.
