npow
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml
diff --git a/.gitignore b/.gitignore
diff --git a/AGENTS.md b/AGENTS.md
diff --git a/LICENSE b/LICENSE
diff --git a/README.md b/README.md
@@ -0,0 +1,32 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10", "3.12"]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Set up Python ${{ matrix.python-version }}
+        run: uv python install ${{ matrix.python-version }}
+
+      - name: Install dependencies
+        run: uv sync --extra dev
+
+      - name: Lint
+        run: uv run ruff check src/ tests/
+
+      - name: Test
+        run: uv run pytest -v
@@ -0,0 +1,28 @@
+name: Publish to PyPI
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  id-token: write
+
+jobs:
+  publish:
+    runs-on: ubuntu-latest
+    environment: pypi
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Set up Python
+        run: uv python install 3.12
+
+      - name: Build package
+        run: uv build
+
+      - name: Publish to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
@@ -0,0 +1,43 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.egg-info/
+*.egg
+dist/
+build/
+.eggs/
+
+# Virtual environments
+.venv/
+venv/
+ENV/
+
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+*~
+.DS_Store
+
+# Testing
+.pytest_cache/
+.coverage
+htmlcov/
+.mypy_cache/
+.ruff_cache/
+
+# Benchmark reports (generated)
+benchmarks/reports/
+
+# HuggingFace cache (downloaded datasets)
+.cache/
+hub/
+
+# Environment
+.env
+.env.local
+
+# uv
+uv.lock
@@ -0,0 +1,68 @@
+# AGENTS.md — Kompact Context Optimization Proxy
+
+## What is Kompact?
+
+A transparent proxy that optimizes LLM context through multi-layer transforms.
+Sits between agents (Claude Code, Cursor, etc.) and providers (Anthropic, OpenAI).
+
+## Architecture
+
+```
+Request → Proxy → [Layer 1: Schema] → [Layer 2: Content] → [Layer 3: History] → [Layer 4: Cache] → Provider
+```
+
+## Entry Points
+
+| What | Where | Notes |
+|------|-------|-------|
+| CLI | `src/kompact/__main__.py` | `kompact proxy --port 7878` |
+| Proxy server | `src/kompact/proxy/server.py` | FastAPI, intercepts API requests |
+| Transform pipeline | `src/kompact/transforms/pipeline.py` | Orchestrates all transforms |
+| Configuration | `src/kompact/config.py` | Pydantic settings |
+| Core types | `src/kompact/types.py` | Message, ToolOutput, TransformResult |
+
+## Transforms (each is an independent, pure function)
+
+| Transform | File | Layer | Typical Savings |
+|-----------|------|-------|-----------------|
+| TOON format | `src/kompact/transforms/toon.py` | 2 (Content) | 30-60% on JSON arrays |
+| Observation masker | `src/kompact/transforms/observation_masker.py` | 3 (History) | 50% on old tool outputs |
+| Cache aligner | `src/kompact/transforms/cache_aligner.py` | 4 (Cache) | Enables provider caching |
+| JSON crusher | `src/kompact/transforms/json_crusher.py` | 2 (Content) | 40-80% on structured data |
+| Schema optimizer | `src/kompact/transforms/schema_optimizer.py` | 1 (Schema) | 50-90% on tool defs |
+| Code compressor | `src/kompact/transforms/code_compressor.py` | 2 (Content) | ~70% on code blocks |
+| Log compressor | `src/kompact/transforms/log_compressor.py` | 2 (Content) | 60-90% on log output |
+
+## Key Invariants
+
+1. **All transforms are pure functions**: `list[Message] → TransformResult`
+2. **No transform modifies user messages** — only assistant/tool/system content
+3. **Every transform tracks `tokens_saved`** via `TransformResult`
+4. **Transforms are composable** — pipeline runs them in sequence
+
+## Documentation
+
+| Doc | Path | Purpose |
+|-----|------|---------|
+| PRD | `docs/prd.md` | Product requirements |
+| SDD | `docs/sdd.md` | System design |
+| Architecture | `docs/architecture.md` | Layer details |
+| Benchmarks | `docs/benchmarks.md` | Evaluation strategy |
+| Quality | `docs/quality.md` | Quality grades per domain |
+| Research | `docs/research/` | SOTA survey, competitors, economics |
+
+## Testing
+
+```bash
+uv run pytest                           # All tests
+uv run pytest tests/test_toon.py        # Single transform
+uv run python benchmarks/compression_ratio.py  # Benchmarks
+```
+
+## Quick Start
+
+```bash
+uv sync
+uv run kompact proxy --port 7878
+# Then: ANTHROPIC_BASE_URL=http://localhost:7878 claude
+```
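A minimal sketch of the Key Invariants listed in AGENTS.md above, assuming simplified shapes for `Message` and `TransformResult`. The real definitions live in `src/kompact/types.py` and are not part of this diff, so the field names, the `count_tokens` helper, the `dedupe_repeated_lines` transform, and `run_pipeline` are all hypothetical illustrations rather than Kompact's actual API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for the Message type in src/kompact/types.py."""
    role: str      # "user", "assistant", "tool", or "system"
    content: str


@dataclass(frozen=True)
class TransformResult:
    """Hypothetical stand-in: transformed messages plus tokens saved."""
    messages: list[Message]
    tokens_saved: int


Transform = Callable[[list[Message]], TransformResult]


def count_tokens(text: str) -> int:
    # Crude whitespace split, for illustration only (not a real tokenizer).
    return len(text.split())


def dedupe_repeated_lines(messages: list[Message]) -> TransformResult:
    """Toy transform: collapse consecutive duplicate lines in non-user messages."""
    out: list[Message] = []
    saved = 0
    for msg in messages:
        if msg.role == "user":
            out.append(msg)  # invariant 2: user messages pass through untouched
            continue
        kept: list[str] = []
        for line in msg.content.splitlines():
            if not kept or line != kept[-1]:
                kept.append(line)
        compact = "\n".join(kept)
        saved += count_tokens(msg.content) - count_tokens(compact)
        out.append(Message(msg.role, compact))  # new object, input is not mutated
    return TransformResult(out, saved)  # invariant 3: report tokens_saved


def run_pipeline(messages: list[Message], transforms: list[Transform]) -> TransformResult:
    """Invariant 4: apply transforms in sequence and sum their savings."""
    current = list(messages)
    total_saved = 0
    for transform in transforms:
        result = transform(current)
        current = result.messages
        total_saved += result.tokens_saved
    return TransformResult(current, total_saved)


if __name__ == "__main__":
    history = [
        Message("user", "run the build\nrun the build"),
        Message("tool", "ok\nok\nok\ndone"),
    ]
    result = run_pipeline(history, [dedupe_repeated_lines])
    print(result.messages[1].content)
    print(result.tokens_saved)
```

Because each transform returns new `Message` objects instead of mutating its input, the same history can be fed to different transform orderings and the results compared side by side.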
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Kompact Contributors
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,69 @@
+# Kompact
+
+[![CI](https://github.com/npow/kompact/actions/workflows/ci.yml/badge.svg)](https://github.com/npow/kompact/actions/workflows/ci.yml)
+[![PyPI](https://img.shields.io/pypi/v/kompact.svg)](https://pypi.org/project/kompact/)
+[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)
+
+Context compression proxy for LLM agents. Reduces token usage 40-70% with zero code changes.
+
+## Quick Start
+
+```bash
+uv sync
+uv run kompact proxy --port 7878
+
+# Point your agent at it
+export ANTHROPIC_BASE_URL=http://localhost:7878
+```
+
+## Benchmarks
+
+### Offline Compression (12,795 examples)
+
+How much does each system compress, and does the answer survive? Evaluated with [context-bench](https://pypi.org/project/context-bench/).
+
+| Dataset | Examples | Kompact Compression | NIAH | Effective Ratio |
+|---------|----------|--------------------:|-----:|----------------:|
+| BFCL | 1,431 | 55.3% | 90% | 48.2% |
+| Glaive v2 | 3,959 | 56.6% | 100% | 56.6% |
+| HotpotQA | 7,405 | 17.9% | 91% | 8.8% |
+
+### End-to-End Quality (8,836 examples)
+
+Does compression change the LLM's answers? Each example was sent through Claude (via [claude-relay](https://github.com/npow/claude-relay)) under three conditions: no compression, Kompact, and [Headroom](https://github.com/headroom-ai/headroom) (SmartCrusher + ToolCrusher). Scores are **Contains** rates: the share of examples whose expected answer is found in the LLM response.
+
+| Dataset | Examples | Baseline | Kompact | Headroom |
+|---------|----------|--------:|--------:|---------:|
+| **BFCL** | 1,431 | 29.3% | **36.4%** | 31.4% |
+| **HotpotQA** | 7,405 | 80.6% | 80.3% | 80.6% |
+
+Kompact **improves** answer quality on agentic workloads: on BFCL it scores 7.1 percentage points above baseline and 5.0 points above Headroom. Compression removes noise from tool schemas, which helps the model focus. Quality is preserved on prose (HotpotQA stays within 0.3 points of baseline).
+
+### Cost Impact
+
+Assuming 1,000 agentic requests/day with average 10K-token contexts (typical for tool-calling agents), which comes to 300M input tokens over a 30-day month, priced per million input tokens:
+
+| Model | Before | After (55% compression) | Monthly Savings |
+|-------|-------:|------------------------:|----------------:|
+| Claude Sonnet ($3/M) | $900/mo | $405/mo | **$495/mo** |
+| Claude Opus ($15/M) | $4,500/mo | $2,025/mo | **$2,475/mo** |
+| GPT-4o ($2.50/M) | $750/mo | $338/mo | **$412/mo** |
+
+```bash
+uv run python benchmarks/run_dataset_eval.py  # offline compression
+uv run python benchmarks/run_e2e_eval.py      # end-to-end through proxy
+```
+
+See [`benchmarks/README.md`](benchmarks/README.md) for full methodology and per-scenario results.
+
+## Development
+
+```bash
+uv sync --extra dev
+uv run pytest
+uv run ruff check src/ tests/
+```
+
+## License
+
+MIT
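The Cost Impact table in the README above can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: a 30-day month, input tokens only, and the per-million prices exactly as listed in the table. The function names `monthly_cost` and `with_compression` are hypothetical and do not come from the Kompact codebase.

```python
def monthly_cost(requests_per_day: int, tokens_per_request: int,
                 price_per_million: float, days: int = 30) -> float:
    """Input-token cost per month in dollars (assumes a 30-day month)."""
    tokens = requests_per_day * tokens_per_request * days
    return tokens / 1_000_000 * price_per_million


def with_compression(before: float, compression: float) -> tuple[float, float]:
    """Return (cost after compression, savings) for a fractional compression rate."""
    after = before * (1.0 - compression)
    return after, before - after


if __name__ == "__main__":
    # Prices per million input tokens, copied from the README table.
    prices = {"Claude Sonnet": 3.00, "Claude Opus": 15.00, "GPT-4o": 2.50}
    for model, price in prices.items():
        before = monthly_cost(1_000, 10_000, price)
        after, saved = with_compression(before, 0.55)
        print(f"{model}: before ${before:,.2f}, after ${after:,.2f}, saved ${saved:,.2f}")
```

For GPT-4o the exact values are $337.50 after compression and $412.50 saved; the table rounds these to whole dollars. The 55% figure matches the offline compression measured on BFCL and Glaive v2, so workloads closer to HotpotQA (17.9%) would see proportionally smaller savings.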