Commit 2ebb76f

w7-mgfcode, w7-learn, and claude authored
chore: release v0.2.0 (#37)
* feat(registry): implement model registry for run tracking and deployments (#36)

* docs: expand INITIAL-7 with lifecycle, lineage, and artifact integrity details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(registry): implement model registry for run tracking and deployments

Add model registry feature (PRP-7) with:
- ORM models: ModelRun with JSONB columns (model_config, metrics, runtime_info), DeploymentAlias for mutable deployment pointers
- Storage: LocalFSProvider with SHA-256 integrity verification and path traversal prevention, abstract interface for future S3/GCS support
- Service: RegistryService with state machine validation, duplicate detection, config hashing, and run comparison
- API endpoints: CRUD for runs and aliases, artifact verification, run comparison with config/metrics diffs
- Database: Alembic migration with GIN indexes for JSONB containment queries
- Tests: 103 unit tests (schemas, storage, service) + 24 integration tests
- Example: registry_demo.py demonstrating full workflow

Run lifecycle: PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED
Aliases can only point to SUCCESS runs for deployment safety.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: update documentation for model registry implementation

- README.md: Add registry to project structure, API endpoints section, and example reference
- docs/ARCHITECTURE.md: Update section 7.6 with full implementation details, add registry endpoints to section 8, mark Phase 1 complete
- docs/PHASE-index.md: Mark phases 4-6 as completed, add detailed completion entries for Forecasting, Backtesting, and Registry

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add PHASE documentation for forecasting, backtesting, and registry

Create missing phase documentation files to complete the project's implementation records:
- 4-FORECASTING.md: Model zoo with BaseForecaster interface, train/predict endpoints, and joblib persistence
- 5-BACKTESTING.md: Time-series CV with expanding/sliding strategies, metrics calculation, and baseline comparisons
- 6-MODEL_REGISTRY.md: Run tracking with state machine, deployment aliases, and SHA-256 artifact integrity verification

Update PHASE-index.md to link to the new documentation files.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(registry): resolve type checking issues with Pydantic model_config alias

- Add pydantic.mypy plugin to pyproject.toml for proper Pydantic type checking
- Use model_config_data instead of model_config alias in tests to avoid collision with Pydantic's reserved model_config attribute
- Update _model_to_response to use model_validate() for proper alias handling
- Change docker-compose postgres port to 5433 to avoid conflicts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: resolve CI failures for registry PR

- Import registry models in alembic/env.py for schema validation
- Fix import order and remove extraneous f-strings in registry_demo.py
- Add type: ignore comments for frozen model tests with pydantic.mypy plugin

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: prevent db_session fixtures from dropping all tables

The data_platform and root conftest.py db_session fixtures were dropping all tables after each test, causing subsequent integration tests to fail when they couldn't find migrated tables.

Changes:
- Remove Base.metadata.drop_all from db_session fixtures
- Tests now rely on migrations for table creation
- Each test just rolls back its own changes

Also fixes ruff format issue in examples/registry_demo.py.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: add proper test data cleanup to db_session fixtures

Update data_platform and ingest test fixtures to clean up test data explicitly instead of dropping all tables or just rolling back.
- data_platform: delete test stores, products, calendar entries
- ingest: delete test stores, products, sales, calendar entries

This ensures test isolation while preserving migrated tables.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use separate session for test cleanup to avoid transaction issues

When tests cause integrity errors, the session enters a failed state. Use a fresh session for cleanup to avoid PendingRollbackError.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: use contextlib.suppress instead of try-except-pass

Replace try-except-pass patterns with contextlib.suppress to satisfy ruff S110 linting rule.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* fix: code improvements and documentation fixes

- Add date range filter to SalesDaily cleanup in ingest tests
- Enforce artifact_hash presence before verification in registry routes
- Compute SHA256 from saved file instead of source in storage
- Fix override_get_db to mirror production transaction semantics
- Filter DeploymentAlias cleanup to only test runs
- Update database port to 5433 in config and .env.example
- Add language identifiers to fenced code blocks (MD040)
- Fix table formatting for markdownlint MD060
- Update PR reference in PHASE/6-MODEL_REGISTRY.md
- Convert bare URLs to markdown links in INITIAL-7.md
- Wrap __init__.py in backticks in PRP-7

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Gabe@w7dev <gabor@w7-7.net>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 9a8e0ac commit 2ebb76f
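
The run lifecycle named in the commit message (PENDING → RUNNING → SUCCESS/FAILED → ARCHIVED) can be pictured as a small transition table. The following is a minimal sketch, not the code in `RegistryService`: the names `RunStatus`, `ALLOWED_TRANSITIONS`, `validate_transition`, and `can_alias` are hypothetical, and it assumes that only the arrows written in the commit message are legal. Whether the real service permits other transitions is not visible in this commit view.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Run states, lowercase as in the migration's status check constraint."""

    PENDING = "pending"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"
    ARCHIVED = "archived"


# Hypothetical transition table: assumes only the arrows in
# "PENDING -> RUNNING -> SUCCESS/FAILED -> ARCHIVED" are allowed.
ALLOWED_TRANSITIONS: dict[RunStatus, frozenset[RunStatus]] = {
    RunStatus.PENDING: frozenset({RunStatus.RUNNING}),
    RunStatus.RUNNING: frozenset({RunStatus.SUCCESS, RunStatus.FAILED}),
    RunStatus.SUCCESS: frozenset({RunStatus.ARCHIVED}),
    RunStatus.FAILED: frozenset({RunStatus.ARCHIVED}),
    RunStatus.ARCHIVED: frozenset(),  # terminal state in this sketch
}


def validate_transition(current: RunStatus, new: RunStatus) -> None:
    """Raise ValueError if moving from `current` to `new` is not in the table."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"invalid transition: {current.value} -> {new.value}")


def can_alias(status: RunStatus) -> bool:
    """Aliases may only point to runs in the SUCCESS state."""
    return status is RunStatus.SUCCESS
```

Keeping the rules in one table means the alias guard and the status update path can share a single source of truth.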

35 files changed

Lines changed: 6740 additions & 69 deletions

.env.example

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 # Copy this file to .env and adjust values as needed

 # Database connection (PostgreSQL + pgvector via Docker Compose)
-DATABASE_URL=postgresql+asyncpg://forecastlab:forecastlab@localhost:5432/forecastlab
+DATABASE_URL=postgresql+asyncpg://forecastlab:forecastlab@localhost:5433/forecastlab

 # Application settings
 APP_NAME=ForecastLabAI

INITIAL-7.md

Lines changed: 13 additions & 0 deletions
@@ -12,6 +12,17 @@
 - Artifact storage abstraction:
   - local filesystem by default (Settings-driven)
   - compatible with future S3-like storage backends
+- Lifecycle Management:
+  - State machine tracking: PENDING | RUNNING | SUCCESS | FAILED | ARCHIVED.
+  - Deployment Aliases: Mutable pointers (e.g., 'prod-v1') to specific successful runs.
+- Metadata & Lineage:
+  - JSONB storage for ModelConfig, FeatureConfig, and Performance Metrics.
+  - Runtime Snapshot: Recording Python/Library versions for environment parity.
+  - Agent Context: Integration of agent_id and session_id for autonomous run traceability.
+- Artifact Integrity:
+  - Checksum-based verification (SHA-256) for all serialized artifacts.
+- Storage Strategy:
+  - Pluggable storage providers (LocalFS, future S3/GCS) via Abstract Registry Interface.

 ## EXAMPLES:
 - `examples/registry/create_run.py` — create run record + persist configs.
@@ -21,6 +32,8 @@
 ## DOCUMENTATION:
 - Postgres JSONB patterns
 - Artifact integrity (hashing) best practices
+- [Using JSONB in PostgreSQL](https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/)
+- [Supply Chain Vulnerability](https://www.fortra.com/blog/supply-chain-vulnerability)

 ## OTHER CONSIDERATIONS:
 - No hardcoded artifact paths: derived from `ARTIFACT_ROOT` + run_id.
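
The INITIAL-7 notes above call for SHA-256 verification of serialized artifacts and for paths derived from `ARTIFACT_ROOT` + run_id, and the commit message mentions path traversal prevention. A minimal sketch of those three ideas using only the standard library follows. The function names (`artifact_path`, `sha256_file`, `verify_artifact`) are hypothetical, and the internals of the actual `LocalFSProvider` are not shown in this commit view.

```python
import hashlib
from pathlib import Path


def artifact_path(artifact_root: Path, run_id: str, filename: str) -> Path:
    """Derive an artifact path from the root and run_id, refusing to leave the root."""
    root = artifact_root.resolve()
    candidate = (root / run_id / filename).resolve()
    # After resolving ".." segments, the result must still sit under the root.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes artifact root: {candidate}")
    return candidate


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_hash: str) -> bool:
    """Compare the file's current digest with the hash recorded at save time."""
    return sha256_file(path) == expected_hash
```

Hashing the file as stored on disk, instead of the in-memory source, matches the fix listed in the commit message ("Compute SHA256 from saved file instead of source in storage").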

PRPs/PRP-7-model-registry.md

Lines changed: 1253 additions & 0 deletions
Large diffs are not rendered by default.

README.md

Lines changed: 44 additions & 2 deletions
@@ -118,7 +118,8 @@ app/
 │ ├── ingest/ # Batch upsert endpoints for sales data
 │ ├── featuresets/ # Time-safe feature engineering (lags, rolling, calendar)
 │ ├── forecasting/ # Model training, prediction, persistence
-│ └── backtesting/ # Time-series CV, metrics, baseline comparisons
+│ ├── backtesting/ # Time-series CV, metrics, baseline comparisons
+│ └── registry/ # Model run tracking, artifacts, deployment aliases
 └── main.py # FastAPI entry point

 tests/ # Test fixtures and helpers
@@ -129,7 +130,8 @@ examples/
 ├── queries/ # Example SQL queries
 ├── models/ # Baseline model examples (naive, seasonal_naive, moving_average)
 ├── backtest/ # Backtesting examples (run_backtest, inspect_splits, metrics_demo)
-└── compute_features_demo.py # Feature engineering demo
+├── compute_features_demo.py # Feature engineering demo
+└── registry_demo.py # Model registry workflow demo
 scripts/ # Utility scripts
 ```

@@ -301,6 +303,46 @@ When `include_baselines=true`, automatically compares against naive and seasonal

 See [examples/backtest/](examples/backtest/) for usage examples.

+### Model Registry
+
+- `POST /registry/runs` - Create a new model run
+- `GET /registry/runs` - List runs with filtering and pagination
+- `GET /registry/runs/{run_id}` - Get run details
+- `PATCH /registry/runs/{run_id}` - Update run (status, metrics, artifacts)
+- `GET /registry/runs/{run_id}/verify` - Verify artifact integrity
+- `POST /registry/aliases` - Create or update deployment alias
+- `GET /registry/aliases` - List all aliases
+- `GET /registry/aliases/{alias_name}` - Get alias details
+- `DELETE /registry/aliases/{alias_name}` - Delete an alias
+- `GET /registry/compare/{run_id_a}/{run_id_b}` - Compare two runs
+
+**Example Create Run Request:**
+```bash
+curl -X POST http://localhost:8123/registry/runs \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model_type": "seasonal_naive",
+    "model_config": {"season_length": 7},
+    "data_window_start": "2024-01-01",
+    "data_window_end": "2024-03-31",
+    "store_id": 1,
+    "product_id": 1
+  }'
+```
+
+**Run Lifecycle:**
+- `pending` → `running` → `success` | `failed` → `archived`
+- Aliases can only point to runs with `success` status
+
+**Features:**
+- JSONB storage for model_config, metrics, runtime_info
+- SHA-256 artifact integrity verification
+- Duplicate detection (configurable: allow/deny/detect)
+- Runtime environment capture (Python, numpy, pandas versions)
+- Agent context tracking for autonomous workflows
+
+See [examples/registry_demo.py](examples/registry_demo.py) for a complete workflow demo.
+
 ## API Documentation

 Once the server is running:

alembic/env.py

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@

 # Import all models for Alembic autogenerate detection
 from app.features.data_platform import models as data_platform_models  # noqa: F401
+from app.features.registry import models as registry_models  # noqa: F401

 # Alembic Config object
 config = context.config
Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
+"""create_model_registry_tables
+
+Revision ID: a2f7b3c8d901
+Revises: e1165ebcef61
+Create Date: 2026-02-01 10:00:00.000000
+
+"""
+
+from typing import Sequence, Union
+
+from alembic import op
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+# revision identifiers, used by Alembic.
+revision: str = "a2f7b3c8d901"
+down_revision: Union[str, None] = "e1165ebcef61"
+branch_labels: Union[str, Sequence[str], None] = None
+depends_on: Union[str, Sequence[str], None] = None
+
+
+def upgrade() -> None:
+    """Apply migration - create model_run and deployment_alias tables."""
+    # Create model_run table
+    op.create_table(
+        "model_run",
+        sa.Column("id", sa.Integer(), nullable=False),
+        sa.Column("run_id", sa.String(length=32), nullable=False),
+        sa.Column("status", sa.String(length=20), nullable=False, server_default="pending"),
+        # Model configuration
+        sa.Column("model_type", sa.String(length=50), nullable=False),
+        sa.Column("model_config", postgresql.JSONB(astext_type=sa.Text()), nullable=False),
+        sa.Column("feature_config", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column("config_hash", sa.String(length=16), nullable=False),
+        # Data window
+        sa.Column("data_window_start", sa.Date(), nullable=False),
+        sa.Column("data_window_end", sa.Date(), nullable=False),
+        sa.Column("store_id", sa.Integer(), nullable=False),
+        sa.Column("product_id", sa.Integer(), nullable=False),
+        # Metrics
+        sa.Column("metrics", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        # Artifact info
+        sa.Column("artifact_uri", sa.String(length=500), nullable=True),
+        sa.Column("artifact_hash", sa.String(length=64), nullable=True),
+        sa.Column("artifact_size_bytes", sa.Integer(), nullable=True),
+        # Environment & lineage
+        sa.Column("runtime_info", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column("agent_context", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+        sa.Column("git_sha", sa.String(length=40), nullable=True),
+        # Error tracking
+        sa.Column("error_message", sa.String(length=2000), nullable=True),
+        # Timing
+        sa.Column("started_at", sa.DateTime(timezone=True), nullable=True),
+        sa.Column("completed_at", sa.DateTime(timezone=True), nullable=True),
+        # Timestamps (from TimestampMixin)
+        sa.Column(
+            "created_at",
+            sa.DateTime(timezone=True),
+            server_default=sa.text("now()"),
+            nullable=False,
+        ),
+        sa.Column(
+            "updated_at",
+            sa.DateTime(timezone=True),
+            server_default=sa.text("now()"),
+            nullable=False,
+        ),
+        # Constraints
+        sa.PrimaryKeyConstraint("id"),
+        sa.CheckConstraint(
+            "status IN ('pending', 'running', 'success', 'failed', 'archived')",
+            name="ck_model_run_valid_status",
+        ),
+        sa.CheckConstraint(
+            "data_window_end >= data_window_start",
+            name="ck_model_run_valid_data_window",
+        ),
+    )
+
+    # Create indexes for model_run
+    op.create_index(op.f("ix_model_run_run_id"), "model_run", ["run_id"], unique=True)
+    op.create_index(op.f("ix_model_run_status"), "model_run", ["status"], unique=False)
+    op.create_index(op.f("ix_model_run_model_type"), "model_run", ["model_type"], unique=False)
+    op.create_index(op.f("ix_model_run_config_hash"), "model_run", ["config_hash"], unique=False)
+    op.create_index(op.f("ix_model_run_store_id"), "model_run", ["store_id"], unique=False)
+    op.create_index(op.f("ix_model_run_product_id"), "model_run", ["product_id"], unique=False)
+
+    # Composite indexes
+    op.create_index(
+        "ix_model_run_store_product", "model_run", ["store_id", "product_id"], unique=False
+    )
+    op.create_index(
+        "ix_model_run_data_window",
+        "model_run",
+        ["data_window_start", "data_window_end"],
+        unique=False,
+    )
+
+    # GIN indexes for JSONB containment queries
+    op.create_index(
+        "ix_model_run_model_config_gin",
+        "model_run",
+        ["model_config"],
+        unique=False,
+        postgresql_using="gin",
+    )
+    op.create_index(
+        "ix_model_run_metrics_gin",
+        "model_run",
+        ["metrics"],
+        unique=False,
+        postgresql_using="gin",
+    )
+
+    # Create deployment_alias table
+    op.create_table(
+        "deployment_alias",
+        sa.Column("id", sa.Integer(), nullable=False),
+        sa.Column("alias_name", sa.String(length=100), nullable=False),
+        sa.Column("run_id", sa.Integer(), nullable=False),
+        sa.Column("description", sa.String(length=500), nullable=True),
+        # Timestamps (from TimestampMixin)
+        sa.Column(
+            "created_at",
+            sa.DateTime(timezone=True),
+            server_default=sa.text("now()"),
+            nullable=False,
+        ),
+        sa.Column(
+            "updated_at",
+            sa.DateTime(timezone=True),
+            server_default=sa.text("now()"),
+            nullable=False,
+        ),
+        # Constraints
+        sa.PrimaryKeyConstraint("id"),
+        sa.ForeignKeyConstraint(["run_id"], ["model_run.id"]),
+        sa.UniqueConstraint("alias_name", name="uq_deployment_alias_name"),
+    )
+
+    # Create indexes for deployment_alias
+    op.create_index(
+        op.f("ix_deployment_alias_alias_name"),
+        "deployment_alias",
+        ["alias_name"],
+        unique=True,
+    )
+    op.create_index(
+        op.f("ix_deployment_alias_run_id"), "deployment_alias", ["run_id"], unique=False
+    )
+
+
+def downgrade() -> None:
+    """Revert migration - drop model_run and deployment_alias tables."""
+    # Drop deployment_alias table and indexes
+    op.drop_index(op.f("ix_deployment_alias_run_id"), table_name="deployment_alias")
+    op.drop_index(op.f("ix_deployment_alias_alias_name"), table_name="deployment_alias")
+    op.drop_table("deployment_alias")
+
+    # Drop model_run indexes
+    op.drop_index("ix_model_run_metrics_gin", table_name="model_run")
+    op.drop_index("ix_model_run_model_config_gin", table_name="model_run")
+    op.drop_index("ix_model_run_data_window", table_name="model_run")
+    op.drop_index("ix_model_run_store_product", table_name="model_run")
+    op.drop_index(op.f("ix_model_run_product_id"), table_name="model_run")
+    op.drop_index(op.f("ix_model_run_store_id"), table_name="model_run")
+    op.drop_index(op.f("ix_model_run_config_hash"), table_name="model_run")
+    op.drop_index(op.f("ix_model_run_model_type"), table_name="model_run")
+    op.drop_index(op.f("ix_model_run_status"), table_name="model_run")
+    op.drop_index(op.f("ix_model_run_run_id"), table_name="model_run")
+
+    # Drop model_run table
+    op.drop_table("model_run")
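
The migration stores a 16-character `config_hash` and indexes it, and the commit also adds a `registry_duplicate_policy` setting with the values allow, deny, and detect. How the service computes the hash and applies the policy is not visible in this commit view, so the following is only a sketch under stated assumptions: the hash is taken to be a truncated SHA-256 of canonical JSON, "deny" is taken to reject a matching run, and "detect" is taken to report the match while still allowing creation. The names `compute_config_hash`, `check_duplicate`, and `DuplicateRunError` are hypothetical.

```python
import hashlib
import json
from typing import Any, Literal, Optional

DuplicatePolicy = Literal["allow", "deny", "detect"]


class DuplicateRunError(Exception):
    """Raised under the "deny" policy when a matching run already exists."""


def compute_config_hash(model_type: str, model_config: dict[str, Any]) -> str:
    """Hash a canonical JSON form of the config and keep 16 hex characters."""
    payload = json.dumps(
        {"model_type": model_type, "model_config": model_config},
        sort_keys=True,  # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def check_duplicate(
    policy: DuplicatePolicy,
    config_hash: str,
    existing: dict[str, str],
) -> Optional[str]:
    """Return the run_id of a matching run to report, or None.

    `existing` maps config_hash -> run_id and stands in for a database lookup.
    """
    if policy == "allow":
        return None  # assumed: no lookup at all
    match = existing.get(config_hash)
    if match is not None and policy == "deny":
        raise DuplicateRunError(f"run {match} already uses config {config_hash}")
    return match  # "detect": report the match (or None) without blocking
```

Sorting keys before hashing makes two configs with the same content but different key order map to the same hash, which is what an equality lookup on the indexed `config_hash` column would rely on.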

app/core/config.py

Lines changed: 5 additions & 1 deletion
@@ -21,7 +21,7 @@ class Settings(BaseSettings):
     debug: bool = False

     # Database
-    database_url: str = "postgresql+asyncpg://forecastlab:forecastlab@localhost:5432/forecastlab"
+    database_url: str = "postgresql+asyncpg://forecastlab:forecastlab@localhost:5433/forecastlab"

     # Logging
     log_level: Literal["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"] = "INFO"
@@ -53,6 +53,10 @@ class Settings(BaseSettings):
     backtest_max_gap: int = 30
     backtest_results_dir: str = "./artifacts/backtests"

+    # Registry
+    registry_artifact_root: str = "./artifacts/registry"
+    registry_duplicate_policy: Literal["allow", "deny", "detect"] = "detect"
+
     @property
     def is_development(self) -> bool:
         """Check if running in development mode."""

app/features/backtesting/tests/test_schemas.py

Lines changed: 2 additions & 2 deletions
@@ -93,7 +93,7 @@ def test_frozen_config(self):
         """Test SplitConfig is immutable."""
         config = SplitConfig()
         with pytest.raises(ValidationError):
-            config.n_splits = 10
+            config.n_splits = 10  # type: ignore[misc]


 class TestBacktestConfig:
@@ -136,7 +136,7 @@ def test_frozen_config(self):
         """Test BacktestConfig is immutable."""
         config = BacktestConfig(model_config_main=NaiveModelConfig())
         with pytest.raises(ValidationError):
-            config.include_baselines = False
+            config.include_baselines = False  # type: ignore[misc]

     def test_invalid_schema_version(self):
         """Test invalid schema_version raises error."""
