feat(forecast): add feature frame v2 #300
Lands V2 feature-frame contract as additive, opt-in surface alongside frozen V1. Training + scenarios + shared builders complete; backtesting V2 dispatch deferred to follow-up tracked in #299. V1 callers unchanged.
- Shared layer: V2 manifest (38 default / 53 max columns), sidecars, row builders
- Training: TrainRequest gains feature_frame_version + feature_groups (opt-in)
- Scenarios: build_future_frame dispatches V1/V2 via bundle metadata
- 3 LOAD-BEARING leakage specs land alongside the V1 spec
- No Alembic migration (V2 reads existing tables, writes nothing)
- V1 bundles load/predict/scenario-simulate/backtest unchanged
Tracking issue: #299 (under Forecast Intelligence roadmap epic #295).
PRP: PRPs/PRP-35-forecast-intelligence-A-feature-frame-v2.md.
Summary
Lands the V2 feature-frame contract as an additive, opt-in surface
alongside the frozen V1 contract:
- FeatureGroups), V2HistoricalSidecar / V2FutureSidecar data carriers, and build_historical_feature_rows_v2 / build_future_feature_rows_v2 pure row builders. app/shared/feature_frames/ stays leaf-level.
- POST /forecasting/train accepts optional feature_frame_version: int = 1 and feature_groups: list[str] | None = None. V2 bundles persist feature_frame_version, feature_columns, feature_groups, feature_safety_classes, and feature_pinned_constants in bundle metadata.
- POST /scenarios/simulate reads feature_frame_version from the loaded bundle metadata and dispatches V1 vs V2 future-frame assembly transparently.
- 3 LOAD-BEARING leakage specs land alongside the V1 spec; never to be weakened:
  - app/shared/feature_frames/tests/test_leakage_v2.py
  - app/features/forecasting/tests/test_regression_features_v2_leakage.py
  - app/features/scenarios/tests/test_future_frame_v2_leakage.py
V1 compatibility (back-compat invariant)
- The V1 leakage spec (app/shared/feature_frames/tests/test_leakage.py) and 22 sibling V1 contract tests remain green without modification.
- V1 bundles load / predict / scenario-simulate / backtest unchanged.
- feature_frame_version=1 is the default everywhere; legacy bundles that predate the metadata field are treated as V1 via bundle.metadata.get("feature_frame_version", 1).
- feature_frame_version lives on TrainRequest, not on ModelConfigBase: adding it to the config would mutate every existing V1 config_hash() and orphan registry rows / aliases. It is persisted to bundle metadata instead.
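The two back-compat rules above can be illustrated with a minimal sketch. The hashing scheme (SHA-256 over canonical JSON), the config field names, and the helpers config_hash_sketch and resolve_feature_frame_version are assumptions made for illustration, not the project's actual code; only the metadata.get("feature_frame_version", 1) fallback is taken from this PR.

```python
import hashlib
import json
from typing import Any, Mapping


def config_hash_sketch(config: Mapping[str, Any]) -> str:
    """Hypothetical stand-in for config_hash(): SHA-256 over canonical JSON."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def resolve_feature_frame_version(metadata: Mapping[str, Any]) -> int:
    """Bundles that predate the metadata field are treated as V1."""
    return int(metadata.get("feature_frame_version", 1))


# Hypothetical V1 model config; the field names are illustrative only.
v1_config = {"model_type": "regression", "horizon": 14}
v1_with_version = {**v1_config, "feature_frame_version": 1}

# Adding a field changes the hash even when the field holds its default,
# which is why the version is kept out of the hashed config.
hash_changed = config_hash_sketch(v1_config) != config_hash_sketch(v1_with_version)

legacy_version = resolve_feature_frame_version({})  # no field: falls back to V1
v2_version = resolve_feature_frame_version({"feature_frame_version": 2})
```

Under these assumptions, any hash derived from the serialized config shifts as soon as a new key appears, so keeping the version in bundle metadata leaves existing V1 hashes untouched.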
V2 opt-in behaviour
- TrainRequest with feature_frame_version=2 (optionally feature_groups=[…]) triggers the V2 path; otherwise V1 runs unchanged.
- feature_groups supplied → 422.
- FeatureGroup name → 422.
- TARGET_HISTORY, CALENDAR, ROLLING, TREND, PRICE_PROMO, LIFECYCLE (38 columns). Phase-2 sidecar groups (INVENTORY, REPLENISHMENT, RETURNS, EXOGENOUS_WEATHER, EXOGENOUS_MACRO) are off by default so the MVP stays green on smaller seeded DBs (max 53 columns when all enabled).
- EXOGENOUS_LAGS_V2=(1,7,14,28,56,364), ROLLING_WINDOWS_V2=(7,28,90), TREND_WINDOWS_V2=(30,90), HISTORY_TAIL_DAYS_V2=400.
Validation
All four mandatory gates green locally on Python 3.12:
40 V2 leakage tests across 3 LOAD-BEARING files all green; 23 V1 contract / leakage tests byte-stable.
The 3 mypy + 8 pyright pre-existing errors stem from optional lightgbm / xgboost extras and are unrelated to PRP-35; CI runs --all-extras and won't see them.
No Alembic migration
V2 reads only existing tables (inventory_snapshot_daily, replenishment_event, sales_returns, exogenous_signal, promotion, product) and writes nothing to the DB. alembic heads unchanged at c1d2e3f40512.
Deferred: V2 backtesting dispatch (tracked in #299)
PRP-35 lands V2 training + scenarios + shared builders. Backtesting V2
dispatch is deferred and explicitly tracked in the
"Deferred follow-up: V2 backtesting dispatch" section of #299.
PRP-35 Task 13 reads "READ feature_frame_version from the fitted bundle BEFORE the fold loop", but app/features/backtesting/service.py:_run_model_backtest trains fresh per fold from BacktestConfig.model_config_main and does not load a fitted bundle. The correct opt-in surface is a request-time field on BacktestConfig itself, a re-design Task 13 did not spec.
This PR does NOT claim completion of PRP-35 Tasks 13 or 18. V1 backtesting is unchanged; a V2-trained bundle still trains and scenario-simulates correctly. Only /backtesting/run remains V1-only until the follow-up under #299 lands. Integration tests (PRP-35 Tasks 15 + 16) and the PHASE/3 + PHASE/4 doc edits (Task 21) are also deferred there.
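To make the deferred design point concrete, the following sketch shows one possible shape for a request-time opt-in when each fold trains fresh and there is no fitted bundle to read. Everything here is hypothetical (BacktestConfigSketch, the builder stubs, run_backtest_sketch); it is not the follow-up's design and not the current backtesting service.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


def build_rows_v1(fold: int) -> str:
    """Stub for V1 feature-row assembly (hypothetical)."""
    return f"v1-rows-fold-{fold}"


def build_rows_v2(fold: int) -> str:
    """Stub for V2 feature-row assembly (hypothetical)."""
    return f"v2-rows-fold-{fold}"


BUILDERS: dict[int, Callable[[int], str]] = {1: build_rows_v1, 2: build_rows_v2}


@dataclass(frozen=True)
class BacktestConfigSketch:
    n_folds: int = 3
    # Request-time opt-in; the default keeps existing callers on V1.
    feature_frame_version: int = 1


def run_backtest_sketch(config: BacktestConfigSketch) -> list[str]:
    # Resolve the builder once, before the fold loop, from the request
    # itself: with fresh per-fold training there is no bundle metadata.
    try:
        builder = BUILDERS[config.feature_frame_version]
    except KeyError:
        raise ValueError(
            f"unsupported feature_frame_version: {config.feature_frame_version}"
        ) from None
    return [builder(fold) for fold in range(config.n_folds)]


v1_folds = run_backtest_sketch(BacktestConfigSketch())
v2_folds = run_backtest_sketch(BacktestConfigSketch(n_folds=2, feature_frame_version=2))
```

Whether such a field would also need to stay out of any hashed config, as with TrainRequest, is a question for the follow-up under #299.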
qwen3 stash status
The session's stash@{0} ("local qwen3 rag demo changes before prp-35", app/features/rag/models.py +7/-2) is not applied, not popped, not dropped. The decision on it (write a real INITIAL-rag-embedding-provider-pluggability.md doc vs. add to .git/info/exclude) is carryover work, untouched by this PR.
Files changed
Test plan
- /forecasting/train accepts feature_frame_version=2 with default groups.
- /forecasting/train accepts feature_frame_version=2 with opt-in Phase-2 group (e.g. INVENTORY) on a seeded DB carrying inventory rows.
- /scenarios/simulate against a V2-trained bundle produces a model_exogenous re-forecast (V2 future-frame assembly via bundle metadata).
- scenario-simulates unchanged.
- /backtesting/run against a V2-trained bundle remains V1-only (no V2 dispatch on the fold loop); documented deferral above.