From e6493128b1350c688fbdc10871bddfa73701afb1 Mon Sep 17 00:00:00 2001 From: Tom Medhurst Date: Mon, 6 Apr 2026 11:04:56 +0100 Subject: [PATCH] Add spec 2 for later implementation --- .claude/specs/spec-02.md | 286 +++++++++++++++++++++++++ .vscode/settings.json | 29 +++ CLAUDE.md | 122 ++++++++--- README.md | 439 +++++++++++++++++++-------------------- 4 files changed, 619 insertions(+), 257 deletions(-) create mode 100644 .claude/specs/spec-02.md create mode 100644 .vscode/settings.json diff --git a/.claude/specs/spec-02.md b/.claude/specs/spec-02.md new file mode 100644 index 0000000..db7786b --- /dev/null +++ b/.claude/specs/spec-02.md @@ -0,0 +1,286 @@ +# Task 02: Signals, CSV Signal Loader, and CLI + +## Goal + +Add the signal system and a basic CLI so a customer can bring their own +forecast data (as CSV), reference it from their algo, and run a backtest +from the command line rather than writing a Python script. + +This builds on task-01. All existing types, the DA matching engine, and +the PnL/VWAP analysis remain unchanged. + +--- + +## What to build + +### 1. `signals/base.py` + +The `SignalProvider` protocol and supporting types: + +```python +class SignalSchema: + """Describes what a signal provides.""" + name: str + dtype: type # float, int, str + frequency: timedelta # How often it updates + description: str + unit: str # "EUR/MWh", "MW", "m/s", "celsius", etc. + +class SignalValue: + """A single signal observation.""" + timestamp: datetime # When this value is valid for + value: Any # The actual value + +class SignalProvider(Protocol): + @property + def name(self) -> str: ... + + @property + def schema(self) -> SignalSchema: ... + + def get_value(self, timestamp: datetime) -> SignalValue: ... + def get_range(self, start: datetime, end: datetime) -> pd.Series: ... +``` + +Use Pydantic v2 for `SignalSchema` and `SignalValue`. + +### 2. `signals/registry.py` + +A `SignalRegistry` that holds registered signal providers and provides +lookup by name. 
Nothing fancy - a dict with validation that the signal +exists when the algo asks for it. + +```python +class SignalRegistry: + def register(self, provider: SignalProvider) -> None: ... + def get(self, name: str) -> SignalProvider: ... + def has(self, name: str) -> bool: ... + def list_signals(self) -> list[str]: ... +``` + +### 3. `signals/csv_loader.py` + +A `CsvSignalProvider` that loads a CSV file and serves it as a signal. +This is the simplest way for a customer to bring their own data without +implementing `SignalProvider` from scratch. + +Expected CSV format: +```csv +timestamp,value +2026-03-15T00:00:00+01:00,42.31 +2026-03-15T00:15:00+01:00,41.87 +2026-03-15T00:30:00+01:00,43.05 +``` + +Columns: `timestamp` (required, parsed as timezone-aware datetime) and +`value` (required, parsed as float). Additional columns are ignored. + +The provider must support an optional `publication_offset` parameter to +prevent look-ahead bias. The offset represents how far ahead of the +delivery period the forecast is published. A value with timestamp T +(the period it describes) is visible to the algo at time +T - publication_offset (when it was published). + +For example, a wind forecast published 6 hours ahead would use +`publication_offset=timedelta(hours=6)`. The forecast for the 06:00 +delivery period was published at 00:00, so it becomes visible when the +simulated clock reaches 00:00. In code: `get_value(current_time)` returns +the latest value where `timestamp <= current_time + publication_offset`. + +If `publication_offset` is not set, values are available at their timestamp +(i.e., the value for 06:00 is available at 06:00). This is correct for +actuals/historical data but would be look-ahead bias for forecasts. Log +a warning when no offset is set, suggesting the user consider whether +their data represents forecasts or actuals. 
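The visibility rule above can be sketched standalone with stdlib `bisect` (an illustrative helper, not the actual `CsvSignalProvider` implementation; names here are assumptions):

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

def latest_visible(timestamps, values, current_time,
                   publication_offset=timedelta(0)):
    """Latest value whose timestamp <= current_time + publication_offset, else None."""
    # A value for delivery period T is published at T - publication_offset,
    # so at current_time everything up to current_time + offset is visible.
    i = bisect_right(timestamps, current_time + publication_offset)
    return values[i - 1] if i > 0 else None

tz = timezone(timedelta(hours=1))  # CET, as in the CSV example above
timestamps = [datetime(2026, 3, 15, 6, 0, tzinfo=tz),
              datetime(2026, 3, 15, 7, 0, tzinfo=tz)]
values = [42.31, 41.87]

# Forecast published 6 hours ahead: the 06:00 value is visible from 00:00.
latest_visible(timestamps, values,
               datetime(2026, 3, 15, 0, 0, tzinfo=tz),
               timedelta(hours=6))   # -> 42.31

# No offset: nothing is visible before the first timestamp.
latest_visible(timestamps, values,
               datetime(2026, 3, 15, 0, 0, tzinfo=tz))  # -> None
```

The same binary search serves `get_value`; `get_range` is the analogous slice of all visible entries.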
+ +```python +signal = CsvSignalProvider( + name="my_wind_forecast", + path="data/wind_forecast_NO1.csv", + unit="MW", + description="Wind generation forecast for NO1", + publication_offset=timedelta(hours=6), +) +``` + +### 4. Update `context.py` + +Add the signal methods to `TradingContext` (they exist as stubs from +task-01, now they need real signatures): + +```python +def get_signal(self, name: str) -> SignalValue: ... +def get_signal_history(self, name: str, lookback: int) -> list[SignalValue]: ... +``` + +### 5. Update `engines/backtest.py` + +Wire signals into the `BacktestEngine`: + +- Accept a `signals` parameter (list of `SignalProvider` instances) +- Register them in a `SignalRegistry` +- When the algo calls `ctx.get_signal(name)`, look up the provider and + return the value for the current simulated time, respecting + `publication_offset` +- When the algo calls `ctx.get_signal_history(name, lookback=N)`, return + the last N values up to and including the current time + +The backtest context implementation needs to enforce the look-ahead bias +rule: at simulated time T, `get_signal` must not return any value whose +publication time is after T. + +### 6. Update `algo.py` + +Add `subscribe_signal(name)` to `SimpleAlgo` and the `on_signal` hook: + +```python +class SimpleAlgo: + def subscribe_signal(self, name: str) -> None: + """Register interest in a signal. Called during on_setup.""" + + def on_signal(self, ctx: TradingContext, name: str, value: SignalValue) -> None: + """Called when a subscribed signal updates. Override to react.""" +``` + +For DA backtesting, `on_signal` is called at the start of each auction +period with the latest signal values. The algo can also pull signals +directly via `ctx.get_signal()` from any hook. + +### 7. 
`cli/main.py` + +A minimal CLI using `click`: + +```bash +# Run a backtest +nexa run examples/simple_da_algo.py \ + --exchange nordpool \ + --start 2026-03-01 \ + --end 2026-03-31 \ + --products NO1_DA \ + --data-dir ./data \ + --capital 100000 + +# Output: the BacktestResult.summary() text +``` + +The CLI needs to: +- Accept a path to a Python file containing an algo +- Import the file and find the algo (look for a subclass of `SimpleAlgo` + or a function decorated with `@algo`) +- Instantiate the `BacktestEngine` with the provided arguments +- Call `.run()` and print `result.summary()` + +Use `importlib` to dynamically load the algo module. If the module contains +exactly one `SimpleAlgo` subclass, use it. If it contains multiple, raise +an error asking the user to specify which one. + +Register the CLI entry point in `pyproject.toml`: + +```toml +[project.scripts] +nexa = "nexa_backtest.cli.main:cli" +``` + +### 8. Update the example algo + +Update `examples/simple_da_algo.py` to use a signal: + +```python +class ForecastAlgo(SimpleAlgo): + """Buy when DA price is below a provided price forecast.""" + + def on_setup(self, ctx: TradingContext) -> None: + self.subscribe_signal("price_forecast") + self.threshold = 5.0 + + def on_auction_open(self, ctx: TradingContext, auction: AuctionInfo) -> None: + forecast = ctx.get_signal("price_forecast").value + if forecast is not None: + ctx.place_order(Order.buy( + product=auction.product_id, + volume_mw=10, + price_eur=Decimal(str(forecast)) - Decimal(str(self.threshold)), + )) +``` + +Create a corresponding `examples/data/price_forecast_NO1.csv` with +synthetic forecast values (slightly noisy version of the actual clearing +prices from the test fixture, offset by the publication delay). 
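One way to generate that fixture is a short script like the following. This is a hedged sketch: the placeholder price curve, noise parameters, and output path are assumptions, not the actual test-fixture values.

```python
import csv
import random
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=1))
start = datetime(2026, 3, 1, 0, 0, tzinfo=tz)
# Placeholder curve standing in for the fixture's actual clearing prices.
clearing_prices = [40.0 + (h % 24) * 0.5 for h in range(31 * 24)]

rng = random.Random(42)  # seeded so the fixture is reproducible
with open("price_forecast_NO1.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "value"])
    for h, price in enumerate(clearing_prices):
        ts = start + timedelta(hours=h)
        forecast = round(price + rng.gauss(0.0, 1.5), 2)  # actual plus noise
        writer.writerow([ts.isoformat(), forecast])
```

Seeding the noise keeps the example backtest's output stable across runs, which makes the PnL summary in the docs reproducible.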
+ +The example should be runnable via: +```bash +nexa run examples/simple_da_algo.py \ + --exchange nordpool \ + --start 2026-03-01 \ + --end 2026-03-31 \ + --products NO1_DA \ + --data-dir tests/fixtures \ + --capital 100000 +``` + +--- + +## How signals are passed to the CLI + +For this task, signal CSV files are discovered by convention. The engine +looks in `{data_dir}/signals/` for CSV files matching the signal name: + +``` +data_dir/ + signals/ + price_forecast.csv + wind_forecast.csv +``` + +If the algo subscribes to a signal called "price_forecast", the engine +looks for `{data_dir}/signals/price_forecast.csv`. If the file doesn't +exist, raise a `DataError` with a clear message. + +This is deliberately simple. A more flexible signal configuration (YAML +config file, CLI flags per signal, explicit paths) is a later concern. + +--- + +## Tests + +1. **signals/base.py**: SignalSchema and SignalValue construction +2. **signals/csv_loader.py**: + - Load a valid CSV, retrieve values at known timestamps + - publication_offset prevents future values from being visible + - Missing file raises DataError + - Malformed CSV (missing columns, bad timestamps) raises DataError +3. **signals/registry.py**: register, get, has, get missing raises error +4. **backtest.py integration**: algo that uses a signal to make trading + decisions. Verify that the signal value influences the fills (e.g., + algo only buys when forecast is below threshold, verify it doesn't + buy when forecast is above threshold) +5. **cli/main.py**: test that the CLI loads an algo module, finds the + SimpleAlgo subclass, and runs without error (use click's CliRunner) +6. **look-ahead bias**: test that a signal with publication_offset does + NOT return a value before its publication time + +--- + +## What NOT to build + +- Built-in signal providers (DayAheadPriceSignal, WindForecastSignal, etc.) 
+- Signal providers that fetch from APIs +- YAML/JSON signal configuration +- `nexa validate` CLI command +- `nexa compile` CLI command +- `nexa report` CLI command +- Any IDC or windowed replay changes +- HTML report generation + +--- + +## Acceptance criteria + +1. `make ci` passes +2. A customer can load a CSV file as a signal and use it in their algo +3. publication_offset correctly prevents look-ahead bias +4. `nexa run` CLI works end-to-end with the example algo +5. The example algo uses a signal to make trading decisions and produces + a PnL summary +6. All new types have type hints, frozen Pydantic models where appropriate +7. All new public API has Google-style docstrings diff --git a/.vscode/settings.json b/.vscode/settings.json new file mode 100644 index 0000000..02c727a --- /dev/null +++ b/.vscode/settings.json @@ -0,0 +1,29 @@ +{ + "cSpell.words": [ + "Backtest", + "Backtrader", + "bidkit", + "cython", + "docstrings", + "EPEX", + "intraday", + "marketdata", + "matplotlib", + "Mypy", + "nexa", + "Nord", + "nordpool", + "nuitka", + "numpy", + "ONNX", + "plotly", + "pyarrow", + "pydantic", + "pytest", + "quants", + "Scikit", + "sklearn", + "VWAP", + "Zipline" + ] +} \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index b7140fe..38e22bb 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -10,6 +10,9 @@ with 15-minute MTU resolution natively. Part of the Phase Nexa ecosystem. +See `docs/DESIGN.md` for the full architectural rationale, data volume +estimates, hosting cost analysis, and matching engine design. + ## Audience Quants, data scientists, and developers at energy trading companies who build @@ -58,6 +61,10 @@ the design is broken. - Gate closure = deadline for submitting/modifying orders for a delivery period. Different per exchange and product type. The algo receives a `GateClosureWarning` event before this happens. +- NOP = Net Open Position. The net MW across all orders for a given delivery + period. 
Tracked per product via `Position.net_mw`. Portfolio-level NOP + aggregation across products for the same delivery period is a stage 2 + concern. ## Data loading strategy @@ -93,10 +100,26 @@ time, not at runtime. The `nexa validate` CLI catches these before execution. Signals are any time-series data the algo consumes: weather forecasts, DA prices, load forecasts, gas prices, etc. Each signal has a `publication_offset` -that prevents look-ahead bias. At simulated time T, the signal returns the -value that was known at T, not the future value. +that prevents look-ahead bias. + +`publication_offset` is a positive `timedelta` representing how far ahead of +the delivery period the forecast was published. A value describing delivery +period T was published at `T - publication_offset`, so it becomes visible +when the simulated clock reaches that publication time. + +Example: a wind forecast with `publication_offset=timedelta(hours=6)` means +the forecast for the 08:00 delivery period was published at 02:00. At +simulated time 01:59, this value is not yet visible. At 02:00, it is. + +In code: `get_value(current_time)` returns the latest value where +`timestamp <= current_time + publication_offset`. -Custom signals implement the `SignalProvider` protocol. +If no `publication_offset` is set, values are available at their timestamp. +This is correct for actuals/historical data but would be look-ahead bias +for forecasts. + +Custom signals implement the `SignalProvider` protocol. The simplest path +is `CsvSignalProvider` which loads a CSV file with `timestamp,value` columns. ## ML model support @@ -121,7 +144,7 @@ codes for CI integration. 
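The publication_offset semantics in the signal section above can be sanity-checked standalone (`visible` is an illustrative helper, not part of the API):

```python
from datetime import datetime, timedelta

publication_offset = timedelta(hours=6)
delivery_period = datetime(2026, 3, 15, 8, 0)   # the period the forecast describes
published_at = delivery_period - publication_offset

assert published_at == datetime(2026, 3, 15, 2, 0)

def visible(current_time: datetime) -> bool:
    # Two equivalent formulations of the same rule:
    #   visible  <=>  current_time >= timestamp - publication_offset
    #   visible  <=>  timestamp <= current_time + publication_offset
    return delivery_period <= current_time + publication_offset

assert not visible(datetime(2026, 3, 15, 1, 59))  # 01:59: not yet published
assert visible(datetime(2026, 3, 15, 2, 0))       # 02:00: published, visible
```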
## Code layout -``` +```text src/nexa_backtest/ __init__.py _version.py @@ -146,8 +169,9 @@ src/nexa_backtest/ eex.py # EEX adapter signals/ - base.py # SignalProvider protocol, SignalSchema + base.py # SignalProvider protocol, SignalSchema, SignalValue registry.py # Signal registration and lookup + csv_loader.py # CsvSignalProvider (load CSV as a signal) builtins.py # DA price, wind, solar, load, imbalance, gas, carbon models/ @@ -182,7 +206,7 @@ src/nexa_backtest/ nuitka_compiler.py # Nuitka compilation for IP protection cli/ - main.py # CLI entry point (click or typer) + main.py # CLI entry point (click) validate.py # nexa validate run.py # nexa run compile.py # nexa compile @@ -191,41 +215,62 @@ src/nexa_backtest/ ## Implementation sequence -### Stage 1: DA Auction Backtester (minimum viable) +### Task 01: Core Types and DA Engine -The smallest thing that produces useful results. Scope: +The absolute minimum that produces a useful result. A customer writes a +SimpleAlgo, runs BacktestEngine.run() in a Python script, gets PnL with +VWAP comparison printed to stdout. 
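The VWAP comparison in that summary is a small calculation at heart. A standalone sketch (fill tuples here are an assumption for illustration, not the library's `Fill` type):

```python
# Volume-weighted average price: sum(price * volume) / sum(volume).
# For buys, edge vs VWAP = market VWAP - your VWAP (positive = bought cheaper).

def vwap(fills: list[tuple[float, float]]) -> float:
    """fills = [(price_eur_mwh, volume_mw), ...]"""
    total_volume = sum(v for _, v in fills)
    return sum(p * v for p, v in fills) / total_volume

our_buys = [(42.30, 120.0), (38.90, 95.0)]      # hypothetical fills
market = [(43.15, 1250.0), (39.20, 1180.0)]     # hypothetical market trades

edge = vwap(market) - vwap(our_buys)  # positive: beat the market VWAP on buys
```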
-- `types.py` with core domain types (Order, Fill, Position, MTU, PriceLevel) -- `exceptions.py` with error hierarchy -- `context.py` with TradingContext protocol -- `algo.py` with SimpleAlgo base class (on_setup, on_auction_open, - on_fill, on_gate_closure hooks only) +Scope: + +- `types.py`, `exceptions.py`, `context.py` (protocol with stubs for + unimplemented methods) +- `algo.py` with SimpleAlgo (on_setup, on_auction_open, on_fill, + on_teardown hooks only; other hooks exist as no-ops) - `engines/clock.py` with SimulatedClock - `engines/backtest.py` with BacktestEngine (DA only, full data load) - `engines/matching.py` with DA auction matching (price-taker) -- `exchanges/base.py` with ExchangeAdapter protocol -- `exchanges/capabilities.py` with ExchangeCapabilities -- `exchanges/nordpool.py` with Nord Pool DA adapter +- `exchanges/base.py`, `exchanges/capabilities.py`, `exchanges/nordpool.py` - `data/loader.py` with ParquetLoader (DA clearing prices only) - `data/schema.py` with DA clearing price schema -- `analysis/pnl.py` with basic PnL calculation -- `analysis/vwap.py` with VWAP benchmark comparison -- `analysis/metrics.py` with Sharpe, drawdown, win rate -- `cli/main.py` with `nexa run` command (text output) -- One signal: CSV-based custom signal loader (so users can bring - their own forecast data without implementing SignalProvider) +- `analysis/pnl.py`, `analysis/vwap.py`, `analysis/metrics.py` + (total PnL, vs VWAP, win rate, volume, trade count only) +- Synthetic test fixture generation +- Example algo (runnable Python script, no CLI) + +Does NOT include: CLI, signals, Sharpe ratio, drawdown, equity curve, +IDC, windowed replay, validation pipeline, ML models, code compilation, +HTML reports, paper/live engines, multi-algo replay. -Does NOT include: IDC, windowed replay, validation pipeline, ML models, -code compilation, HTML reports, paper/live engines, multi-algo replay. 
+### Task 02: Signals, CSV Signal Loader, and CLI + +Adds the signal system and CLI on top of task 01. + +Scope: + +- `signals/base.py` with SignalProvider protocol, SignalSchema, SignalValue +- `signals/registry.py` with SignalRegistry +- `signals/csv_loader.py` with CsvSignalProvider (CSV file as a signal, + with publication_offset for look-ahead bias prevention) +- Wire signals into BacktestEngine and TradingContext +- `subscribe_signal()` and `on_signal` hook on SimpleAlgo +- `cli/main.py` with `nexa run` command +- Signal CSV discovery by convention: `{data_dir}/signals/{name}.csv` +- Updated example algo using a price forecast signal + +Does NOT include: built-in signal providers, YAML/JSON signal config, +`nexa validate`, `nexa compile`, `nexa report`, IDC, HTML reports. ### Stage 2: IDC Continuous + Windowed Replay - IDC continuous matching engine (price-time priority) - Windowed data loading (PyArrow row groups, SlidingWindow, manifest) - `@algo` decorator with async event stream -- Full signal system with publication_offset +- Built-in signal providers (DA price, wind, solar, load, etc.) - EPEX SPOT and EEX exchange adapters - HTML report generation +- Sharpe ratio, max drawdown, equity curve +- Portfolio-level NOP aggregation across products ### Stage 3: Intelligence + Quality @@ -259,7 +304,11 @@ product_id (str), price_eur_mwh (float64), volume_mw (float64), aggressor_side (str: buy/sell, optional - may not be available from all exchange data exports, degrade gracefully when missing) -**Signals** (loaded entirely unless >1 GB): +**Signals (CSV format for CsvSignalProvider)**: +Columns: timestamp (timezone-aware datetime), value (float). +Additional columns ignored. + +**Signals (Parquet format for built-in providers, stage 2+)**: Columns: published_at (datetime64[ns, UTC]), valid_from (datetime64[ns, UTC]), valid_to (datetime64[ns, UTC]), zone (str), value (float64), provider (str) @@ -269,12 +318,14 @@ for DA data. 
Row groups sized at ~64 MB after compression. ## Dependencies Core (required): + - pydantic >= 2.0 - pyarrow (Parquet I/O and windowed replay) - numpy -- click or typer (CLI) +- click (CLI) Optional extras: + - pandas (DataFrame output, installed via `pip install nexa-backtest[pandas]`) - onnxruntime (ML model inference, `pip install nexa-backtest[ml]`) - matplotlib or plotly (report charts, `pip install nexa-backtest[charts]`) @@ -285,6 +336,12 @@ Optional extras: - **Look-ahead bias is the #1 backtesting mistake.** Signals must respect publication_offset. At time T, the algo can only see data published before T. + See the signal system section above for the exact semantics. +- **publication_offset is a positive timedelta.** It means "published this + far ahead of delivery." A value for delivery period T was published at + T - offset. In code: `get_value(current_time)` returns the latest value + where `timestamp <= current_time + publication_offset`. Do not negate the + offset or use negative timedeltas. - **DA matching is price-taker only.** The algo's bid does not affect the clearing price. This is realistic for most participants but not for very large portfolios (market impact modelling is a v2 concern). @@ -293,8 +350,9 @@ Optional extras: - **Window transitions happen between MTU boundaries.** Never evict or load data mid-event-processing. - **aggressor_side may not be available.** Some exchange data exports (e.g., - EPEX) don't include it explicitly. The matching engine must degrade - gracefully when this field is missing. + EPEX SPOT) don't include it explicitly. The matching engine must degrade + gracefully when this field is missing. It may be possible to infer from + ActionCode/TransactionTime in some formats but this is fragile. - **product_id identifies the delivery period, not the trading session.** NO1-QH-0900 is the 09:00-09:15 delivery product. Orders for this product might be placed hours before delivery. 
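To make that naming concrete, a hypothetical parser for quarter-hour product IDs (the real parsing belongs to the exchange adapters, and other exchanges may use different formats):

```python
from datetime import time

def parse_qh_product(product_id: str) -> tuple[str, time, time]:
    """Parse e.g. 'NO1-QH-0900' into (zone, delivery_start, delivery_end)."""
    zone, resolution, hhmm = product_id.split("-")
    assert resolution == "QH"  # this sketch handles quarter-hour products only
    start = time(int(hhmm[:2]), int(hhmm[2:]))
    # A QH product delivers for 15 minutes; wrap past midnight.
    end_minutes = start.hour * 60 + start.minute + 15
    end = time(end_minutes // 60 % 24, end_minutes % 60)
    return zone, start, end

parse_qh_product("NO1-QH-0900")  # -> ("NO1", time(9, 0), time(9, 15))
```

Note the parsed times describe the delivery window only; they say nothing about when orders for the product may be placed.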
@@ -305,7 +363,7 @@ Optional extras: ## Makefile targets -``` +```bash make install # Install dev dependencies make test # Run pytest make lint # Run ruff check + ruff format --check @@ -324,8 +382,6 @@ A feature is complete when: 3. Tests cover the happy path and at least one error case 4. `make ci` passes 5. No regressions in existing tests -6. Test coverage should be >80% -7. If the feature adds a new algo hook or TradingContext method, the +6. If the feature adds a new algo hook or TradingContext method, the protocol in context.py is updated and all three engine implementations (backtest, paper, live) are updated or stubbed -8. Review the README file, check nothing major is missing, add additions if something is identified diff --git a/README.md b/README.md index a952645..c02d9c8 100644 --- a/README.md +++ b/README.md @@ -1,38 +1,27 @@ # nexa-backtest +**Energy market backtesting framework for European power trading.** + [![CI](https://github.com/phasenexa/nexa-backtest/actions/workflows/ci.yml/badge.svg)](https://github.com/phasenexa/nexa-backtest/actions/workflows/ci.yml) [![PyPI](https://img.shields.io/pypi/v/nexa-backtest)](https://pypi.org/project/nexa-backtest/) [![Python](https://img.shields.io/pypi/pyversions/nexa-backtest)](https://pypi.org/project/nexa-backtest/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) -A backtesting framework built for European power markets. Not another equities backtester -with energy bolted on. - -Handles day-ahead auctions, intraday auctions, intraday continuous trading, 15-minute MTUs, -block bids, gate closures, and exchange-specific matching rules. Runs your algo against -historical data and tells you: **did it make money? Did it beat VWAP?** - -## Why this exists +--- -Every backtesting framework out there assumes continuous order books, tick-by-tick data, -and price-time priority. Energy markets work differently. 
You have auctions with gate -closures, 96 quarter-hour products per day, block bids that span multiple hours, and -matching algorithms that clear everything at once. +Most backtesting frameworks are built for equities. They assume continuous order books, tick-by-tick data, and price-time priority. European power markets work differently: day-ahead auctions with gate closures, intraday continuous trading, block bids, linked orders, and 15-minute MTUs. -If you have tried backtesting an energy trading strategy with Zipline, Backtrader, or -VectorBT, you know the pain. `nexa-backtest` is the tool those frameworks should have been. +nexa-backtest is purpose-built for this. It replays historical market conditions, runs your trading algorithm against them, and answers two questions: **did it make money?** and **did it beat VWAP?** -## Features +## Key Features -- **Purpose-built for energy**: DA auctions, ID auctions, IDC continuous, 15-min MTUs -- **One interface, three engines**: same algo code for backtesting, paper trading, and live trading -- **Two API levels**: `SimpleAlgo` for quick experiments, `@algo` decorator for full control -- **Exchange adapters**: Nord Pool, EPEX SPOT, EEX with feature detection and validation -- **Signal system**: weather, price forecasts, load data, carbon prices, or anything custom -- **ML model support**: ONNX, scikit-learn, PyTorch models via `ctx.predict()` -- **Smart data loading**: DA data loaded entirely (tiny), IDC data windowed from Parquet (scalable) -- **Validation pipeline**: ruff + mypy + exchange feature checks + look-ahead bias detection -- **Code protection**: Cython/Nuitka compilation for IP-sensitive hosted environments +- **One interface, three modes.** Your algo runs identically in backtest, paper trading, and live trading. Same code, different engine underneath. Zero changes to go from replay to production. +- **Two API levels.** `SimpleAlgo` with hooks for quick experiments. 
`@algo` with async event streams for full control. +- **DA + IDC support.** Day-ahead auction matching (price-taker against historical clearing prices) and intraday continuous matching (price-time priority against historical order book). +- **Signal system.** Plug in weather forecasts, DA prices, load forecasts, gas prices, or any time-series data. Built-in look-ahead bias prevention via `publication_offset`. +- **ML model integration.** Register ONNX or scikit-learn models and call `ctx.predict()` from your algo. +- **Exchange adapters.** Nord Pool, EPEX SPOT, EEX. Each adapter declares its capabilities. Use block bids on an exchange that doesn't support them? The validator catches it before you run. +- **Efficient replay.** DA data loads entirely (it's tiny). IDC data uses windowed replay via PyArrow row groups, keeping peak memory at 200-500 MB regardless of replay period. ## Installation @@ -43,67 +32,117 @@ pip install nexa-backtest With optional extras: ```bash -pip install nexa-backtest[pandas] # DataFrame output -pip install nexa-backtest[plot] # matplotlib/plotly charts -pip install nexa-backtest[ml] # ONNX + scikit-learn model support -pip install nexa-backtest[data] # nexa-marketdata integration -pip install nexa-backtest[live] # nexa-connect for live trading -pip install nexa-backtest[all] # everything +pip install nexa-backtest[pandas] # DataFrame output +pip install nexa-backtest[ml] # ONNX model inference +pip install nexa-backtest[charts] # Report charts (matplotlib/plotly) +pip install nexa-backtest[marketdata] # Data fetching via nexa-marketdata +pip install nexa-backtest[live] # Live trading via nexa-connect +pip install nexa-backtest[all] # Everything ``` -## Quick start +## Quick Start -### Your first backtest (20 lines) +### Write an algo ```python -from datetime import date -from nexa_backtest import SimpleAlgo, TradingContext, Order, BacktestEngine +from nexa_backtest import SimpleAlgo, TradingContext, Order class 
BuyBelowForecast(SimpleAlgo): - """Buy when DA clearing price is below the wind forecast signal.""" + """Buy when DA clearing price is below our forecast.""" def on_setup(self, ctx: TradingContext) -> None: - self.subscribe_signal("da_price_forecast") - self.subscribe_signal("wind_generation_forecast") - - def on_auction_open(self, ctx: TradingContext, auction) -> None: - forecast = ctx.get_signal("da_price_forecast").value - wind = ctx.get_signal("wind_generation_forecast").value + self.subscribe_signal("price_forecast") + self.threshold = 5.0 # EUR/MWh + + def on_auction_open(self, ctx: TradingContext, auction: AuctionInfo) -> None: + forecast = ctx.get_signal("price_forecast").value + ctx.place_order(Order.buy( + product=auction.product_id, + volume_mw=10, + price_eur=forecast - self.threshold, + )) + + def on_fill(self, ctx: TradingContext, fill: Fill) -> None: + ctx.log(f"Filled {fill.volume_mw} MW @ {fill.price_eur}") +``` - if wind > 15_000: # High wind expected, prices likely low - ctx.place_order(Order.buy( - product=auction.product_id, - volume_mw=10, - price_eur=forecast - 5.0, - )) +### Backtest it - def on_fill(self, ctx: TradingContext, fill) -> None: - ctx.log(f"Filled {fill.volume_mw} MW @ {fill.price_eur}") +```python +from datetime import date +from nexa_backtest import BacktestEngine +from nexa_backtest.signals import CsvSignalProvider -# Run it result = BacktestEngine( algo=BuyBelowForecast(), exchange="nordpool", start=date(2026, 3, 1), end=date(2026, 3, 31), products=["NO1_DA"], + signals=[ + CsvSignalProvider( + name="price_forecast", + path="data/signals/price_forecast.csv", + unit="EUR/MWh", + publication_offset=timedelta(hours=12), + ), + ], initial_capital=100_000, ).run() print(result.summary()) # Total PnL: +12,340.50 EUR -# vs VWAP: +3.2% -# Sharpe: 1.4 -# Win rate: 62% -# Max DD: -4,200.00 EUR +# vs VWAP: +3.2% +# Win rate: 62% +# Trades: 186 +``` + +Or run from the CLI: + +```bash +# Signal CSVs are discovered by convention in 
{data_dir}/signals/ +nexa run my_algo.py \ + --exchange nordpool \ + --start 2026-03-01 \ + --end 2026-03-31 \ + --products NO1_DA \ + --data-dir ./data \ + --capital 100000 +``` + +### Paper trade it (same algo, zero changes) + +```python +from nexa_backtest import PaperEngine + +paper = PaperEngine( + algo=BuyBelowForecast(), + exchange="nordpool", + products=["NO1_DA"], + signals=[...], +).start() +``` + +### Go live (same algo, zero changes) + +```python +from nexa_backtest import LiveEngine + +live = LiveEngine( + algo=BuyBelowForecast(), + exchange="nordpool", + credentials=NordPoolCredentials.from_env(), + products=["NO1_DA"], + signals=[...], +).start() ``` -### Full control with @algo +## The Low-Level API -For quants who want to manage their own event loop: +For quants who want full control over the event loop: ```python -from nexa_backtest import TradingContext, Order, algo +from nexa_backtest import TradingContext, algo @algo(name="spread_scalper", version="1.0.0") async def run(ctx: TradingContext) -> None: @@ -120,7 +159,7 @@ async def run(ctx: TradingContext) -> None: )) case GateClosureWarning(product_id=pid, remaining=remaining): - if remaining.total_seconds() < 300: + if remaining < timedelta(minutes=5): pos = ctx.get_position(pid) if pos.net_mw != 0: ctx.place_order(Order.market( @@ -129,60 +168,49 @@ async def run(ctx: TradingContext) -> None: )) ``` -### Same algo, three modes +## Signals -```python -from nexa_backtest import BacktestEngine, PaperEngine, LiveEngine - -algo = BuyBelowForecast() - -# Backtest: historical replay, simulated matching -result = BacktestEngine(algo=algo, exchange="nordpool", ...).run() - -# Paper: live data, simulated matching, no real money -paper = PaperEngine(algo=algo, exchange="nordpool", ...).start() - -# Live: real data, real exchange, real money -live = LiveEngine(algo=algo, exchange="nordpool", credentials=...).start() -``` - -The algo code is identical in all three cases. 
The only thing that changes is -which engine you pass it to. - -### Using signals +Any time-series data your algo needs. Load a CSV or implement the `SignalProvider` protocol: ```python -from nexa_backtest.signals import ( - DayAheadPriceSignal, - WindForecastSignal, - LoadForecastSignal, +from nexa_backtest.signals import CsvSignalProvider + +# Load your own forecast data from CSV +forecast = CsvSignalProvider( + name="my_forecast", + path="data/signals/my_forecast.csv", + unit="EUR/MWh", + description="Our internal price forecast", + publication_offset=timedelta(hours=12), # Published 12h before delivery ) -result = BacktestEngine( +engine = BacktestEngine( algo=algo, - exchange="nordpool", - start=date(2026, 3, 1), - end=date(2026, 3, 31), - signals=[ - DayAheadPriceSignal(zone="NO1"), - WindForecastSignal(zone="NO1", provider="meteomatics"), - LoadForecastSignal(zone="NO1"), - ], -).run() + signals=[forecast], + # ... +) ``` -Custom signals implement `SignalProvider`: +CSV format (simple two-column minimum): + +```csv +timestamp,value +2026-03-15T00:00:00+01:00,42.31 +2026-03-15T00:15:00+01:00,41.87 +2026-03-15T00:30:00+01:00,43.05 +``` + +For full control, implement `SignalProvider` directly: ```python from nexa_backtest.signals import SignalProvider, SignalSchema, SignalValue -class MyForecast(SignalProvider): - name = "my_forecast" +class MyModelForecast(SignalProvider): + name = "model_forecast" schema = SignalSchema( - name="my_forecast", + name="model_forecast", dtype=float, frequency=timedelta(minutes=15), - description="Internal price forecast", unit="EUR/MWh", ) @@ -196,7 +224,11 @@ class MyForecast(SignalProvider): ) ``` -### Using ML models +Look-ahead bias is prevented automatically. The `publication_offset` controls when forecast values become visible to your algo. 
A value for delivery period T with `publication_offset=timedelta(hours=6)` was published at T - 6 hours, so it only becomes visible when the simulated clock reaches that publication time.

## ML Models

Register models and call `ctx.predict()` from your algo:

```python
from nexa_backtest.models import ModelRegistry, ONNXModel

models = ModelRegistry()
models.register(ONNXModel(
    name="price_predictor",
    path="models/xgboost_prices.onnx",
    input_schema={"wind": float, "load": float, "hour": int},
    output_schema={"price_forecast": float},
))

engine = BacktestEngine(algo=algo, models=models, ...)

# In your algo:
prediction = ctx.predict("price_predictor", {
    "wind": ...,  # inputs matching input_schema
    "load": ...,
    "hour": ...,
})
```

ONNX is recommended (portable, fast, no arbitrary code execution). Scikit-learn pickles are supported but flagged as a security risk in hosted environments.

## Validation

Catch bugs before they cost you a 10-minute backtest run:

```bash
$ nexa validate my_algo.py --exchange nordpool

Step 1/6: Syntax Check (ruff)            [PASS]
Step 2/6: Type Check (mypy --strict)     [PASS]
Step 3/6: Interface Compliance           [PASS]
Step 4/6: Exchange Feature Compat        [FAIL]
  - Line 42: Order.block_bid() used, but Nord Pool IDC
    does not support block bids.
Step 5/6: Look-ahead Bias Detection      [PASS]
Step 6/6: Resource Safety                [PASS]

1 error. Fix before running.
```

## Historical Data

Backtest data is stored as Parquet files.
Use `nexa-marketdata` to fetch and cache data, or bring your own:

```python
from nexa_backtest.data import ParquetLoader, NexaMarketdataLoader

# From local files
loader = ParquetLoader(data_dir="./data/nordpool")

# Or fetch via nexa-marketdata
loader = NexaMarketdataLoader(
    source="nordpool",
    zones=["NO1", "NO2"],
    start=date(2025, 10, 1),
    end=date(2026, 3, 31),
)
```

**Data format examples (shown as CSV for clarity, stored as Parquet):**

DA clearing prices:

```csv
timestamp,zone,price_eur_mwh,volume_mwh
2026-03-15T00:00:00+01:00,NO1,42.31,1250.5
2026-03-15T00:15:00+01:00,NO1,41.87,1180.2
2026-03-15T00:30:00+01:00,NO1,43.05,1310.8
```
IDC events:

```csv
timestamp,event_type,order_id,zone,product_id,side,price_eur_mwh,volume_mw,remaining_mw
2026-03-15T08:12:03.412+01:00,new,ord-88291,NO1,NO1-QH-0900,buy,52.40,5.0,5.0
2026-03-15T08:12:03.987+01:00,new,ord-88292,NO1,NO1-QH-0900,sell,53.10,3.0,3.0
2026-03-15T08:12:04.201+01:00,trade,ord-88293,NO1,NO1-QH-0900,buy,53.10,2.0,0.0
```

## Code Protection

If you don't want to share your algo's source code with a hosted platform:

```bash
# Compile to a native shared library (Cython)
$ nexa compile my_algo.py --output my_algo.so

# Or maximum protection (Nuitka)
$ nexa compile my_algo.py --compiler nuitka --output my_algo_binary
```

Upload the compiled binary. We run it but can't read it.

## CLI Reference

```bash
nexa run my_algo.py --exchange nordpool --start 2026-03-01 --end 2026-03-31
nexa validate my_algo.py --exchange nordpool
nexa compile my_algo.py --output my_algo.so
nexa report results.parquet --format html --output report.html
```
## Phase Nexa Ecosystem

`nexa-backtest` integrates with the wider Phase Nexa toolkit:

| Package | Role | Integration |
|---------|------|-------------|
| [nexa-marketdata](https://github.com/phasenexa/nexa-marketdata) | Market data fetching | Historical data source for backtests |
| [nexa-bidkit](https://github.com/phasenexa/nexa-bidkit) | Bid generation | Order type definitions and validation |
| [nexa-connect](https://github.com/phasenexa/nexa-connect) | Exchange connectivity | Powers the live trading engine |
| [nexa-forecast](https://github.com/phasenexa/nexa-forecast) | Price forecasting | ML models and signal providers |
| [nexa-mcp](https://github.com/phasenexa/nexa-mcp) | LLM interface | Run backtests from chat |

## Implementation Status

| Feature | Status |
|---------|--------|
| Core types and protocols | Task 01 |
| SimpleAlgo with DA hooks | Task 01 |
| BacktestEngine (DA only) | Task 01 |
| Nord Pool DA adapter | Task 01 |
| Parquet data loader (DA) | Task 01 |
| PnL + VWAP analysis | Task 01 |
| Signal system + CSV loader | Task 02 |
| CLI (`nexa run`) | Task 02 |
| IDC continuous matching | Planned (Stage 2) |
| Windowed replay | Planned (Stage 2) |
| `@algo` low-level API | Planned (Stage 2) |
| Built-in signal providers | Planned (Stage 2) |
| EPEX SPOT / EEX adapters | Planned (Stage 2) |
| HTML reports | Planned (Stage 2) |
| Sharpe / drawdown / equity curve | Planned (Stage 2) |
| Validation pipeline | Planned (Stage 3) |
| ML model registry | Planned (Stage 3) |
| Multi-algo replay | Planned (Stage 3) |
| Paper trading engine | Planned (Stage 4) |
| Live trading engine | Planned (Stage 4) |
| Code compilation | Planned (Stage 4) |

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, coding standards, and PR workflow.

## License

[MIT](./LICENSE)