| 1 | +# Task 02: Signals, CSV Signal Loader, and CLI |
| 2 | + |
| 3 | +## Goal |
| 4 | + |
| 5 | +Add the signal system and a basic CLI so a customer can bring their own |
| 6 | +forecast data (as CSV), reference it from their algo, and run a backtest |
| 7 | +from the command line rather than writing a Python script. |
| 8 | + |
| 9 | +This builds on task-01. All existing types, the DA matching engine, and |
| 10 | +the PnL/VWAP analysis remain unchanged. |
| 11 | + |
| 12 | +--- |
| 13 | + |
| 14 | +## What to build |
| 15 | + |
| 16 | +### 1. `signals/base.py` |
| 17 | + |
| 18 | +The `SignalProvider` protocol and supporting types: |
| 19 | + |
| 20 | +```python |
| 21 | +class SignalSchema: |
| 22 | +    """Describes what a signal provides.""" |
| 23 | +    name: str |
| 24 | +    dtype: type  # float, int, str |
| 25 | +    frequency: timedelta  # How often it updates |
| 26 | +    description: str |
| 27 | +    unit: str  # "EUR/MWh", "MW", "m/s", "celsius", etc. |
| 28 | + |
| 29 | +class SignalValue: |
| 30 | +    """A single signal observation.""" |
| 31 | +    timestamp: datetime  # When this value is valid for |
| 32 | +    value: Any  # The actual value |
| 33 | + |
| 34 | +class SignalProvider(Protocol): |
| 35 | +    @property |
| 36 | +    def name(self) -> str: ... |
| 37 | + |
| 38 | +    @property |
| 39 | +    def schema(self) -> SignalSchema: ... |
| 40 | + |
| 41 | +    def get_value(self, timestamp: datetime) -> SignalValue: ... |
| 42 | +    def get_range(self, start: datetime, end: datetime) -> pd.Series: ... |
| 43 | +``` |
| 44 | + |
| 45 | +Use Pydantic v2 for `SignalSchema` and `SignalValue`. |
| 46 | + |
| 47 | +### 2. `signals/registry.py` |
| 48 | + |
| 49 | +A `SignalRegistry` that holds registered signal providers and provides |
| 50 | +lookup by name. Nothing fancy: a dict plus validation that raises a clear |
| 51 | +error when the algo asks for a signal that was never registered. |
| 52 | + |
| 53 | +```python |
| 54 | +class SignalRegistry: |
| 55 | +    def register(self, provider: SignalProvider) -> None: ... |
| 56 | +    def get(self, name: str) -> SignalProvider: ... |
| 57 | +    def has(self, name: str) -> bool: ... |
| 58 | +    def list_signals(self) -> list[str]: ... |
| 59 | +``` |
| 60 | + |
| 61 | +### 3. `signals/csv_loader.py` |
| 62 | + |
| 63 | +A `CsvSignalProvider` that loads a CSV file and serves it as a signal. |
| 64 | +This is the simplest way for a customer to bring their own data without |
| 65 | +implementing `SignalProvider` from scratch. |
| 66 | + |
| 67 | +Expected CSV format: |
| 68 | +```csv |
| 69 | +timestamp,value |
| 70 | +2026-03-15T00:00:00+01:00,42.31 |
| 71 | +2026-03-15T00:15:00+01:00,41.87 |
| 72 | +2026-03-15T00:30:00+01:00,43.05 |
| 73 | +``` |
| 74 | + |
| 75 | +Columns: `timestamp` (required, parsed as timezone-aware datetime) and |
| 76 | +`value` (required, parsed as float). Additional columns are ignored. |
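One way to parse and validate this format using only the standard library is sketched below. The function name `load_signal_rows` is hypothetical, and `DataError` here is a local stand-in for the project's own exception, whose definition this spec does not show. The real provider may well use pandas instead.

```python
import csv
from datetime import datetime
from pathlib import Path


class DataError(Exception):
    """Stand-in for the project's DataError (real definition not shown in this spec)."""


def load_signal_rows(path: Path) -> list[tuple[datetime, float]]:
    """Read (timestamp, value) rows from a signal CSV, sorted by timestamp."""
    if not path.is_file():
        raise DataError(f"Signal file not found: {path}")

    rows: list[tuple[datetime, float]] = []
    with path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        missing = {"timestamp", "value"} - set(reader.fieldnames or [])
        if missing:
            raise DataError(f"{path}: missing required column(s): {sorted(missing)}")

        # Data rows start on line 2, after the header. Extra columns are ignored.
        for line_no, row in enumerate(reader, start=2):
            try:
                timestamp = datetime.fromisoformat(row["timestamp"])
                value = float(row["value"])
            except (TypeError, ValueError) as exc:
                raise DataError(f"{path}:{line_no}: cannot parse row: {exc}") from exc
            if timestamp.tzinfo is None:
                raise DataError(f"{path}:{line_no}: timestamp has no UTC offset")
            rows.append((timestamp, value))

    # Sort so later lookups can binary-search by timestamp.
    rows.sort(key=lambda item: item[0])
    return rows
```

Rejecting naive timestamps up front avoids comparing them against the timezone-aware simulated clock later, which would raise a `TypeError` deep inside the backtest.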
| 77 | + |
| 78 | +The provider must support an optional `publication_offset` parameter to |
| 79 | +prevent look-ahead bias. The offset represents how far ahead of the |
| 80 | +delivery period the forecast is published. A value with timestamp T |
| 81 | +(the period it describes) is visible to the algo at time |
| 82 | +T - publication_offset (when it was published). |
| 83 | + |
| 84 | +For example, a wind forecast published 6 hours ahead would use |
| 85 | +`publication_offset=timedelta(hours=6)`. The forecast for the 06:00 |
| 86 | +delivery period was published at 00:00, so it becomes visible when the |
| 87 | +simulated clock reaches 00:00. In code: `get_value(current_time)` returns |
| 88 | +the latest value where `timestamp <= current_time + publication_offset`. |
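The visibility rule can be sketched with a binary search over sorted timestamps. This is an illustration under stated assumptions: `latest_visible` is a hypothetical helper name, and returning `None` when nothing has been published yet is an assumption, since the spec does not say what `get_value` does in that case.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone
from typing import Optional


def latest_visible(
    timestamps: list[datetime],
    values: list[float],
    current_time: datetime,
    publication_offset: timedelta = timedelta(0),
) -> Optional[tuple[datetime, float]]:
    """Return the newest (timestamp, value) already published at current_time.

    `timestamps` must be sorted ascending and aligned with `values`.
    """
    # A value for period T is published at T - offset, so it is visible
    # once T <= current_time + offset.
    cutoff = current_time + publication_offset
    index = bisect_right(timestamps, cutoff)
    if index == 0:
        return None  # Assumption: nothing published yet yields None.
    return timestamps[index - 1], values[index - 1]


# Worked example from the text: a forecast published 6 hours ahead.
tz = timezone(timedelta(hours=1))
stamps = [datetime(2026, 3, 15, hour, tzinfo=tz) for hour in (0, 6, 12)]
vals = [100.0, 120.0, 90.0]
now = datetime(2026, 3, 15, 0, 0, tzinfo=tz)

with_offset = latest_visible(stamps, vals, now, timedelta(hours=6))  # 06:00 value
without_offset = latest_visible(stamps, vals, now)  # 00:00 value
```

At a simulated time of 00:00, the 6-hour offset makes the 06:00 forecast visible, while the 12:00 forecast stays hidden until the clock reaches 06:00.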
| 89 | + |
| 90 | +If `publication_offset` is not set, values are available at their timestamp |
| 91 | +(i.e., the value for 06:00 is available at 06:00). This is correct for |
| 92 | +actuals/historical data but introduces look-ahead bias for forecasts. Log |
| 93 | +a warning when no offset is set, prompting the user to check whether |
| 94 | +their data represents forecasts or actuals. |
| 95 | + |
| 96 | +```python |
| 97 | +signal = CsvSignalProvider( |
| 98 | +    name="my_wind_forecast", |
| 99 | +    path="data/wind_forecast_NO1.csv", |
| 100 | +    unit="MW", |
| 101 | +    description="Wind generation forecast for NO1", |
| 102 | +    publication_offset=timedelta(hours=6), |
| 103 | +) |
| 104 | +``` |
| 105 | + |
| 106 | +### 4. Update `context.py` |
| 107 | + |
| 108 | +Add the signal methods to `TradingContext` (they exist as stubs from |
| 109 | +task-01; now they need real signatures): |
| 110 | + |
| 111 | +```python |
| 112 | +def get_signal(self, name: str) -> SignalValue: ... |
| 113 | +def get_signal_history(self, name: str, lookback: int) -> list[SignalValue]: ... |
| 114 | +``` |
| 115 | + |
| 116 | +### 5. Update `engines/backtest.py` |
| 117 | + |
| 118 | +Wire signals into the `BacktestEngine`: |
| 119 | + |
| 120 | +- Accept a `signals` parameter (list of `SignalProvider` instances) |
| 121 | +- Register them in a `SignalRegistry` |
| 122 | +- When the algo calls `ctx.get_signal(name)`, look up the provider and |
| 123 | +  return the value for the current simulated time, respecting |
| 124 | +  `publication_offset` |
| 125 | +- When the algo calls `ctx.get_signal_history(name, lookback=N)`, return |
| 126 | +  the last N values up to and including the current time |
| 127 | + |
| 128 | +The backtest context implementation needs to enforce the look-ahead bias |
| 129 | +rule: at simulated time T, `get_signal` must not return any value whose |
| 130 | +publication time is after T. |
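A possible shape for the history lookup is sketched below. It assumes that "up to and including the current time" is judged by publication time, the same cutoff `get_signal` uses. The spec does not state this explicitly, so treat it as one interpretation. `visible_history` is a hypothetical helper name.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone


def visible_history(
    timestamps: list[datetime],
    values: list[float],
    current_time: datetime,
    lookback: int,
    publication_offset: timedelta = timedelta(0),
) -> list[tuple[datetime, float]]:
    """Return up to `lookback` of the newest values published by current_time."""
    if lookback < 0:
        raise ValueError("lookback must be non-negative")
    # Same cutoff as get_value: period T is visible once T - offset <= current_time.
    end = bisect_right(timestamps, current_time + publication_offset)
    start = max(0, end - lookback)
    return list(zip(timestamps[start:end], values[start:end]))


# Hourly values with a 2-hour publication offset.
utc = timezone.utc
stamps = [datetime(2026, 3, 15, hour, tzinfo=utc) for hour in range(6)]
vals = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
offset = timedelta(hours=2)
now = datetime(2026, 3, 15, 1, tzinfo=utc)

history = visible_history(stamps, vals, now, lookback=3, publication_offset=offset)

# The guard from the rule above: nothing returned was published after `now`.
assert all(ts - offset <= now for ts, _ in history)
```

The final assertion is the kind of check the look-ahead bias test could make: subtract the offset from each returned timestamp and confirm the result never exceeds the simulated clock.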
| 131 | + |
| 132 | +### 6. Update `algo.py` |
| 133 | + |
| 134 | +Add `subscribe_signal(name)` to `SimpleAlgo` and the `on_signal` hook: |
| 135 | + |
| 136 | +```python |
| 137 | +class SimpleAlgo: |
| 138 | +    def subscribe_signal(self, name: str) -> None: |
| 139 | +        """Register interest in a signal. Called during on_setup.""" |
| 140 | + |
| 141 | +    def on_signal(self, ctx: TradingContext, name: str, value: SignalValue) -> None: |
| 142 | +        """Called when a subscribed signal updates. Override to react.""" |
| 143 | +``` |
| 144 | + |
| 145 | +For DA backtesting, `on_signal` is called at the start of each auction |
| 146 | +period with the latest signal values. The algo can also pull signals |
| 147 | +directly via `ctx.get_signal()` from any hook. |
| 148 | + |
| 149 | +### 7. `cli/main.py` |
| 150 | + |
| 151 | +A minimal CLI using `click`: |
| 152 | + |
| 153 | +```bash |
| 154 | +# Run a backtest |
| 155 | +nexa run examples/simple_da_algo.py \ |
| 156 | +    --exchange nordpool \ |
| 157 | +    --start 2026-03-01 \ |
| 158 | +    --end 2026-03-31 \ |
| 159 | +    --products NO1_DA \ |
| 160 | +    --data-dir ./data \ |
| 161 | +    --capital 100000 |
| 162 | + |
| 163 | +# Output: the BacktestResult.summary() text |
| 164 | +``` |
| 165 | + |
| 166 | +The CLI needs to: |
| 167 | +- Accept a path to a Python file containing an algo |
| 168 | +- Import the file and find the algo (look for a subclass of `SimpleAlgo` |
| 169 | +  or a function decorated with `@algo`) |
| 170 | +- Instantiate the `BacktestEngine` with the provided arguments |
| 171 | +- Call `.run()` and print `result.summary()` |
| 172 | + |
| 173 | +Use `importlib` to dynamically load the algo module. If the module contains |
| 174 | +exactly one `SimpleAlgo` subclass, use it. If it contains multiple, raise |
| 175 | +an error asking the user to specify which one. |
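The loading step might look roughly like this. It is a sketch only: `AlgoLoadError` is a hypothetical error type, `SimpleAlgo` is a local stand-in for the real base class in `algo.py`, raising when no subclass is found is an assumption, and `@algo`-decorated functions are not handled here.

```python
import importlib.util
import inspect
from pathlib import Path
from types import ModuleType


class SimpleAlgo:
    """Stand-in for the real base class in algo.py."""


class AlgoLoadError(Exception):
    """Hypothetical error type; the spec does not name one."""


def load_module(path: Path) -> ModuleType:
    """Import a Python file by path without requiring it to be on sys.path."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    if spec is None or spec.loader is None:
        raise AlgoLoadError(f"Cannot import {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def find_algo_class(module: ModuleType, base: type = SimpleAlgo) -> type:
    """Return the single subclass of `base` defined in `module`."""
    candidates = [
        obj
        for _, obj in inspect.getmembers(module, inspect.isclass)
        if issubclass(obj, base)
        and obj is not base
        and obj.__module__ == module.__name__  # skip classes merely imported
    ]
    if len(candidates) > 1:
        names = ", ".join(sorted(cls.__name__ for cls in candidates))
        raise AlgoLoadError(f"Multiple algos found ({names}); specify which one to run")
    if not candidates:
        # Assumption: the spec does not say what happens when none is found.
        raise AlgoLoadError(f"No {base.__name__} subclass found in {module.__name__}")
    return candidates[0]
```

Filtering on `__module__` keeps helper base classes that the user imports into their algo file from being counted as candidates.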
| 176 | + |
| 177 | +Register the CLI entry point in `pyproject.toml`: |
| 178 | + |
| 179 | +```toml |
| 180 | +[project.scripts] |
| 181 | +nexa = "nexa_backtest.cli.main:cli" |
| 182 | +``` |
| 183 | + |
| 184 | +### 8. Update the example algo |
| 185 | + |
| 186 | +Update `examples/simple_da_algo.py` to use a signal: |
| 187 | + |
| 188 | +```python |
| 189 | +class ForecastAlgo(SimpleAlgo): |
| 190 | +    """Buy when DA price is below a provided price forecast.""" |
| 191 | + |
| 192 | +    def on_setup(self, ctx: TradingContext) -> None: |
| 193 | +        self.subscribe_signal("price_forecast") |
| 194 | +        self.threshold = 5.0 |
| 195 | + |
| 196 | +    def on_auction_open(self, ctx: TradingContext, auction: AuctionInfo) -> None: |
| 197 | +        forecast = ctx.get_signal("price_forecast").value |
| 198 | +        if forecast is not None: |
| 199 | +            ctx.place_order(Order.buy( |
| 200 | +                product=auction.product_id, |
| 201 | +                volume_mw=10, |
| 202 | +                price_eur=Decimal(str(forecast)) - Decimal(str(self.threshold)), |
| 203 | +            )) |
| 204 | +``` |
| 205 | + |
| 206 | +Create a corresponding `examples/data/price_forecast_NO1.csv` with |
| 207 | +synthetic forecast values (slightly noisy version of the actual clearing |
| 208 | +prices from the test fixture, offset by the publication delay). |
| 209 | + |
| 210 | +The example should be runnable via: |
| 211 | +```bash |
| 212 | +nexa run examples/simple_da_algo.py \ |
| 213 | +    --exchange nordpool \ |
| 214 | +    --start 2026-03-01 \ |
| 215 | +    --end 2026-03-31 \ |
| 216 | +    --products NO1_DA \ |
| 217 | +    --data-dir tests/fixtures \ |
| 218 | +    --capital 100000 |
| 219 | +``` |
| 220 | + |
| 221 | +--- |
| 222 | + |
| 223 | +## How signals are passed to the CLI |
| 224 | + |
| 225 | +For this task, signal CSV files are discovered by convention. The engine |
| 226 | +looks in `{data_dir}/signals/` for CSV files matching the signal name: |
| 227 | + |
| 228 | +``` |
| 229 | +data_dir/ |
| 230 | +  signals/ |
| 231 | +    price_forecast.csv |
| 232 | +    wind_forecast.csv |
| 233 | +``` |
| 234 | + |
| 235 | +If the algo subscribes to a signal called "price_forecast", the engine |
| 236 | +looks for `{data_dir}/signals/price_forecast.csv`. If the file doesn't |
| 237 | +exist, raise a `DataError` with a clear message. |
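The lookup convention is small enough to sketch in a few lines. `resolve_signal_path` is a hypothetical helper name, and `DataError` is again a local stand-in for the project's exception.

```python
from pathlib import Path


class DataError(Exception):
    """Stand-in for the project's DataError (real definition not shown in this spec)."""


def resolve_signal_path(data_dir: Path, signal_name: str) -> Path:
    """Map a subscribed signal name to {data_dir}/signals/{signal_name}.csv."""
    path = data_dir / "signals" / f"{signal_name}.csv"
    if not path.is_file():
        raise DataError(
            f"Signal {signal_name!r} was requested by the algo, but {path} "
            "does not exist. Add the CSV file or remove the subscription."
        )
    return path
```

Including the full expected path in the message tells the user exactly where to place the file.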
| 238 | + |
| 239 | +This is deliberately simple. A more flexible signal configuration (YAML |
| 240 | +config file, CLI flags per signal, explicit paths) is a later concern. |
| 241 | + |
| 242 | +--- |
| 243 | + |
| 244 | +## Tests |
| 245 | + |
| 246 | +1. **signals/base.py**: SignalSchema and SignalValue construction |
| 247 | +2. **signals/csv_loader.py**: |
| 248 | +   - Load a valid CSV, retrieve values at known timestamps |
| 249 | +   - `publication_offset` prevents future values from being visible |
| 250 | +   - Missing file raises `DataError` |
| 251 | +   - Malformed CSV (missing columns, bad timestamps) raises `DataError` |
| 252 | +3. **signals/registry.py**: register, get, has; get on a missing name raises an error |
| 253 | +4. **backtest.py integration**: algo that uses a signal to make trading |
| 254 | +   decisions. Verify that the signal value influences the fills (e.g., the |
| 255 | +   algo buys only when the forecast is below the threshold and does not |
| 256 | +   buy when the forecast is above it) |
| 257 | +5. **cli/main.py**: test that the CLI loads an algo module, finds the |
| 258 | +   `SimpleAlgo` subclass, and runs without error (use click's `CliRunner`) |
| 259 | +6. **look-ahead bias**: test that a signal with `publication_offset` does |
| 260 | +   NOT return a value before its publication time |
| 261 | + |
| 262 | +--- |
| 263 | + |
| 264 | +## What NOT to build |
| 265 | + |
| 266 | +- Built-in signal providers (DayAheadPriceSignal, WindForecastSignal, etc.) |
| 267 | +- Signal providers that fetch from APIs |
| 268 | +- YAML/JSON signal configuration |
| 269 | +- `nexa validate` CLI command |
| 270 | +- `nexa compile` CLI command |
| 271 | +- `nexa report` CLI command |
| 272 | +- Any IDC or windowed replay changes |
| 273 | +- HTML report generation |
| 274 | + |
| 275 | +--- |
| 276 | + |
| 277 | +## Acceptance criteria |
| 278 | + |
| 279 | +1. `make ci` passes |
| 280 | +2. A customer can load a CSV file as a signal and use it in their algo |
| 281 | +3. publication_offset correctly prevents look-ahead bias |
| 282 | +4. `nexa run` CLI works end-to-end with the example algo |
| 283 | +5. The example algo uses a signal to make trading decisions and produces |
| 284 | +   a PnL summary |
| 285 | +6. All new types have type hints and use frozen Pydantic models where appropriate |
| 286 | +7. All new public API has Google-style docstrings |