Status: In Progress (2/10 Complete)
Goal: Fix critical bugs, remove technical debt, improve testability
Duration: 10 PRs, estimated 2-3 days of focused work
Last Updated: 2026-01-11
Completed: PR #1 ✅, PR #2 ✅
Each PR is self-contained and can be executed by a fresh agent with full context.
PRs are ordered by dependency and risk (critical bugs first).
Agent Context Required for Each PR:
- Full project structure (from AI_CONTEXT.md)
- Specific files to edit (provided in each PR)
- Acceptance criteria (must pass before PR is complete)
## PR #1: Fix Plotting Crash ✅

Completed: 2026-01-11
Status: ✅ All tests passing (46/46)
Testing Guide: See TESTING_PR1.md

Problem: demo_latest.py passes a pd.Series to plot_equity_curve(), which expects a CSV path string.
Result: Every demo crashes at the plotting step.
Option A — Updated plot_equity_curve() to accept both Series and str:

```python
def plot_equity_curve(
    equity: pd.Series | str,
    output_file: str,
    title: str,
) -> None:
    """Plot equity curve from Series or CSV file."""
    if isinstance(equity, str):
        equity_df = pd.read_csv(equity, index_col=0, parse_dates=True)
        equity_series = equity_df["equity"]
    else:
        equity_series = equity
    # ... plotting code
```

Files changed:
- `src/plotting.py` — Added dual input support
- `tests/test_plotting.py` — Created 7 comprehensive tests
- `scripts/test_plotting.py` — Enhanced verification script
- `scripts/pr1_synthetic_demo.py` — New end-to-end demo
Acceptance criteria:

```bash
# All 46 tests pass
pytest tests/ -v

# Quick verification (3 commands)
pytest tests/test_plotting.py -v
python scripts/test_plotting.py
python scripts/pr1_synthetic_demo.py
```

Risk: LOW — Isolated change, comprehensive tests, backward compatible
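The str branch assumes an equity CSV with a datetime index column and an `equity` column. A quick round-trip sketch of that assumed format (in-memory via `io.StringIO`, so it runs without the project's result files):

```python
import io

import pandas as pd

dates = pd.date_range("2024-01-01", periods=3, freq="D")
equity = pd.Series([0.0, 0.01, 0.02], index=dates, name="equity")

# Write the way the demo scripts are assumed to write results,
# then read back exactly the way plot_equity_curve() does for str input
buf = io.StringIO()
equity.to_frame().to_csv(buf)
buf.seek(0)
equity_df = pd.read_csv(buf, index_col=0, parse_dates=True)
equity_series = equity_df["equity"]
```

If the round trip preserves index and values, the two call styles stay interchangeable.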
## PR #2: Add Yahoo Integration Test ✅

Completed: 2026-01-11
Status: ✅ 3 tests added, all passing (49 total)
Testing Guide: See TESTING_PR2.md

Problem: No test verifies that fetch_yahoo_data() works end-to-end with a Yahoo API response.
Tests only cover helper functions, not the integrated flow.
Added TestYahooIntegration class with 3 comprehensive tests:

1. test_fetch_yahoo_data_full_flow
   - Mocks crumb request and chart data response
   - Verifies full pipeline: fetch → parse → cache → return
   - Validates cache file creation
   - Confirms data format and values
2. test_fetch_yahoo_data_incremental_update
   - Pre-populates cache with old data
   - Fetches overlapping date range
   - Verifies old + new data merge correctly
   - Tests incremental cache strategy
3. test_fetch_yahoo_data_with_cache_disabled
   - Pre-populates cache with dummy data
   - Fetches with use_cache=False
   - Confirms fresh data is fetched (cache bypassed)

Files changed:
- `tests/test_yahoo.py` — Added `TestYahooIntegration` class (3 new tests)
Acceptance criteria:

```bash
# Run new integration tests
pytest tests/test_yahoo.py::TestYahooIntegration -v

# Run all Yahoo tests (16 total)
pytest tests/test_yahoo.py -v

# Run all project tests (49 total)
pytest tests/ -v
```

All tests are 100% offline — no real network calls, all responses mocked.
This validates the code works correctly even when the Yahoo API returns 429 errors.

Risk: LOW — Pure test addition, no production code changes
Code added:

```python
class TestYahooIntegration:
    """End-to-end tests with mocked network."""

    @pytest.fixture
    def sample_response(self):
        fixture_path = Path(__file__).parent / "fixtures" / "yahoo_sample.json"
        with open(fixture_path) as f:
            return json.load(f)

    def test_fetch_yahoo_data_full_flow(self, sample_response):
        """Test full flow: fetch → parse → cache → return."""
        with tempfile.TemporaryDirectory() as tmpdir:
            with patch("src.yahoo._get_data_dir") as mock_dir:
                mock_dir.return_value = Path(tmpdir)
                # Mock urlopen to return the sample responses
                with patch("src.yahoo.urlopen") as mock_urlopen:
                    # First call: crumb
                    crumb_response = Mock()
                    crumb_response.read.return_value = b"test_crumb"
                    # Second call: chart data
                    chart_response = Mock()
                    chart_response.read.return_value = json.dumps(sample_response).encode()
                    mock_urlopen.return_value.__enter__.side_effect = [
                        crumb_response,
                        chart_response,
                    ]
                    # Fetch data
                    prices = fetch_yahoo_data(
                        "SPY",
                        start_date=datetime(2024, 1, 1),
                        end_date=datetime(2024, 1, 5),
                        use_cache=True,
                    )
                    # Verify results
                    assert len(prices) == 5
                    assert prices.iloc[0] == 385.5
                    assert prices.iloc[-1] == 390.5
                    # Verify cache was created
                    cache_file = Path(tmpdir) / "SPY_1d_persistent.csv"
                    assert cache_file.exists()
```

Acceptance criteria:

```bash
pytest tests/test_yahoo.py::TestYahooIntegration::test_fetch_yahoo_data_full_flow -v
# Must pass offline, no network calls
```
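The `yahoo_sample.json` fixture needs to mirror Yahoo's v8 chart payload. A minimal sketch is below — the field layout (`chart.result[0].timestamp`, `indicators.quote[0].close`) is an assumption about the v8 chart endpoint and should be checked against a real captured response, and `parse_chart` is a hypothetical stand-in for the project's actual parser:

```python
import pandas as pd

# Minimal fixture mirroring the assumed v8 chart payload shape
sample_response = {
    "chart": {
        "result": [{
            "timestamp": [1704153600, 1704240000, 1704326400],  # epoch seconds
            "indicators": {"quote": [{"close": [385.5, 387.2, 390.5]}]},
        }],
        "error": None,
    }
}

def parse_chart(payload: dict) -> pd.Series:
    """Turn a chart payload into a datetime-indexed close-price Series."""
    result = payload["chart"]["result"][0]
    index = pd.to_datetime(result["timestamp"], unit="s")
    closes = result["indicators"]["quote"][0]["close"]
    return pd.Series(closes, index=index, name="close")

prices = parse_chart(sample_response)
```

Keeping the fixture this small makes the expected assertion values (385.5, 390.5) easy to trace back to the JSON.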
## PR #3: Remove Dead Cache Code

Problem: Legacy cache functions (_load_from_cache, _save_to_cache, _get_cache_path) exist but are never called.
They confuse developers about which cache system to use.

Files to edit:
- `src/yahoo.py` (remove 3 functions)
- `tests/test_yahoo.py` (remove legacy cache tests)

Delete from src/yahoo.py:
- `_get_cache_path()` (lines 48-55)
- `_load_from_cache()` (lines 166-177)
- `_save_to_cache()` (lines 180-189)

Delete from tests/test_yahoo.py:
- `test_save_and_load_cache()` in `TestYahooDataFetching`
Verify no references remain:

```bash
grep -r "_get_cache_path\|_load_from_cache\|_save_to_cache" src/ tests/
# Should return empty
```

Acceptance criteria:

```bash
# All tests still pass
pytest tests/test_yahoo.py -v

# Persistent cache tests still work
pytest tests/test_yahoo.py::TestPersistentCache -v

# Demo still works
python scripts/demo_latest.py --lookback 30
```

Risk: LOW — Dead code removal, verified by grep
## PR #4: Add Cache Corruption Handling

Problem: _load_persistent_cache() silently returns None on any exception.
Corrupted cache files cause full re-downloads with no warning.

Files to edit:
- `src/yahoo.py` (add logging and backup)
```python
import logging

logger = logging.getLogger(__name__)

def _load_persistent_cache(ticker: str, interval: str = "1d") -> pd.Series | None:
    """Load all cached data for a ticker from persistent cache."""
    cache_path = _get_persistent_cache_path(ticker, interval)
    if cache_path.exists():
        try:
            df = pd.read_csv(cache_path, parse_dates=["datetime"])
            df = df.sort_values("datetime")
            return pd.Series(df["close"].values, index=df["datetime"], name=ticker)
        except Exception as e:
            # Rename corrupted cache and log warning
            backup_path = cache_path.with_suffix(".csv.bak")
            cache_path.rename(backup_path)
            logger.warning(
                f"Corrupted cache for {ticker}: {e}. "
                f"Renamed to {backup_path}. Re-fetching..."
            )
            return None
    return None
```

Acceptance criteria:

```bash
# Corrupt cache file
echo "garbage" > data/SPY_1d_persistent.csv

# Run fetch (should log warning and recover)
python -c "from src.yahoo import fetch_yahoo_data; fetch_yahoo_data('SPY', lookback_days=5)"

# Verify backup was created
ls data/SPY_1d_persistent.csv.bak

# Verify new cache was created
head data/SPY_1d_persistent.csv
```

Risk: LOW — Defensive code addition, improves reliability
## PR #5: Improve Crumb Mechanism

Problem: The crumb is fetched from /v1/test/getcrumb without session cookies.
Requests may fail intermittently with 401/403 errors.

Files to edit:
- `src/yahoo.py` (test and potentially remove crumb)
Test whether the Yahoo API works without a crumb:

```python
# Remove crumb code temporarily and test
params = {
    "period1": start_ts,
    "period2": end_ts,
    "interval": interval,
    "events": "history",
}
# No crumb parameter
url = f"https://query1.finance.yahoo.com/v8/finance/chart/{ticker}?{urlencode(params)}"
```

- If it works without the crumb: remove the crumb-fetching code entirely.
- If it doesn't: add proper session cookie handling.

Acceptance criteria:

```bash
# Run 10 times, must succeed 10/10 times
for i in {1..10}; do
  python -c "from src.yahoo import fetch_yahoo_data; print(fetch_yahoo_data('SPY', lookback_days=5))"
done
```

Risk: MEDIUM — May require API testing, but reversible
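If the crumb turns out to be required, a commonly used workaround (an assumption about Yahoo's current behavior, not verified here) is to prime a cookie-backed opener before requesting the crumb. `get_crumb`, `make_session`, and the cookie URL below are illustrative:

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, Request, build_opener

# Assumed endpoints — verify against current Yahoo behavior before relying on them
COOKIE_URL = "https://fc.yahoo.com"
CRUMB_URL = "https://query1.finance.yahoo.com/v1/test/getcrumb"
HEADERS = {"User-Agent": "Mozilla/5.0"}

def make_session():
    """Build an opener that persists cookies across requests."""
    return build_opener(HTTPCookieProcessor(CookieJar()))

def get_crumb(opener) -> str:
    """Prime session cookies, then fetch the crumb with the same opener."""
    try:
        # This request may fail with an HTTP error but can still set cookies
        opener.open(Request(COOKIE_URL, headers=HEADERS), timeout=10)
    except Exception:
        pass
    with opener.open(Request(CRUMB_URL, headers=HEADERS), timeout=10) as resp:
        return resp.read().decode()
```

Because `get_crumb` takes the opener as a parameter, the PR #2 tests can keep running offline by passing a mocked opener instead of patching `urlopen`.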
## PR #6: Add Session Reuse

Problem: Every fetch_yahoo_data() call opens 2 new HTTP connections (crumb + data).
No connection pooling, which burns rate limits faster.

Files to edit:
- `src/yahoo.py` (add session parameter)
- `scripts/demo_latest.py` (create session once)
```python
# src/yahoo.py
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, OpenerDirector, build_opener

def fetch_yahoo_data(
    ticker: str,
    ...,
    session: OpenerDirector | None = None,
) -> pd.Series:
    """Fetch with optional session reuse."""
    # If no session is supplied, create a disposable opener
    if session is None:
        cookie_jar = CookieJar()
        session = build_opener(HTTPCookieProcessor(cookie_jar))
    # Use session.open() instead of urlopen()
    ...
```

```python
# scripts/demo_latest.py
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener

def run_demo(...):
    # Create the session once
    cookie_jar = CookieJar()
    session = build_opener(HTTPCookieProcessor(cookie_jar))
    # Reuse it for both fetches
    es_prices = fetch_yahoo_data("ES=F", ..., session=session)
    spy_prices = fetch_yahoo_data("SPY", ..., session=session)
```

Acceptance criteria:

```bash
# Run demo, verify only 1 TCP connection is established
python scripts/demo_latest.py --lookback 30
# Check logs for "Reusing session" message
```

Risk: MEDIUM — Affects network layer, but backward compatible (session is optional)
## PR #7: Vectorize V2 Signal

Problem: compute_spread_and_zscore() uses a manual loop: for i in range(window-1, len(es)).
This is 10-100x slower than pandas rolling operations on large datasets.

Files to edit:
- `src/signal.py` (replace loop with pandas)
- `tests/test_signal.py` (add performance test)
```python
# Replace the manual loop with pandas rolling operations
log_es = np.log(es)
log_spy = np.log(spy)

# Rolling covariance and variance give the rolling hedge ratio (beta)
rolling_cov = log_es.rolling(window).cov(log_spy, ddof=0)
rolling_var = log_spy.rolling(window).var(ddof=0)
beta = rolling_cov / rolling_var
beta = beta.fillna(0.0)  # Handle zero variance
spread = log_es - beta * log_spy
```

Acceptance criteria:

```bash
# Add benchmark tests
pytest tests/test_signal.py::test_v2_vectorized_matches_loop -v
pytest tests/test_signal.py::test_v2_performance -v
# Performance test: 10,000 bars must complete in <1 sec
```

Risk: LOW — Pure optimization, numerical equivalence tested
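The test_v2_vectorized_matches_loop check can be prototyped standalone on synthetic data; the loop below re-implements the assumed per-window OLS beta so the two paths can be compared directly:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n, window = 500, 20
idx = pd.date_range("2024-01-01", periods=n, freq="D")
log_es = pd.Series(np.cumsum(rng.normal(0, 0.01, n)), index=idx)
log_spy = pd.Series(np.cumsum(rng.normal(0, 0.01, n)), index=idx)

# Loop version (mirrors the assumed current implementation)
beta_loop = pd.Series(np.nan, index=idx)
for i in range(window - 1, n):
    y = log_es.iloc[i - window + 1 : i + 1]
    x = log_spy.iloc[i - window + 1 : i + 1]
    beta_loop.iloc[i] = np.cov(y, x, ddof=0)[0, 1] / np.var(x)

# Vectorized version from this PR
beta_vec = (
    log_es.rolling(window).cov(log_spy, ddof=0)
    / log_spy.rolling(window).var(ddof=0)
)

# The two must agree to floating-point tolerance after the warm-up period
match = np.allclose(beta_loop.iloc[window - 1 :], beta_vec.iloc[window - 1 :])
```

Matching ddof on both sides is the detail that usually breaks this comparison: pandas defaults to ddof=1 while np.var defaults to ddof=0.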
## PR #8: Add Smoke Test

Problem: No CI test verifies that the full demo pipeline works end-to-end.
Regressions can't be detected before deployment.

Files to edit:
- `tests/test_integration.py` (new file)
"""Integration tests for full pipeline."""
import tempfile
from pathlib import Path
from unittest.mock import patch
import pandas as pd
import pytest
from scripts.demo_latest import run_demo
class TestDemoIntegration:
def test_demo_latest_smoke(self):
"""Smoke test: demo runs without errors."""
with tempfile.TemporaryDirectory() as tmpdir:
# Mock data directory
with patch("src.yahoo.fetch_yahoo_data") as mock_fetch:
# Return fake data
dates = pd.date_range("2024-01-01", periods=30, freq="D")
mock_fetch.return_value = pd.Series(
range(100, 130), index=dates, name="TEST"
)
# Run demo
run_demo(lookback_days=30, signal_version="v2", show_plot=False)
# Verify results created
assert len(list(Path("results").glob("demo_equity_*.csv"))) > 0
assert len(list(Path("results").glob("demo_summary_*.csv"))) > 0pytest tests/test_integration.py::TestDemoIntegration::test_demo_latest_smoke -v
# Must complete in <5 seconds (offline)LOW — Pure test addition
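One pitfall worth checking before this test lands: `patch` replaces a name where it is looked up, not where it is defined. If demo_latest.py does `from src.yahoo import fetch_yahoo_data`, patching `src.yahoo.fetch_yahoo_data` will not intercept the demo's calls; the patch target would need to be `scripts.demo_latest.fetch_yahoo_data`. A minimal standalone illustration (the `lib`/`client` module names are stand-ins built on the fly):

```python
import sys
import types
from unittest.mock import patch

# Build a tiny "library" module and a "client" that from-imports it
lib = types.ModuleType("lib")
lib.fetch = lambda: "real"
sys.modules["lib"] = lib

client = types.ModuleType("client")
exec("from lib import fetch\ndef run(): return fetch()", client.__dict__)
sys.modules["client"] = client

# Patching the library module does NOT affect the client's bound name...
with patch("lib.fetch", return_value="mocked"):
    via_lib = client.run()      # still "real"

# ...patching the client's own reference does
with patch("client.fetch", return_value="mocked"):
    via_client = client.run()   # "mocked"
```

If the smoke test ever sees real network traffic despite the mock, this lookup rule is the first thing to audit.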
## PR #9: Unify Plotting API

Problem: The plotting API is confusing — plot_equity_curve(csv_path, out_path, title) — and different callers have different needs.

Files to edit:
- `src/plotting.py` (add deprecation)
```python
import warnings

def plot_equity_curve_from_csv(csv_path: str, out_path: str, title: str) -> None:
    """Plot equity curve from CSV file (new explicit name)."""
    # Current implementation
    ...

def plot_equity_curve(
    equity: pd.Series | str,
    output_file: str,
    title: str,
) -> None:
    """Plot equity curve from Series or CSV (unified interface)."""
    if isinstance(equity, str):
        # Delegate to the CSV function, nudging callers toward the explicit name
        warnings.warn(
            "Passing a CSV path to plot_equity_curve() is deprecated; "
            "use plot_equity_curve_from_csv() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        plot_equity_curve_from_csv(equity, output_file, title)
    else:
        # Plot Series directly
        fig, ax = plt.subplots(figsize=(12, 6))
        ax.plot(equity.index, equity.values, linewidth=2, color="#2E86AB")
        # ... styling ...
        plt.savefig(output_file, dpi=150, bbox_inches="tight")
        plt.close(fig)
```

Acceptance criteria:

```python
# Test both paths
from src.plotting import plot_equity_curve
import pandas as pd

# Path 1: Series
equity = pd.Series([0, 0.01, 0.02], index=pd.date_range("2024-01-01", periods=3))
plot_equity_curve(equity, "test_series.png", "Test")

# Path 2: CSV
plot_equity_curve("results/test_equity_curve_v2.csv", "test_csv.png", "Test")

# Both must create PNG files
```

Risk: LOW — Backward compatible extension
## PR #10: Add Edge Case Tests

Problem: Signal functions lack edge case coverage:
- Zero variance data
- All-NaN inputs
- Single-bar window
- Misaligned timestamps

Files to edit:
- `tests/test_signal.py` (add test class)
```python
class TestSignalEdgeCases:
    """Edge case tests for signal generation."""

    def test_v1_zero_variance(self):
        """Test V1 with constant prices (zero variance)."""
        dates = pd.date_range("2024-01-01", periods=100, freq="D")
        es = pd.Series([100.0] * 100, index=dates)
        spy = pd.Series([50.0] * 100, index=dates)
        z = compute_zscore(es, spy, window=20)
        # All z-scores should be 0 (not NaN or inf)
        assert (z[19:] == 0.0).all()

    def test_v2_zero_variance(self):
        """Test V2 with constant prices."""
        # Similar test for V2
        ...

    def test_v1_single_datapoint_after_warmup(self):
        """Test V1 with minimal data (window + 1 points)."""
        dates = pd.date_range("2024-01-01", periods=21, freq="D")
        es = pd.Series(range(100, 121), index=dates)
        spy = pd.Series(range(50, 71), index=dates)
        z = compute_zscore(es, spy, window=20)
        # Only the last point should be valid
        assert pd.isna(z[:19]).all()
        assert not pd.isna(z[19])

    def test_misaligned_timestamps(self):
        """Test automatic alignment when ES/SPY have different timestamps."""
        es_dates = pd.date_range("2024-01-01", periods=100, freq="D")
        spy_dates = pd.date_range("2024-01-02", periods=100, freq="D")  # 1 day offset
        es = pd.Series(range(100, 200), index=es_dates)
        spy = pd.Series(range(50, 150), index=spy_dates)
        z = compute_zscore(es, spy, window=20)
        # Should have 99 aligned bars (inner join)
        assert len(z) == 99
```

Acceptance criteria:

```bash
pytest tests/test_signal.py::TestSignalEdgeCases -v
# Must pass all 8 edge case tests
```

Risk: LOW — Pure test addition, validates existing behavior
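For the zero-variance case, the convention the tests assume (z = 0 when the window is flat) can be sketched as a standalone helper; `rolling_zscore` is illustrative and not the project's compute_zscore, which takes two price series:

```python
import numpy as np
import pandas as pd

def rolling_zscore(spread: pd.Series, window: int) -> pd.Series:
    """Rolling z-score that yields 0 (not NaN or inf) on zero-variance windows."""
    mean = spread.rolling(window).mean()
    std = spread.rolling(window).std(ddof=0)
    # Dividing by NaN instead of 0 avoids inf values
    z = (spread - mean) / std.replace(0.0, np.nan)
    # A flat window means spread == mean, so define z = 0 there
    z[(std == 0.0) & mean.notna()] = 0.0
    return z

dates = pd.date_range("2024-01-01", periods=40, freq="D")
flat = pd.Series(1.0, index=dates)  # constant spread → zero variance
z = rolling_zscore(flat, window=20)
```

The warm-up region stays NaN while every full flat window maps to exactly 0.0, matching the `(z[19:] == 0.0).all()` assertion above.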
## Execution Timeline

Week 1: Critical Bugs & Dead Code
1. PR #1: Fix Plotting Crash (1 hour)
2. PR #2: Add Yahoo Integration Test (2 hours)
3. PR #3: Remove Dead Cache Code (1 hour)
4. PR #4: Add Cache Corruption Handling (1 hour)

Week 2: Performance & Quality
5. PR #5: Improve Crumb Mechanism (2 hours)
6. PR #6: Add Session Reuse (2 hours)
7. PR #7: Vectorize V2 Signal (2 hours)
8. PR #8: Add Smoke Test (1 hour)

Week 3: Polish
9. PR #9: Unify Plotting API (1 hour)
10. PR #10: Add Edge Case Tests (2 hours)
## Agent Prompt Template

For each PR, provide the agent with:

**Task:** [PR Title]
**Context:** See AI_CONTEXT.md for full project structure
**Files to Edit:**
- [list of files]
**Changes:**
[detailed changes from PR description]
**Acceptance Criteria:**
[bash commands from PR]
**Verification:**
After changes, run:
1. pytest tests/ -v
2. python scripts/demo_latest.py --lookback 30
3. Verify acceptance criteria pass

## Success Criteria

- ✅ All demos run without crashes
- ✅ 100% offline testability
- ✅ Zero dead code in src/
- ✅ Sub-second V2 signal computation
- ✅ Robust error handling with logging
- ✅ Clean, professional codebase