Feat/enhanced risk resilience telemetry v2 #205
Open
VirilePeak wants to merge 24 commits into Polymarket:main from
added 15 commits
February 17, 2026 18:22
- RiskConfig: ENV-based configuration with validation
- PortfolioState: Equity, exposure, PnL tracking
- Position: Simplified position representation
- RiskManager: Core risk checks (sizing, limits, stops)
- RiskBlockReason: Enum for telemetry
- Comprehensive unit tests for all components

Risk Rules Implemented:
- max_risk_pct_per_trade (default 2%)
- max_total_exposure_pct (default 15%)
- daily_loss_limit_pct (default 5%)
- max_concurrent_positions (default 5)
- max_slippage_bps for stop loss (default 100)
- max_spread_bps for entry filter (default 200)

Feature flags:
- RISK_ENABLED=1/0 to toggle all checks

Tests: 20+ test cases covering sizing, blocking, exits, telemetry
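To illustrate how the per-trade, exposure, and position-count limits listed above could interact when sizing an order, here is a minimal sketch. The function name `size_position`, the field layout of `RiskConfig`, and the block-reason strings are assumptions made for illustration; the PR's actual `RiskManager` API is not shown on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskConfig:
    # Defaults mirror the percentages quoted in the commit message.
    max_risk_pct_per_trade: float = 0.02
    max_total_exposure_pct: float = 0.15
    max_concurrent_positions: int = 5


def size_position(equity, current_exposure, open_positions, requested,
                  cfg=RiskConfig()):
    """Return (allowed_size, block_reason); block_reason is None if allowed.

    Hypothetical helper: the reason strings are illustrative only.
    """
    if open_positions >= cfg.max_concurrent_positions:
        return 0.0, "max_positions"
    # Largest size a single trade may take.
    per_trade_cap = equity * cfg.max_risk_pct_per_trade
    # Remaining room under the portfolio-wide exposure limit.
    exposure_room = equity * cfg.max_total_exposure_pct - current_exposure
    if exposure_room <= 0:
        return 0.0, "max_exposure"
    # The order is clipped to the tightest of the three constraints.
    return min(requested, per_trade_cap, exposure_room), None
```

With 1000 of equity and no open exposure, a requested size of 100 would be clipped to the 2% per-trade cap under these assumptions.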
… Step 2)

- Enhanced Trader class with risk integration
- Portfolio state tracking (equity, exposure, positions)
- Risk-based position sizing before execution
- Pre-trade risk checks (spread, exposure, daily loss)
- Position maintenance with exit signals
- Daily stats reset for tracking

Key changes:
- _get_portfolio_state(): Builds portfolio snapshot
- _check_new_trading_day(): Resets daily limits
- Risk checks before every trade execution
- Exit signal detection in maintain_positions()

Note: Actual execution commented out (TOS compliance)
Requires: get_open_positions() in Polymarket class
Circuit Breaker:
- 3 states: CLOSED, OPEN, HALF_OPEN
- Configurable failure thresholds per service
- Automatic recovery with half-open testing
- Metrics tracking (state changes, blocked calls)
- Per-service configs: Polymarket (fast), Gamma (medium), OpenAI (slow)

Retry Handler:
- Exponential backoff with jitter
- Configurable retryable exceptions
- Decorator for easy function wrapping
- Pre-configured handlers for each API

Features:
- Thread-safe implementation
- Global registry for circuit breakers
- Detailed metrics for observability
- Force reset for manual recovery
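The three-state behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: the class shape, the `clock` parameter (injected so the recovery timeout can be tested), and `CircuitOpenError` are all assumed names, and the sketch does not limit the number of concurrent trial calls in the HALF_OPEN state.

```python
import threading
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, recovery_timeout_s=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout_s = recovery_timeout_s
        self._clock = clock
        self._lock = threading.Lock()
        self.state = "CLOSED"
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        with self._lock:
            if self.state == "OPEN":
                # Reject immediately until the recovery timeout has elapsed.
                if self._clock() - self._opened_at < self.recovery_timeout_s:
                    raise CircuitOpenError("circuit open")
                self.state = "HALF_OPEN"  # allow a trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            with self._lock:
                self._failures += 1
                # A failed trial call, or too many failures, opens the circuit.
                if (self.state == "HALF_OPEN"
                        or self._failures >= self.failure_threshold):
                    self.state = "OPEN"
                    self._opened_at = self._clock()
            raise
        with self._lock:
            # Any success closes the circuit and clears the failure count.
            self._failures = 0
            self.state = "CLOSED"
        return result
```

The user function runs outside the lock so that a slow upstream call does not block other threads from checking the breaker state.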
Metrics Collection:
- TradeMetrics: Per-trade tracking (status, latency, PnL)
- CycleMetrics: End-to-end cycle timing
- Counter system with labels
- Latency histograms (Prometheus-style)
- Block reason tracking

HTTP Server:
- /metrics - Prometheus text format
- /metrics/json - JSON format
- /health - Health check
- Background thread, non-blocking

Features:
- Thread-safe implementation
- Configurable history limits
- Stage timing context manager
- Global singleton for easy access
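As a rough picture of what a labelled counter system rendered in Prometheus text format might look like, consider the sketch below. The `Counters` class and the metric name used in the usage note are hypothetical; only the general `name{label="value"} number` line shape follows the Prometheus exposition format.

```python
import threading
from collections import defaultdict


class Counters:
    """Thread-safe labelled counters (illustrative, not the PR's collector)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = defaultdict(float)

    def inc(self, name, amount=1.0, **labels):
        # Sort labels so the same label set always maps to the same key.
        key = (name, tuple(sorted(labels.items())))
        with self._lock:
            self._values[key] += amount

    def render(self):
        """Return all counters as Prometheus-style text lines."""
        with self._lock:
            items = sorted(self._values.items())
        lines = []
        for (name, labels), value in items:
            label_str = ",".join(f'{k}="{v}"' for k, v in labels)
            suffix = "{" + label_str + "}" if label_str else ""
            lines.append(f"{name}{suffix} {value:g}")
        return "\n".join(lines) + "\n"
```

Calling `inc("trades_blocked_total", reason="spread")` twice and then `render()` would yield a single line for that label set with a value of 2.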
Model Registry:
- ENV-based configuration (DEFAULT_MODEL, FALLBACK_MODEL)
- Pre-configured models: GPT-4, GPT-3.5, Claude-3
- Per-model timeouts and retry policies
- Rate limit tracking

LLM Client:
- Automatic fallback on 429/5xx errors
- Exponential backoff per model
- Provider abstraction (OpenAI, Anthropic)
- Detailed response metadata (latency, model used)

Usage:
- llm_call(messages) - simple API
- client.call() with full control
- Fallback chain: DEFAULT_MODEL -> FALLBACK_MODEL

Environment:
- DEFAULT_MODEL=gpt-4
- FALLBACK_MODEL=gpt-3.5-turbo
- OPENAI_API_KEY / ANTHROPIC_API_KEY
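The DEFAULT_MODEL -> FALLBACK_MODEL chain described above might look something like the following. Everything except the two environment variable names and their defaults is assumed: `LLMHTTPError`, `fallback_chain`, `call_with_fallback`, and the `send` callable (a stand-in for whatever provider client is used) are hypothetical and are not the PR's `llm_call()` implementation.

```python
import os


class LLMHTTPError(Exception):
    """Hypothetical error type carrying the provider's HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def fallback_chain():
    # Read the chain from the environment, using the defaults from the commit.
    return [
        os.environ.get("DEFAULT_MODEL", "gpt-4"),
        os.environ.get("FALLBACK_MODEL", "gpt-3.5-turbo"),
    ]


def should_fall_back(status):
    # Only rate limits (429) and server errors (5xx) move to the next model.
    return status == 429 or 500 <= status <= 599


def call_with_fallback(messages, send, models=None):
    """Try each model in order; return (model_used, response)."""
    last_error = None
    for model in (models or fallback_chain()):
        try:
            return model, send(model, messages)
        except LLMHTTPError as err:
            if not should_fall_back(err.status):
                raise  # e.g. a 400 is a caller bug, not a capacity problem
            last_error = err
    raise RuntimeError("all models in the fallback chain failed") from last_error
```

Returning the model name alongside the response is one way to surface the "model used" metadata the commit mentions.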
EnhancedExecutor:
- Retry/Backoff for all external calls (Polymarket, Gamma, OpenAI)
- Circuit breaker protection per service
- Model fallback via llm_call()
- Telemetry collection at critical points
- Risk gate before order submission

IntegratedTrader:
- Full A-D integration in one_best_trade()
- Stage-by-stage latency tracking
- Risk checks with telemetry
- Metrics server auto-start
- Backwards compatible interface

Features:
- All feature-flagged via ENV
- Detailed block reason logging
- Cycle metrics recording
- Position maintenance with exit signals
Circuit Breaker Tests:
- closed -> open transition on failures
- open rejects calls immediately
- open -> half_open after timeout
- half_open -> closed on success
- half_open -> open on failure

Retry Handler Tests:
- success without retry
- retry then success
- exhaust retries
- no retry on non-retryable exceptions

Risk Manager Tests:
- position sizing caps
- max exposure block
- daily loss limit block
- spread too wide block

Model Registry Tests:
- ENV loading
- model config retrieval

Integration Smoke Tests:
- metrics counter increments
- circuit breaker metrics
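The retry handler cases listed above (success without retry, retry then success, exhausted retries, no retry on non-retryable exceptions) can be pictured against a sketch like this one. The function name, its parameters, and the choice of "full jitter" (a uniform random delay up to the exponential cap) are assumptions; the PR does not state which jitter scheme it uses.

```python
import random
import time


def retry_with_backoff(fn, max_retries=3, base_delay_s=0.5, max_delay_s=8.0,
                       retryable=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff.

    `sleep` is injectable so tests can record delays instead of waiting.
    """
    attempt = 0
    while True:
        try:
            return fn()
        except retryable:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # The cap doubles on every attempt, bounded by max_delay_s.
            cap = min(max_delay_s, base_delay_s * 2 ** attempt)
            # Full jitter: pick a random delay between 0 and the cap.
            sleep(random.uniform(0, cap))
            attempt += 1
```

Exceptions outside `retryable` are not caught at all, so they propagate after a single attempt.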
- Quick start guide
- All ENV variables with defaults
- How to run (3 modes)
- How to verify (tests, metrics, circuit breaker)
- Architecture diagram
- File changes summary
- Monitoring guide
- Troubleshooting section
- Fixes NameError in test_metrics_collector_increment
- All 18 tests now passing
.gitignore:
- Python artifacts (__pycache__, *.pyc)
- Environment files (.env)
- Credentials (gdrive_credentials.json, oauth_credentials.json, etc.)
- Logs and local databases
- IDE files

CI Workflow:
- Run on push/PR to main and feat/* branches
- Python 3.12 setup
- Install dependencies
- Run pytest on test_integration.py
- Secret scanning check
PolymarketWSClient:
- Connects to wss://ws-subscriptions-clob.polymarket.com/ws/market
- Normalizes events: Quote, Trade, Orderbook
- In-memory state: latest_quote per market
- Auto-reconnect with exponential backoff
- Thread-safe implementation

Health Endpoint:
- /market-data/health - Overall health status
- /market-data/status - Detailed status + quotes
- Tracks: connected, last_message_age_s, subscriptions

Features:
- Feature-flagged: WS_ENABLED
- Falls back to HTTP if WS disabled
- Configurable reconnect interval
- Singleton for easy access
OrderBook:
- Level-2 bids/asks with PriceLevel
- Best bid/ask, spread, mid, microprice
- Depth within X bps (1bp, 5bp)
- Imbalance calculation (-1 to 1)
- Volatility proxy from book shape

LiquidityGate:
- max_spread_bps check
- min_depth_1bp check
- min_depth_5bp check
- max_book_age_s staleness check

OrderBookManager:
- Multi-market orderbook storage
- Snapshot + delta updates
- Trade history tracking
- VWAP calculation
- Liquidity check per market

Features:
- Feature-flagged: L2_ENABLED
- Thread-safe implementation
- Singleton for easy access
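The commit does not spell out the formulas behind the computed book features, so the sketch below uses common textbook definitions as an assumption: mid as the average of best bid and ask, spread in basis points relative to mid, microprice as the size-weighted top-of-book price, and imbalance as (bid depth - ask depth) / (bid depth + ask depth), which falls in the stated -1 to 1 range. `book_features` is a hypothetical helper; only `PriceLevel` is a name from the PR, and its fields here are guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceLevel:
    price: float
    size: float


def book_features(bids, asks):
    """Compute top-of-book features from bids (best first) and asks (best first).

    Returns None when either side is empty or has no size.
    """
    if not bids or not asks:
        return None
    best_bid, best_ask = bids[0], asks[0]
    top_size = best_bid.size + best_ask.size
    bid_depth = sum(level.size for level in bids)
    ask_depth = sum(level.size for level in asks)
    if top_size <= 0 or bid_depth + ask_depth <= 0:
        return None
    mid = (best_bid.price + best_ask.price) / 2
    return {
        "mid": mid,
        "spread_bps": (best_ask.price - best_bid.price) / mid * 10_000,
        # Each price is weighted by the opposite side's size, so the
        # microprice leans toward the ask when bid size dominates.
        "microprice": (best_bid.price * best_ask.size
                       + best_ask.price * best_bid.size) / top_size,
        "imbalance": (bid_depth - ask_depth) / (bid_depth + ask_depth),
    }
```

A gate such as `max_spread_bps` would then compare `spread_bps` from this result against its configured limit.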
…ck G)

ExecutionEngine:
- Pre-trade checks: risk gate + liquidity gate + staleness gate
- Order types: maker, taker, smart (maker -> taker fallback)
- Iceberg splitting for large orders
- Post-trade: slippage tracking, fill verification
- Retry logic with max_retries config

ExecutionConfig:
- order_type: maker/taker/smart
- max_slippage_bps, max_order_age_s
- iceberg_threshold, iceberg_parts
- Feature flags: enabled, verify_fills, retry_on_fail

ExecutionResult:
- success, filled_size, avg_price
- slippage_bps, fees, latency_ms
- retries, error tracking

Features:
- Feature-flagged: EXECUTION_ENABLED
- VWAP calculation for multi-part fills
- Slippage statistics tracking
- Singleton for easy access
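One way to read "iceberg splitting" together with "VWAP calculation for multi-part fills" is sketched below. The equal-sized split and both function names are assumptions; `iceberg_threshold` and `iceberg_parts` are borrowed from the ExecutionConfig fields listed above, but how the engine actually uses them is not shown.

```python
def split_iceberg(size, iceberg_threshold, iceberg_parts):
    """Split an order into equal child orders once it exceeds the threshold."""
    if size <= iceberg_threshold or iceberg_parts <= 1:
        return [size]  # small orders go out in one piece
    child = size / iceberg_parts
    return [child] * iceberg_parts


def vwap(fills):
    """Volume-weighted average price over (price, size) fills."""
    total_size = sum(size for _, size in fills)
    if total_size <= 0:
        raise ValueError("no filled size")
    return sum(price * size for price, size in fills) / total_size
```

For example, a 900-unit order with a threshold of 500 and three parts becomes three 300-unit child orders, and the average price reported for the parent is the VWAP of the child fills.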
Trader Integration:
- MarketDataEnhancedTrader with feature flags
- WS_ENABLED, L2_ENABLED, EXECUTION_ENABLED (default OFF)
- Automatic fallback to HTTP if WS unavailable
- Pre-trade gates: risk + liquidity + staleness
- Smart execution when EXECUTION_ENABLED=1

Tests (test_market_data.py):
- Block E: WS event parsing, health state
- Block F: Orderbook snapshot, delta, computed features
- Block G: Execution decisions (stale, spread, depth)

README Update:
- 5m-ready mode section
- ENV flags with conservative defaults
- How to run + verify
- Fallback behavior docs
- Add pytest.approx for floating point comparisons
- Fix LiquidityGate test spreads (tight vs wide)
- Fix RiskManager tests with relaxed spread limits
- Fix Execution tests with proper gate ordering
- Add missing import in test_market_data.py
- Fix health_server.py import path
added 9 commits
February 17, 2026 20:14
- Prevents catching SystemExit, KeyboardInterrupt
- Cleaner error handling
- Best practice for production code
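The commit title is not shown here, so this sketch assumes the change replaces bare `except:` clauses with `except Exception:`. In Python, `SystemExit` and `KeyboardInterrupt` derive from `BaseException` rather than `Exception`, which is why the narrower clause lets them through. `run_guarded` is a hypothetical helper used only to show the effect.

```python
def run_guarded(step):
    """Run one step, reporting ordinary errors but not swallowing exit signals."""
    try:
        step()
        return "ok"
    except Exception as err:  # narrower than a bare `except:`
        # Ordinary failures (ValueError, ConnectionError, ...) are handled here.
        # KeyboardInterrupt and SystemExit are not Exception subclasses,
        # so they propagate to the caller and can stop the process.
        return f"error: {err}"
```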
- one_best_trade() called itself on exception
- Could cause stack overflow
- Now re-raises exception instead
- Proper retry logic should be in caller
- get_execution_engine() was not thread-safe
- Could create multiple instances in race condition
- Now uses double-checked locking pattern
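For readers unfamiliar with the pattern, double-checked locking for a module-level singleton generally looks like the sketch below. The `ExecutionEngine` class here is an empty placeholder and the module-level variable names are assumptions; only `get_execution_engine()` is a name taken from the commit.

```python
import threading


class ExecutionEngine:
    """Empty placeholder standing in for the real engine class."""


_engine = None
_engine_lock = threading.Lock()


def get_execution_engine():
    global _engine
    # First check without the lock: the common case after initialization.
    if _engine is None:
        with _engine_lock:
            # Second check under the lock: another thread may have created
            # the instance while this one was waiting.
            if _engine is None:
                _engine = ExecutionEngine()
    return _engine
```

The unlocked first check keeps the hot path cheap, while the locked second check guarantees only one instance is ever constructed.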
- _latest_quotes could grow unbounded with many subscriptions
- Now limited to 1000 entries
- Auto-cleanup of unsubscribed markets when limit reached
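A bounded quote cache along the lines described above could be sketched as follows. The `QuoteCache` class, its method names, and in particular the fallback of evicting the oldest entry when every cached market is still subscribed are assumptions; the commit only states the 1000-entry limit and the cleanup of unsubscribed markets.

```python
import threading


class QuoteCache:
    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._lock = threading.Lock()
        self._quotes = {}  # market_id -> latest quote, in insertion order

    def update(self, market_id, quote, subscribed):
        """Store the latest quote, pruning when the cache is full."""
        with self._lock:
            is_new = market_id not in self._quotes
            if is_new and len(self._quotes) >= self.max_entries:
                # First drop quotes for markets no longer subscribed.
                for stale in [m for m in self._quotes if m not in subscribed]:
                    del self._quotes[stale]
                # Assumed fallback: if still full, evict the oldest entry.
                if len(self._quotes) >= self.max_entries:
                    del self._quotes[next(iter(self._quotes))]
            self._quotes[market_id] = quote

    def markets(self):
        with self._lock:
            return list(self._quotes)
```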
- portfolio.equity could be 0 or negative
- Would cause ZeroDivisionError
- Now checks equity before division and blocks trade
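The guard described above amounts to validating equity before it is used as a divisor. A minimal sketch, assuming an exposure-percentage check and hypothetical block-reason strings (the PR's actual `RiskBlockReason` members are not listed on this page):

```python
def check_exposure(exposure, equity, max_total_exposure_pct=0.15):
    """Return (allowed, block_reason) for a portfolio-level exposure check."""
    # Guard first: zero equity would raise ZeroDivisionError below, and
    # negative equity would produce a meaningless negative percentage.
    if equity <= 0:
        return False, "invalid_equity"
    exposure_pct = exposure / equity
    if exposure_pct >= max_total_exposure_pct:
        return False, "max_exposure"
    return True, None
```

Blocking the trade, rather than raising, keeps the trading cycle running and lets the reason be recorded in telemetry.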
- Add ping_interval and ping_timeout to prevent hanging
- Close event loop on thread exit to prevent resource leak
- _connected was set without lock (race with subscribe)
- _connected not reset on disconnect
- Now properly synchronized with _lock
No description provided.