Version: 2.0 (Modular Package)
Date: 2025-11-02
Migration Status: Weeks 1-5 Complete
Module: mega_backtestAI1h.backtesting.vectorbt_engine
High-performance vectorized backtesting engine using VectorBT library.
class VectorBTEngine:
    """Vectorized backtesting engine using VectorBT.

    ~100x faster than loop-based backtesting through NumPy vectorization.
    """

    def __init__(
        self,
        initial_capital: float = 10000.0,
        fees: float = 0.001,
        slippage: float = 0.001
    ):
        """Initialize VectorBT backtesting engine.

        Args:
            initial_capital: Starting capital in dollars
            fees: Transaction fees as decimal (0.001 = 0.1%)
            slippage: Slippage as decimal (0.001 = 0.1%)
        """

Calculate all technical indicators for strategy development.
Parameters:
- data (pd.DataFrame): OHLCV market data with DatetimeIndex
Returns:
- Dict[str, np.ndarray]: Dictionary of indicator arrays
Available Indicators:
- RSI (14, 21 periods)
- Bollinger Bands (20, 2.0 std)
- MACD (12, 26, 9)
- ATR (14)
- Volume SMA (20)
- Price SMA (20, 50)
- EMA (12, 26)
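As an illustration of what one of these indicators involves, here is a minimal sketch of the Bollinger Bands (20, 2.0 std) calculation using only the standard library. It is not the engine's implementation: the function name `bollinger_bands`, the keys `bb_middle` and `bb_lower`, the use of plain lists with `None` during warm-up (the engine returns NumPy arrays), and the choice of population standard deviation are all assumptions made for this example.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional


def bollinger_bands(
    closes: List[float], period: int = 20, num_std: float = 2.0
) -> Dict[str, List[Optional[float]]]:
    """Rolling mean plus/minus num_std rolling standard deviations."""
    bands: Dict[str, List[Optional[float]]] = {
        "bb_middle": [], "bb_upper": [], "bb_lower": []
    }
    for i in range(len(closes)):
        if i + 1 < period:
            # Warm-up: not enough bars for a full window yet.
            for key in bands:
                bands[key].append(None)
            continue
        window = closes[i + 1 - period : i + 1]
        middle = mean(window)
        width = num_std * pstdev(window)  # assumption: population std (ddof=0)
        bands["bb_middle"].append(middle)
        bands["bb_upper"].append(middle + width)
        bands["bb_lower"].append(middle - width)
    return bands
```

A vectorized engine would typically compute the same quantities with rolling-window operations instead of an explicit loop; the loop form is used here only to keep each step visible.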
Example:
engine = VectorBTEngine()
indicators = engine.calculate_indicators(data)
rsi_14 = indicators['rsi_14']
bb_upper = indicators['bb_upper']

Execute vectorized backtest with given signals and parameters.
Parameters:
- data (pd.DataFrame): OHLCV market data
- entries (np.ndarray): Boolean array of entry signals
- exits (np.ndarray): Boolean array of exit signals
- params (Dict): Strategy parameters (must include 'position_size')
Returns:
- Dict[str, Any]: Backtest results with metrics and score
Result Structure:
{
    'score': float,  # 100*Sharpe + Sortino + Calmar
    'metrics': {
        'total_return': float,  # Total return (%)
        'annual_return': float,  # Annualized return (%)
        'sharpe_ratio': float,  # Risk-adjusted return
        'sortino_ratio': float,  # Downside risk-adjusted
        'calmar_ratio': float,  # Return/max drawdown
        'max_drawdown': float,  # Maximum drawdown (%)
        'volatility': float,  # Annual volatility (%)
        'n_trades': int,  # Number of trades
        'win_rate': float,  # Win rate (0-1)
        'profit_factor': float  # Gross profit / gross loss
    },
    'params': Dict,  # Parameters used
    'trades': pd.DataFrame  # Trade log (if available)
}
Example:
# Generate signals
ma_fast = data['Close'].rolling(10).mean()
ma_slow = data['Close'].rolling(30).mean()
entries = (ma_fast > ma_slow) & (ma_fast.shift(1) <= ma_slow.shift(1))
exits = (ma_fast < ma_slow) & (ma_fast.shift(1) >= ma_slow.shift(1))
# Run backtest
params = {'position_size': 0.1} # 10% of capital per trade
result = engine.run(data, entries.values, exits.values, params)
print(f"Score: {result['score']:.2f}")
print(f"Sharpe Ratio: {result['metrics']['sharpe_ratio']:.2f}")
print(f"Total Return: {result['metrics']['total_return']:.2%}")

- Speed: 25,000-63,000 bars/second (after JIT compilation)
- Memory: ~7.5 MB for 10,000 bars
- Scalability: Near-linear scaling with dataset size
- First Run: 15-20s overhead for Numba JIT compilation
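The `score` field in the result structure is documented as `100*Sharpe + Sortino + Calmar`, and the error-code list gives `-999` as the value for a complete failure such as no trades. The sketch below shows how those two rules could fit together. The function name `composite_score` and the decision to return the sentinel when `n_trades` is zero are assumptions for illustration, not a copy of the engine's code.

```python
from typing import Dict

FAILURE_SCORE = -999.0  # sentinel documented for complete failures


def composite_score(metrics: Dict[str, float]) -> float:
    """Combine the three ratios as 100*Sharpe + Sortino + Calmar."""
    if metrics.get("n_trades", 0) == 0:
        # Assumption: a run without trades is treated as a complete failure.
        return FAILURE_SCORE
    return (
        100.0 * metrics["sharpe_ratio"]
        + metrics["sortino_ratio"]
        + metrics["calmar_ratio"]
    )
```

Because Sharpe is multiplied by 100, it dominates the ranking; Sortino and Calmar mainly break ties between strategies with similar Sharpe ratios.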
Module: mega_backtestAI1h.optimization.sobol_optimizer
Quasi-random sampling optimizer using Sobol sequences for efficient parameter space exploration.
class SobolOptimizer:
    """Sobol sequence optimizer for deterministic quasi-random sampling."""

    def __init__(
        self,
        param_space: Dict[str, Tuple[float, float]],
        objective_fn: Callable,
        maximize: bool = True,
        seed: int = 42
    ):
        """Initialize Sobol optimizer.

        Args:
            param_space: Dict mapping parameter names to (min, max) tuples
            objective_fn: Function to optimize: fn(params) -> score
            maximize: True to maximize, False to minimize
            seed: Random seed for reproducibility
        """

Run Sobol sequence optimization.
Parameters:
- n_trials (int): Number of trials to run
Returns:
- List[Dict]: List of trial results with scores and parameters
Example:
def objective(params):
    # Your strategy evaluation
    result = backtest_strategy(data, params)
    return result['score']

param_space = {
    'position_size': (0.05, 0.2),
    'ma_fast': (5, 20),
    'ma_slow': (20, 50)
}
optimizer = SobolOptimizer(param_space, objective, maximize=True)
results = optimizer.optimize(n_trials=100)
best_params = optimizer.get_best_params()
best_score = optimizer.get_best_score()
print(f"Best: {best_params} → {best_score:.2f}")

get_best_params() -> Dict: Get the parameters that achieved the best score.
get_best_score() -> float: Get the best score achieved.

- Speed: ~12 trials/second
- Advantages: Deterministic, excellent space coverage, no hyperparameters
- Use Case: Global optimization, initial parameter search
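To make the idea of deterministic quasi-random sampling concrete, the sketch below maps low-discrepancy points in the unit cube onto a `param_space` of (min, max) tuples. Note the substitution: it uses a Halton sequence, not a Sobol sequence, because Halton can be written in a few lines of standard-library Python. The coverage idea is the same, but the actual points differ from what `SobolOptimizer` generates. The names `radical_inverse` and `halton_samples` are hypothetical.

```python
from typing import Dict, List, Tuple

PRIMES = [2, 3, 5, 7, 11, 13]  # one base per dimension


def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_samples(
    param_space: Dict[str, Tuple[float, float]], n_trials: int
) -> List[Dict[str, float]]:
    """Return n_trials parameter dicts spread evenly over param_space."""
    names = list(param_space)
    if len(names) > len(PRIMES):
        raise ValueError("this sketch supports at most 6 parameters")
    samples = []
    for i in range(1, n_trials + 1):  # skip index 0, which maps to the lower corner
        point = {}
        for name, base in zip(names, PRIMES):
            low, high = param_space[name]
            point[name] = low + radical_inverse(i, base) * (high - low)
        samples.append(point)
    return samples
```

Every call with the same inputs returns the same points, which is what makes runs reproducible. Integer parameters such as `ma_fast` would still need rounding before use.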
Module: mega_backtestAI1h.optimization.optuna_optimizer
Bayesian optimization using Tree-structured Parzen Estimator (TPE).
Note: Currently has API compatibility issues with latest Optuna version (see benchmarks).
class OptunaOptimizer:
    """Optuna-based Bayesian optimizer with TPE sampler."""

    def __init__(
        self,
        param_space: Dict[str, Tuple[float, float]],
        objective_fn: Callable,
        maximize: bool = True,
        seed: int = 42
    ):
        """Initialize Optuna optimizer."""

Similar to SobolOptimizer:
- optimize(n_trials: int) -> List[Dict]
- get_best_params() -> Dict
- get_best_score() -> float
Module: mega_backtestAI1h.optimization.genetic_optimizer
Evolutionary optimization using genetic algorithms.
Note: Parameter name is generations not n_generations (see benchmarks).
class GeneticOptimizer:
    """Genetic algorithm optimizer."""

    def __init__(
        self,
        param_space: Dict[str, Tuple[float, float]],
        objective_fn: Callable,
        maximize: bool = True,
        population_size: int = 50,
        generations: int = 20,  # Note: 'generations', not 'n_generations'
        mutation_rate: float = 0.1,
        seed: int = 42
    ):
        """Initialize genetic optimizer."""

- optimize() -> List[Dict]: Run evolution (no n_trials parameter)
- get_best_params() -> Dict
- get_best_score() -> float
Example:
optimizer = GeneticOptimizer(
    param_space,
    objective_fn,
    population_size=30,
    generations=10
)
results = optimizer.optimize()  # No n_trials argument

Module: mega_backtestAI1h.optimization.advanced
Find optimal weighting (alpha) between mathematical and LLM scores using cross-validation.
class AlphaCalibrator:
    """Alpha calibration using cross-validation."""

    def __init__(
        self,
        alpha_range: List[float] = [0.5, 0.6, 0.7, 0.8, 0.9],
        n_folds: int = 3,
        metric: str = 'sharpe'
    ):
        """Initialize alpha calibrator.

        Args:
            alpha_range: List of alpha values to test
            n_folds: Number of CV folds
            metric: Metric to optimize ('sharpe', 'sortino', 'calmar')
        """

Calibrate optimal alpha using cross-validation.
Parameters:
- candidates (List[Dict]): List of strategy candidates with scores
Candidate Structure:
{
    'score': float,  # Mathematical score
    'llm_score': float,  # LLM evaluation score (0-100)
    'sharpe': float,  # Sharpe ratio
    'sortino': float,  # Sortino ratio
    'calmar': float,  # Calmar ratio
    'is_valid': bool,  # Validation status
    'params': Dict  # Strategy parameters
}
Returns:
{
    'optimal_alpha': float,  # Best alpha value
    'best_score': float,  # Best out-of-sample score
    'cv_results': List[Dict],  # Results for each alpha
    'n_candidates': int,
    'n_folds': int,
    'metric': str
}
Example:
calibrator = AlphaCalibrator(n_folds=5)
result = calibrator.calibrate(candidates)
optimal_alpha = result['optimal_alpha']
print(f"Optimal alpha: {optimal_alpha}")
# Use optimal alpha for reranking
from mega_backtestAI1h.utils.scoring import combine_scores
combined = combine_scores(
    original_score=math_score,
    llm_score=llm_score,
    alpha=optimal_alpha,
    is_valid=True
)

- Speed: ~18ms for 50 candidates
- Memory: Minimal (< 1 MB)
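The calibration procedure itself is not spelled out here, so the following is only one plausible reading, offered as a sketch. It assumes that candidates are dealt into `n_folds` groups, that within each group the candidate ranked first under a given alpha is selected, and that the alpha whose selections have the best average `metric` wins. The function name `calibrate_alpha`, the fold assignment, and the blending rule (borrowed from the `combine_scores` formula, with the fold maximum standing in for `max(original_score)`) are all assumptions.

```python
from statistics import mean
from typing import Any, Dict, List


def calibrate_alpha(
    candidates: List[Dict[str, Any]],
    alpha_range: List[float],
    n_folds: int = 3,
    metric: str = "sharpe",
) -> Dict[str, Any]:
    """Pick the alpha whose top-ranked candidate per fold scores best on `metric`."""
    valid = [c for c in candidates if c.get("is_valid", True)]
    if len(valid) < n_folds:
        raise ValueError("need at least one valid candidate per fold")
    folds = [valid[i::n_folds] for i in range(n_folds)]  # round-robin assignment
    cv_results = []
    for alpha in alpha_range:
        fold_metrics = []
        for fold in folds:
            max_score = max(c["score"] for c in fold)

            def blended(c: Dict[str, Any]) -> float:
                llm_part = (c["llm_score"] / 100.0) * max_score
                return alpha * c["score"] + (1.0 - alpha) * llm_part

            fold_metrics.append(max(fold, key=blended)[metric])
        cv_results.append({"alpha": alpha, "mean_metric": mean(fold_metrics)})
    best = max(cv_results, key=lambda r: r["mean_metric"])
    return {
        "optimal_alpha": best["alpha"],
        "best_score": best["mean_metric"],
        "cv_results": cv_results,
        "n_candidates": len(valid),
        "n_folds": n_folds,
        "metric": metric,
    }
```

The returned keys mirror the documented result structure; the contents of each `cv_results` entry are illustrative.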
Module: mega_backtestAI1h.optimization.advanced
Detect temporal overfitting through chronological out-of-sample testing.
class WalkForwardValidator:
    """Walk-forward validation for temporal robustness."""

    def __init__(
        self,
        n_splits: int = 5,
        train_ratio: float = 0.7,
        optimization_metric: str = 'sharpe'
    ):
        """Initialize walk-forward validator.

        Args:
            n_splits: Number of chronological windows
            train_ratio: Fraction of each window for training (0-1)
            optimization_metric: Metric to optimize
        """

Run walk-forward validation.
Parameters:
- data (pd.DataFrame): Time-series data with DatetimeIndex
- strategy_func (Callable): Strategy function: fn(data, params) -> metrics_dict
- param_space (Dict): Parameter space for optimization
Returns:
{
    'enabled': bool,  # True if validation ran successfully
    'n_splits': int,
    'results': List[Dict],  # Per-split results
    'avg_degradation_pct': float,  # Average train-test degradation
    'std_degradation_pct': float,
    'temporal_consistency': float,  # 0-1, higher = more consistent
    'overfitting_detected': bool,  # True if > 30% degradation
    'reason': str  # Error message if enabled=False
}
Example:
def strategy(data, params):
    # Your strategy implementation
    result = backtest(data, params)
    return {
        'sharpe': result['metrics']['sharpe_ratio'],
        'sortino': result['metrics']['sortino_ratio'],
        'total_return': result['metrics']['total_return'],
        'max_drawdown': result['metrics']['max_drawdown'],
        'n_trades': result['metrics']['n_trades']
    }

validator = WalkForwardValidator(n_splits=5, train_ratio=0.7)
result = validator.validate(data, strategy, param_space)
if result['overfitting_detected']:
    print(f"WARNING: Overfitting detected! Degradation: {result['avg_degradation_pct']:.1f}%")
else:
    print(f"Temporal consistency: {result['temporal_consistency']:.2f}")

- Speed: ~1ms (with mock strategy)
- Use Case: Final validation before deployment
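The windowing and the 30% degradation rule described above can be sketched as follows. This assumes non-overlapping, equally sized chronological windows, each split into a leading training part and a trailing test part, and degradation measured relative to the training metric. The helper names and the handling of non-positive training metrics are assumptions; the validator may differ on any of these points.

```python
from statistics import mean
from typing import List, Tuple


def walk_forward_windows(
    n_rows: int, n_splits: int = 5, train_ratio: float = 0.7
) -> List[Tuple[range, range]]:
    """Return (train_rows, test_rows) index ranges in chronological order."""
    window = n_rows // n_splits
    splits = []
    for k in range(n_splits):
        start = k * window
        end = n_rows if k == n_splits - 1 else start + window  # last window takes the remainder
        cut = start + round((end - start) * train_ratio)
        if not start < cut < end:
            raise ValueError("window too small for the requested train_ratio")
        splits.append((range(start, cut), range(cut, end)))
    return splits


def degradation_pct(train_metric: float, test_metric: float) -> float:
    """Percentage drop from the in-sample to the out-of-sample metric."""
    if train_metric <= 0:
        return 0.0  # assumption: no meaningful baseline to degrade from
    return (train_metric - test_metric) / train_metric * 100.0


def overfitting_detected(
    pairs: List[Tuple[float, float]], threshold_pct: float = 30.0
) -> bool:
    """True when the average train-to-test degradation exceeds the threshold."""
    return mean(degradation_pct(tr, te) for tr, te in pairs) > threshold_pct
```

With a DataFrame, each range can be applied as `data.iloc[rows.start:rows.stop]`, so the test rows always come after the training rows of the same window.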
Module: mega_backtestAI1h.optimization.advanced
Test strategy robustness under extreme market conditions.
class AdversarialTester:
    """Adversarial stress testing with 5 scenarios."""

    def __init__(self, scenarios: List[Dict] = None):
        """Initialize adversarial tester.

        Args:
            scenarios: Custom stress scenarios (optional)
                Default: Flash Crash, Bear Market, Volatility Spike,
                Low Liquidity, Gap Risk
        """

Run adversarial stress tests.
Parameters:
- data (pd.DataFrame): Market data
- strategy_func (Callable): Strategy function
- params (Dict): Strategy parameters
Returns:
{
    'enabled': bool,
    'baseline': Dict,  # Metrics on original data
    'scenarios': List[Dict],  # Results per scenario
    'pass_rate': float,  # Fraction of scenarios passed (0-1)
    'stress_test_passed': bool  # True if pass_rate >= 60%
}
Stress Scenarios:
- Flash Crash: -10% intraday spike with recovery
- Bear Market: -30% sustained decline over 6 months
- Volatility Spike: 2x volatility increase
- Low Liquidity: 50% volume reduction
- Gap Risk: Random ±5% overnight gaps
Pass Criteria:
- Sharpe degradation < 50% of baseline
- Max drawdown increase < 20%
- Overall pass rate ≥ 60%
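The pass criteria above can be read as a per-scenario check followed by an aggregate check. The sketch below takes that reading. It assumes that `max_drawdown` is a positive percentage, that "increase < 20%" means fewer than 20 percentage points, and that a non-positive baseline Sharpe is handled by a simple comparison. The function names and the metric keys `sharpe` and `max_drawdown` are hypothetical.

```python
from typing import Dict, List


def scenario_passed(baseline: Dict[str, float], stressed: Dict[str, float]) -> bool:
    """Apply the two per-scenario criteria to one stressed run."""
    base_sharpe = baseline["sharpe"]
    if base_sharpe > 0:
        sharpe_drop = (base_sharpe - stressed["sharpe"]) / base_sharpe
    else:
        # Assumption: with no positive baseline, only a further decline counts as failure.
        sharpe_drop = 0.0 if stressed["sharpe"] >= base_sharpe else 1.0
    # Assumption: drawdowns are positive percentages, compared in percentage points.
    drawdown_increase = stressed["max_drawdown"] - baseline["max_drawdown"]
    return sharpe_drop < 0.5 and drawdown_increase < 20.0


def summarize(baseline: Dict[str, float], runs: List[Dict[str, float]]) -> Dict[str, object]:
    """Aggregate per-scenario outcomes into pass_rate and the overall verdict."""
    passed = [scenario_passed(baseline, run) for run in runs]
    pass_rate = sum(passed) / len(passed)
    return {"pass_rate": pass_rate, "stress_test_passed": pass_rate >= 0.6}
```

With the five default scenarios, the 60% threshold means a strategy may fail at most two of them and still pass overall.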
Example:
tester = AdversarialTester()
result = tester.test(data, strategy_func, best_params)
if result['stress_test_passed']:
    print(f"✓ Stress test passed! Pass rate: {result['pass_rate']:.1%}")
else:
    print(f"✗ Stress test failed. Pass rate: {result['pass_rate']:.1%}")
    for scenario in result['scenarios']:
        print(f"{scenario['name']}: {'PASS' if scenario.get('passed') else 'FAIL'}")

- Speed: ~6ms for 4 scenarios
- Memory: Minimal
Module: mega_backtestAI1h.optimization.advanced
Measure LLM reranking confidence through disagreement analysis.
Calculate confidence metrics from LLM scores.
Parameters:
- candidates (List[Dict]): Candidates with 'llm_score' field
Returns:
{
    'confidence': float,  # 0-1, higher = more confident
    'disagreement_score': float,  # Coefficient of variation
    'score_mean': float,
    'score_std': float,
    'n_candidates': int
}
Formula:
cv = std(llm_scores) / mean(llm_scores)
disagreement = min(1.0, cv)
confidence = 1.0 - disagreement
Example:
metrics = ConfidenceMetrics.calculate(candidates)
print(f"LLM Confidence: {metrics['confidence']:.2f}")
if metrics['confidence'] < 0.7:
    print("WARNING: Low LLM confidence. Consider manual review.")

- Speed: ~0.1ms for 100 candidates (instant)
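The formula above translates almost line for line into code. The sketch below does so with the standard library. It assumes the population standard deviation (the default of NumPy's `np.std`) and treats a non-positive mean as maximal disagreement; neither detail is stated in this reference. `llm_confidence` is a hypothetical name.

```python
from statistics import mean, pstdev
from typing import Any, Dict, List


def llm_confidence(candidates: List[Dict[str, Any]]) -> Dict[str, float]:
    """Confidence = 1 - min(1, coefficient of variation of the LLM scores)."""
    scores = [c["llm_score"] for c in candidates]
    if not scores:
        raise ValueError("at least one candidate is required")
    score_mean = mean(scores)
    score_std = pstdev(scores)  # assumption: population std (ddof=0)
    # Assumption: a non-positive mean makes the ratio meaningless, so cap at 1.0.
    cv = score_std / score_mean if score_mean > 0 else 1.0
    disagreement = min(1.0, cv)
    return {
        "confidence": 1.0 - disagreement,
        "disagreement_score": disagreement,
        "score_mean": score_mean,
        "score_std": score_std,
        "n_candidates": len(scores),
    }
```

Note what the formula rewards: identical scores give a confidence of 1.0, so a high value indicates that the LLM scored the candidates uniformly, not that its scores are accurate.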
Module: mega_backtestAI1h.llm.reranker
LLM-based strategy evaluation and reranking using IBM Granite models.
class LLMReranker:
    """LLM-based strategy evaluation."""

    def __init__(
        self,
        model_name: str = "ibm/granite-4.0-3b-instruct",
        cache_dir: str = ".llm_cache"
    ):
        """Initialize LLM reranker."""

Rerank strategies using LLM evaluation.
Parameters:
- candidates (List[Dict]): Strategy candidates to evaluate
- context (Dict): Market context and evaluation criteria
Returns:
- List[Dict]: Reranked candidates with LLM scores
Example:
reranker = LLMReranker()
context = {
    'market_regime': 'trending',
    'volatility': 'medium',
    'objective': 'risk-adjusted returns'
}
reranked = reranker.rerank(candidates, context)
for i, cand in enumerate(reranked[:5], 1):
    print(f"{i}. Score: {cand['combined_score']:.2f}, LLM: {cand['llm_score']:.1f}")

Module: mega_backtestAI1h.utils.scoring
Combine mathematical and LLM scores with calibrated alpha.
Formula:
if not is_valid:
    return original_score * 0.1  # 90% penalty
llm_normalized = llm_score / 100.0  # Scale to 0-1
combined = alpha * original_score + (1 - alpha) * llm_normalized * max(original_score)
return combined
Parameters:
- original_score (float): Mathematical score (e.g., 100*Sharpe + Sortino + Calmar)
- llm_score (float): LLM evaluation score (0-100)
- alpha (float): Weighting factor (0-1, typically 0.5-0.9)
- is_valid (bool): Strategy validation status
Returns:
- float: Combined score
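A worked version of the formula helps show the scale of each term. In the formula, `max(original_score)` is applied to what the parameter list describes as a single float; this sketch interprets it as the highest mathematical score across the candidate pool and passes it in explicitly as `max_original_score`. That parameter and the function name `combine_scores_sketch` are hypothetical and are not part of the documented `combine_scores` signature.

```python
def combine_scores_sketch(
    original_score: float,
    llm_score: float,
    alpha: float,
    is_valid: bool,
    max_original_score: float,
) -> float:
    """Blend a mathematical score with an LLM score rescaled to the same range."""
    if not is_valid:
        return original_score * 0.1  # 90% penalty for invalid strategies
    llm_normalized = llm_score / 100.0  # scale 0-100 down to 0-1
    # Rescale the LLM term so both terms share the units of the math score.
    return alpha * original_score + (1.0 - alpha) * llm_normalized * max_original_score


# Worked example: 0.7 * 450 + 0.3 * 0.85 * 500 = 315 + 127.5 = 442.5
example = combine_scores_sketch(450.0, 85.0, alpha=0.7, is_valid=True, max_original_score=500.0)
```

Without the rescaling by the pool maximum, an LLM term in the 0-1 range would be negligible next to mathematical scores in the hundreds.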
Example:
from mega_backtestAI1h.utils.scoring import combine_scores
combined = combine_scores(
    original_score=450.0,  # Math score
    llm_score=85.0,  # LLM score
    alpha=0.7,  # 70% math, 30% LLM
    is_valid=True
)
print(f"Combined Score: {combined:.2f}")

All components implement comprehensive error handling:
import logging
logger = logging.getLogger(__name__)

try:
    result = engine.run(data, entries, exits, params)
except ValueError as e:
    # Invalid input parameters
    logger.error(f"Invalid parameters: {e}")
except RuntimeError as e:
    # Backtest execution error
    logger.error(f"Execution failed: {e}")

Common Error Codes:
- score = -999: Complete failure (no trades, division by zero, etc.)
- enabled = False: Component disabled due to insufficient data/candidates
- is_valid = False: Strategy failed validation checks
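Because these failures are reported through return values instead of exceptions, callers need to check for them before ranking or aggregating results. The sketch below shows one way to do that. It assumes each trial result is a dict with a `score` key and optionally an `is_valid` key, and that component results carry `enabled` and `reason` as in the walk-forward return structure. The helper names are hypothetical.

```python
from typing import Any, Dict, List

FAILURE_SCORE = -999  # documented sentinel for a complete failure


def usable_trials(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop trials that failed outright or did not pass validation."""
    return [
        r for r in results
        if r.get("score", FAILURE_SCORE) != FAILURE_SCORE and r.get("is_valid", True)
    ]


def describe_component(result: Dict[str, Any]) -> str:
    """Summarize whether a validation component actually ran."""
    if not result.get("enabled", True):
        return f"skipped: {result.get('reason', 'no reason given')}"
    return "ran"
```

Filtering first keeps the `-999` sentinel from distorting averages or being mistaken for a poor but genuine score.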
All modules use Python's built-in logging:
import logging
# Set log level
logging.basicConfig(level=logging.INFO)
# Disable debug logs for production
logging.getLogger('mega_backtestAI1h').setLevel(logging.WARNING)

Log Levels:
- DEBUG: Detailed execution traces (position sizing, iterations)
- INFO: Key events (optimization complete, best scores)
- WARNING: Non-critical issues (overfitting detected, low confidence)
- ERROR: Failures (invalid data, API errors)
All APIs use Python type hints for IDE support:
from typing import Dict, List, Tuple, Callable, Any
import pandas as pd
import numpy as np
def run(
    self,
    data: pd.DataFrame,
    entries: np.ndarray,
    exits: np.ndarray,
    params: Dict[str, Any]
) -> Dict[str, Any]:
    ...

- v2.0 (2025-11-02): Modular package with Weeks 1-5 complete
  - VectorBT engine migration
  - Phase 1-4 optimization pipeline
  - Advanced techniques (Alpha Calibration, Walk-Forward, Adversarial)
  - Comprehensive test suite (138 tests)
  - Performance benchmarks
- v1.0: Original monolith (mega_backtestAI1h.py)
Repository: https://github.com/Ricko12vPL/Quantitative_Trading_System
Branch: feature/monolith-to-modular-migration
🤖 Generated with Claude Code
Co-Authored-By: Claude noreply@anthropic.com