This document summarizes features present in Python DSPy that are missing from the current Desiru implementation, based on a side-by-side analysis of the two codebases.
**ProgramOfThought**
- Generates executable code instead of a natural-language answer
- Critical for math/logic problems that require actual computation
- Requires a code execution environment to run the generated code
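To make the port target concrete, a minimal Python DSPy sketch (the model id is a placeholder; any LiteLLM-supported provider works):

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model id

# ProgramOfThought writes Python for the task, executes it, and returns
# the computed result rather than a free-form generated answer.
pot = dspy.ProgramOfThought("question -> answer")
pred = pot(question="What is the sum of the first 50 odd numbers?")
print(pred.answer)
```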
**MultiChainComparison**
- Runs multiple ChainOfThought instances
- Compares and selects best reasoning path
- Useful for complex reasoning tasks
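A sketch of the calling convention in Python DSPy (you sample the reasoning paths yourself, then hand them to the comparison module; attribute details have shifted between releases):

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model id
question = "A train covers 60 km in 45 minutes. What is its speed in km/h?"

# Sample several independent reasoning paths...
cot = dspy.ChainOfThought("question -> answer")
completions = [cot(question=question) for _ in range(3)]

# ...then have the comparison module read all of them and reconcile.
compare = dspy.MultiChainComparison("question -> answer", M=3)
best = compare(completions, question=question)
```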
**BestOfN**
- Samples N outputs from any module
- Selects best based on metric/scoring
- Simple but effective ensemble technique
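Recent DSPy releases expose this as dspy.BestOfN; a sketch with a toy reward function (all names are illustrative):

```python
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model id
qa = dspy.ChainOfThought("question -> answer")

def one_word_answer(args, pred):
    # Score 1.0 only when the answer is a single word.
    return 1.0 if len(pred.answer.split()) == 1 else 0.0

# Sample up to 3 completions, returning early if one meets the threshold.
best_of_3 = dspy.BestOfN(module=qa, N=3, reward_fn=one_word_answer, threshold=1.0)
pred = best_of_3(question="What is the capital of France?")
```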
**Refine**
- Iterative refinement of outputs
- Takes initial output and improves it
- Works with constraints and feedback
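In recent DSPy releases, dspy.Refine shares BestOfN's interface but feeds each attempt's score back as feedback to the next attempt (sketch; assumes an LM is configured as above):

```python
import dspy

qa = dspy.ChainOfThought("question -> answer")

def concise(args, pred):
    return 1.0 if len(pred.answer.split()) <= 10 else 0.0

# Failed attempts produce feedback that is injected into the next
# attempt's prompt, so the output is refined iteratively.
refine = dspy.Refine(module=qa, N=3, reward_fn=concise, threshold=1.0)
pred = refine(question="Summarize the plot of Hamlet.")
```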
**ChainOfThoughtWithHint**
- ChainOfThought variant that accepts guiding hints
- Provides additional context for reasoning
- Better control over reasoning direction
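In older DSPy releases this was dspy.ChainOfThoughtWithHint, with the hint supplied at call time (sketch; the class has been removed from some newer versions):

```python
import dspy

cot_hint = dspy.ChainOfThoughtWithHint("question -> answer")
pred = cot_hint(
    question="A bat and a ball cost $1.10; the bat costs $1.00 more. Ball price?",
    hint="Set up the equations first; the intuitive answer is wrong.",
)
```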
**MIPROv2**
- The most advanced DSPy optimizer
- Uses Bayesian optimization over candidate prompts
- Optimizes both instructions and demonstrations
- Typically outperforms BootstrapFewShot by a significant margin
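A minimal compile sketch with toy placeholders for the program, trainset, and metric (the auto="light" preset picks search budgets; exact compile kwargs vary by release):

```python
import dspy
from dspy.teleprompt import MIPROv2

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model id

program = dspy.ChainOfThought("question -> answer")
trainset = [
    dspy.Example(question="What is 7 * 8?", answer="56").with_inputs("question"),
    # ... more labeled examples
]

def metric(example, pred, trace=None):
    return example.answer.lower() in pred.answer.lower()

# Jointly searches candidate instructions and few-shot demos.
optimizer = MIPROv2(metric=metric, auto="light")
compiled = optimizer.compile(program, trainset=trainset)
```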
**COPRO**
- Coordinates multiple optimization strategies
- Collaborative approach to prompt engineering
- Handles complex multi-module programs
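Sketch, reusing the `program`/`trainset`/`metric` placeholders from the MIPROv2 example (breadth and depth control how many instruction candidates are proposed and how many refinement rounds run):

```python
from dspy.teleprompt import COPRO

# `program`, `trainset`, and `metric` as in the MIPROv2 sketch above.
copro = COPRO(metric=metric, breadth=5, depth=3)
compiled = copro.compile(program, trainset=trainset,
                         eval_kwargs={"num_threads": 4, "display_progress": True})
```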
**BootstrapFewShotWithRandomSearch**
- Enhanced version of BootstrapFewShot
- Adds hyperparameter random search
- Better exploration of optimization space
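Sketch with the same placeholders; each candidate program is a different random draw of bootstrapped demos, scored with the metric:

```python
from dspy.teleprompt import BootstrapFewShotWithRandomSearch

optimizer = BootstrapFewShotWithRandomSearch(
    metric=metric,              # as in the MIPROv2 sketch
    max_bootstrapped_demos=4,
    num_candidate_programs=8,   # breadth of the random search
)
compiled = optimizer.compile(program, trainset=trainset)
```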
**LabeledFewShot**
- Simple optimizer that uses the provided examples directly
- No bootstrapping, just uses given labels
- Good baseline optimizer
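Sketch (same placeholders):

```python
from dspy.teleprompt import LabeledFewShot

# No bootstrapping: k labeled examples are attached verbatim as demos.
compiled = LabeledFewShot(k=8).compile(program, trainset=trainset)
```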
**KNNFewShot**
- K-nearest-neighbor example selection
- Dynamic example selection based on input
- Better than static few-shot examples
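Sketch; the vectorizer argument has changed across releases (recent versions take a dspy.Embedder), so treat the exact signature as an assumption:

```python
import dspy
from dspy.teleprompt import KNNFewShot

embedder = dspy.Embedder("openai/text-embedding-3-small")  # placeholder model
optimizer = KNNFewShot(k=3, trainset=trainset, vectorizer=embedder)
compiled = optimizer.compile(student=program)

# At call time, the k training examples nearest to the incoming input
# are chosen as demos, instead of one fixed static set.
```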
**BootstrapFinetune**
- Generates training data for model finetuning
- Alternative to prompt optimization
- For when you can modify the model
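Hedged sketch (same placeholders; finetuning needs a provider/model that supports it, and the compile signature has varied across releases):

```python
from dspy.teleprompt import BootstrapFinetune

# Bootstraps successful traces with `metric`, then finetunes the
# underlying model on them instead of optimizing the prompt.
finetuner = BootstrapFinetune(metric=metric)
finetuned_program = finetuner.compile(program, trainset=trainset)
```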
**Ensemble**
- Combines multiple optimized programs
- Voting or weighted combination
- Improved robustness
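Sketch; `compiled_a/b/c` stand for separately optimized programs:

```python
import dspy
from dspy.teleprompt import Ensemble

# dspy.majority votes on the answer field across member predictions.
ensemble = Ensemble(reduce_fn=dspy.majority)
combined = ensemble.compile([compiled_a, compiled_b, compiled_c])
```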
**SignatureOptimizer**
- Optimizes the signature descriptions themselves
- Rewrites field descriptions for clarity
- Meta-optimization approach
**BayesianSignatureOptimizer**
- Bayesian approach to signature optimization
- More sophisticated than SignatureOptimizer
- Better exploration of description space
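Note that upstream later deprecated both names (SignatureOptimizer became COPRO, BayesianSignatureOptimizer became MIPRO), so a port may only need the modern pair. Legacy-style sketch; the compile signature here is an assumption, as it varied before the rename:

```python
from dspy.teleprompt import SignatureOptimizer  # deprecated alias of COPRO

opt = SignatureOptimizer(metric=metric, breadth=5, depth=2)
compiled = opt.compile(program, devset=trainset, eval_kwargs={})
```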
**Example and Prediction classes**
- Special data containers with convenience utilities
- Flexible field access (dot notation)
- Completion tracking for Predictions
- Integration with trace system
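Sketch of the core container behavior:

```python
import dspy

# Examples are dict-like records; with_inputs marks which fields are
# inputs (the remaining fields are treated as labels).
ex = dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question")
print(ex.question)           # dot-notation field access
print(ex.inputs().toDict())  # only the input fields

# Predictions are Examples returned by modules, carrying completions.
pred = dspy.Prediction(answer="4")
```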
**Typed predictors**
- Type-safe field handling
- Pydantic integration in Python
- Automatic validation and parsing
- Better IDE support
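In current DSPy, plain dspy.Predict validates Pydantic-typed output fields directly (older releases used a separate dspy.TypedPredictor); a sketch:

```python
import dspy
from pydantic import BaseModel

class Measurement(BaseModel):
    value: float
    unit: str

class Extract(dspy.Signature):
    """Extract the measurement mentioned in the text."""
    text: str = dspy.InputField()
    measurement: Measurement = dspy.OutputField()

extract = dspy.Predict(Extract)
pred = extract(text="The bridge spans 1.9 kilometers.")
# pred.measurement is a parsed, validated Measurement, not a raw string.
```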
**Suggestions (soft constraints)**
- Unlike Assertions (hard constraints), Suggestions never fail the program
- Guide optimization and retries without raising errors
- Applied during the compilation phase
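A sketch in the style of the DSPy 2.x assertions API (the module must be activated for assertions to take effect, and the API was reworked in later releases):

```python
import dspy

class ConciseQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.qa = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        pred = self.qa(question=question)
        # Soft constraint: guides retries and compilation, but does not
        # raise an error when the condition is unmet.
        dspy.Suggest(len(pred.answer.split()) <= 20, "Answer in 20 words or fewer.")
        return pred
```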
**Trace system**
- Detailed execution tracking
- Records all LLM calls and transformations
- Critical for optimization
- Enables debugging and analysis
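The trace surfaces most visibly in metric functions: during compilation, optimizers pass the recorded steps so a metric can judge intermediate calls, not just the final output. The standard three-argument metric shape:

```python
def validate_answer(example, pred, trace=None):
    correct = example.answer.strip().lower() == pred.answer.strip().lower()
    if trace is not None:
        # Compile time: `trace` holds the recorded (predictor, inputs,
        # outputs) steps; return a strict bool for bootstrapping.
        return correct
    return float(correct)  # evaluation time: a graded score
```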
**Compilation infrastructure**
- Full compilation pipeline
- Trace filtering and selection
- Demonstration ranking
- Parameter update mechanism
**Instruction generation**
- Some optimizers generate custom instructions
- Not just examples but rewritten prompts
- Adaptive to task requirements
**Data loaders**
- HuggingFace dataset integration
- CSV/JSON loaders with DSPy formatting
- Train/dev/test split utilities
- Batch processing support
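Sketch using dspy.datasets.DataLoader; the file path and field names are placeholders:

```python
from dspy.datasets import DataLoader

dl = DataLoader()
# Each row becomes a dspy.Example with `question` marked as the input.
dataset = dl.from_csv(
    "qa_pairs.csv",  # placeholder path
    fields=("question", "answer"),
    input_keys=("question",),
)
splits = dl.train_test_split(dataset, train_size=0.8)
trainset, devset = splits["train"], splits["test"]
```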
**LM provider abstraction**
- Unified interface for multiple providers
- Beyond just OpenAI (Anthropic, Cohere, etc.)
- Local model support (Ollama, etc.)
- Token counting and cost tracking
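Sketch; model ids are placeholders in LiteLLM's provider/model format:

```python
import dspy

# One interface for hosted providers...
dspy.configure(lm=dspy.LM("anthropic/claude-3-5-sonnet-20240620"))

# ...and for local backends such as Ollama.
local = dspy.LM("ollama_chat/llama3", api_base="http://localhost:11434", api_key="")
```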
**Evaluation metrics**
- F1, BLEU, and ROUGE scores
- LLM-as-Judge implementations
- Composite metric builders
- Batch evaluation utilities
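Batch evaluation sketch; `devset` and `program` are placeholders like those in the optimizer examples:

```python
from dspy.evaluate import Evaluate, answer_exact_match

# Runs `program` over the devset in parallel and aggregates the metric.
evaluate = Evaluate(devset=devset, metric=answer_exact_match,
                    num_threads=8, display_progress=True)
score = evaluate(program)
```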
**Streaming**
- Token-by-token streaming
- Progressive output display
- Useful for long generations
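Recent DSPy versions wrap a program with dspy.streamify, which turns it into an async generator of chunks; a hedged sketch:

```python
import asyncio
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model id
stream_qa = dspy.streamify(dspy.Predict("question -> answer"))

async def main():
    async for chunk in stream_qa(question="Explain backpropagation."):
        print(chunk)  # chunks arrive progressively as they are generated

asyncio.run(main())
```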
**Serialization**
- Save/load compiled programs
- Export optimized parameters
- Model versioning support
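Sketch; `compiled` is any optimized program, e.g. from the MIPROv2 example:

```python
import dspy

# Persist the optimized state (demos, instructions) as JSON...
compiled.save("optimized_qa.json")

# ...and restore it into a fresh program of the same shape.
program = dspy.ChainOfThought("question -> answer")
program.load("optimized_qa.json")
```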
**Settings and configuration**
- Global configuration system
- Provider-specific settings
- Experiment tracking
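Sketch of the global-plus-scoped pattern (model ids are placeholders):

```python
import dspy

# Global default...
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# ...with a scoped override that applies only inside the block.
with dspy.context(lm=dspy.LM("openai/gpt-4o")):
    pred = dspy.Predict("question -> answer")(question="A harder question.")
```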
**Caching**
- Request deduplication
- Semantic caching options
- Cache invalidation strategies
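Upstream caches identical requests by default; recent versions expose a per-LM toggle (treat the kwarg as an assumption for older releases):

```python
import dspy

cached = dspy.LM("openai/gpt-4o-mini")                 # cache on by default
uncached = dspy.LM("openai/gpt-4o-mini", cache=False)  # bypass the cache
```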
**Parallel execution**
- Batch processing optimizations
- Concurrent module execution
- Async compilation runs
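Evaluate already parallelizes via num_threads (see the metrics sketch); for ad-hoc fan-out, independent module calls can be dispatched from a plain thread pool. This is a generic pattern, not a dedicated DSPy API:

```python
from concurrent.futures import ThreadPoolExecutor

import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # placeholder model id
qa = dspy.Predict("question -> answer")
questions = ["What is DSPy?", "What is Desiru?", "What is a teleprompter?"]

# Calls are independent, so threads overlap the network-bound requests.
with ThreadPoolExecutor(max_workers=8) as pool:
    preds = list(pool.map(lambda q: qa(question=q), questions))
```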
**ColBERTv2 integration**
- Late-interaction retrieval model
- Typically stronger than basic single-vector search
- Purpose-built for retrieval tasks
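Sketch against the public demo endpoint from the DSPy docs (Wikipedia 2017 abstracts):

```python
import dspy

colbert = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")
dspy.configure(rm=colbert)

# dspy.Retrieve consults the configured retrieval model.
retrieve = dspy.Retrieve(k=3)
passages = retrieve("What castle did David Gregory inherit?").passages
```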
**Observability and debugging**
- Detailed trace visualization
- Cost tracking and reporting
- Performance profiling
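The simplest built-in hook is inspect_history, which pretty-prints recent raw LM calls:

```python
import dspy

# After running a program: show the last 3 prompts and completions.
dspy.inspect_history(n=3)
```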
**Implementation priority** (highest first)
- Example/Prediction classes
- Trace collection system
- MIPROv2 optimizer
- ProgramOfThought module
- Compilation infrastructure
- MultiChainComparison
- BestOfN module
- Typed predictors
- Additional optimizers (COPRO, KNNFewShot)
- Data loaders
- Advanced metrics
- Streaming support
- ColBERTv2 integration
- Ensemble optimizer
- Signature optimizers