meghac538/Dr_Zero_Base_Model

Transcript Dr. Zero

A self-evolving search agent for transcript analysis, built on the Dr. Zero framework and optimized for efficiency.

Features

  • DPO Training: up to 8x faster per iteration than the original GRPO setup
  • LoRA Fine-tuning: 4x less GPU memory (160GB vs. 640GB)
  • LLM Validation: Intelligent answer quality assessment
  • Curriculum Learning: Adaptive question generation
  • Early Stopping: Prevents overfitting
  • Synthetic Data: Test without real transcripts
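
For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, written in plain Python so it runs without a GPU. It is not taken from train_dpo.py. The function name, the argument names, and the default beta=0.1 are assumptions made for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full answer under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative default, not a value from this repo.
    """
    # How much more the policy favors each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(logits)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# When policy and reference agree, the loss is log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)
# Favoring the chosen answer lowers the loss.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(baseline, improved)
```

Because the loss needs only log-probabilities of answers that already exist, no sampling happens inside the training step. This is one commonly cited reason DPO tends to be cheaper per step than on-policy methods such as GRPO.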

Quick Start

# Setup
pip install -r requirements.txt

# Generate test data
python generate_synthetic_transcripts.py

# Build corpus
python build_corpus.py

# Run training (requires GPU)
python self_evolution.py

Requirements

  • Python 3.8+
  • 2x A100 GPUs (or equivalent)
  • 160GB total GPU memory (e.g., 2x 80GB)
  • ~200GB disk space

Project Structure

transcript_drzero/
├── build_corpus.py              # PDF extraction and indexing
├── retriever.py                 # Semantic search
├── proposer.py                  # Question generation
├── solver.py                    # Answer generation
├── llm_validator.py             # Answer validation
├── generate_preferences.py      # Training data creation
├── train_dpo.py                 # DPO training
├── self_evolution.py            # Main training loop
├── generate_synthetic_transcripts.py  # Test data
└── config.yaml                  # Configuration
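
The file names suggest a flow of question generation, answer generation, validation, and preference creation. The sketch below shows one plausible way validated answers could be turned into DPO preference pairs. It is an assumption about the data flow, not the code in generate_preferences.py. The names build_preference_pairs, sample_answers, score_answer, and min_gap are hypothetical.

```python
from typing import Callable, Dict, List

def build_preference_pairs(
    questions: List[str],
    sample_answers: Callable[[str], List[str]],   # stands in for the solver
    score_answer: Callable[[str, str], float],    # stands in for the validator
    min_gap: float = 0.2,                         # hypothetical threshold
) -> List[Dict[str, str]]:
    """Pair the best and worst scored answer for each question."""
    pairs = []
    for question in questions:
        answers = sample_answers(question)
        scored = sorted(
            ((score_answer(question, a), a) for a in answers),
            key=lambda item: item[0],
        )
        if len(scored) < 2:
            continue  # a pair needs at least two candidates
        (low_score, low_answer), (high_score, high_answer) = scored[0], scored[-1]
        # Skip questions where the validator cannot separate the answers.
        if high_score - low_score >= min_gap:
            pairs.append(
                {"prompt": question, "chosen": high_answer, "rejected": low_answer}
            )
    return pairs

# Toy stand-ins so the sketch runs without any model.
def toy_sampler(question):
    return [question + " -> answer A", question + " -> answer B"]

def toy_scorer(question, answer):
    if question == "q2":
        return 0.5  # validator is indifferent on q2
    return 0.9 if answer.endswith("A") else 0.1

pairs = build_preference_pairs(["q1", "q2"], toy_sampler, toy_scorer)
print(pairs)
```

The prompt/chosen/rejected record layout is the one commonly used by DPO training libraries. Whether this project stores its pairs the same way is not stated in the README.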

Configuration

Edit config.yaml to customize:

  • Model size and LoRA settings
  • Training hyperparameters
  • Question generation ratios
  • Evaluation questions
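
As a rough illustration of what a LoRA rank setting controls, the sketch below counts trainable parameters for one weight matrix with and without LoRA. The layer size 4096 and rank 16 are illustrative assumptions, not values read from config.yaml, and the function name is hypothetical.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA freezes the original matrix W and learns a low-rank update B @ A,
    where B is d_out x rank and A is rank x d_in.
    """
    return rank * (d_in + d_out)

# Illustrative numbers only; the real values depend on the model and config.
d_in, d_out, rank = 4096, 4096, 16
full = d_in * d_out
lora = lora_trainable_params(d_in, d_out, rank)
fraction = lora / full
print(full, lora, fraction)
```

Lower ranks shrink the number of trainable weights, and with them the optimizer state, at the cost of a less expressive update. The frozen base weights must still fit in GPU memory.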

Training

# Full 3-iteration training with early stopping
python self_evolution.py

Expected:

  • Time: 12-24 hours
  • Cost: $150-210 (AWS 2x A100)
  • Performance: 75-85% accuracy on hard questions
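
The early-stopping behavior mentioned above can be sketched as follows. This is a generic pattern, not the logic in self_evolution.py. The names run_with_early_stopping and evaluate_iteration and the min_delta and patience defaults are assumptions.

```python
def run_with_early_stopping(evaluate_iteration, max_iterations=3,
                            min_delta=0.01, patience=1):
    """Run up to max_iterations, stopping once accuracy stops improving.

    evaluate_iteration(i) is a hypothetical callback that trains
    iteration i and returns its evaluation accuracy in [0, 1].
    """
    best_accuracy = float("-inf")
    best_iteration = 0
    stale_rounds = 0
    history = []
    for iteration in range(1, max_iterations + 1):
        accuracy = evaluate_iteration(iteration)
        history.append(accuracy)
        if accuracy > best_accuracy + min_delta:
            best_accuracy, best_iteration = accuracy, iteration
            stale_rounds = 0
        else:
            stale_rounds += 1
            if stale_rounds >= patience:
                break  # keep the checkpoint from best_iteration
    return best_iteration, history

# Toy accuracy curve: improves, then regresses slightly.
toy_scores = {1: 0.60, 2: 0.70, 3: 0.69}
best_iteration, history = run_with_early_stopping(lambda i: toy_scores[i])
print(best_iteration, history)
```

With this toy curve all three iterations run, but the second one is reported as best, so its checkpoint would be the one to keep.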

Usage

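# SolverAgent is defined in solver.py
from solver import SolverAgent

# Load the iteration-3 solver checkpoint
solver = SolverAgent(model_name="./models/iter3/solver")

# solve() returns a dict; the answer text is under "final_answer"
result = solver.solve("What is the student's GPA?")
print(result["final_answer"])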

Performance

After 3 iterations:

  • Easy questions: 90-95%
  • Medium questions: 80-90%
  • Hard questions: 70-85%

Comparison to Original Dr. Zero

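| Aspect    | Original | This Implementation |
|-----------|----------|---------------------|
| Training  | GRPO     | DPO (8x faster)     |
| Model     | 70B      | 7B                  |
| GPUs      | 8x A100  | 2x A100             |
| Time/iter | 48h      | 6-8h                |
| Cost/iter | $1,500   | $200                |
| Memory    | 640GB    | 160GB               |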
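
The ratios implied by the table can be checked with a few lines of arithmetic. The sketch below uses only the figures listed above and measures nothing. The dictionary keys are hypothetical labels.

```python
# Figures copied from the comparison table; none of them are measured here.
original = {"gpus": 8, "hours_per_iter": 48, "cost_per_iter": 1500, "memory_gb": 640}
this_impl = {"gpus": 2, "hours_per_iter": (6, 8), "cost_per_iter": 200, "memory_gb": 160}

fast_hours, slow_hours = this_impl["hours_per_iter"]
speedup_low = original["hours_per_iter"] / slow_hours    # 48h vs. 8h
speedup_high = original["hours_per_iter"] / fast_hours   # 48h vs. 6h
cost_ratio = original["cost_per_iter"] / this_impl["cost_per_iter"]
memory_ratio = original["memory_gb"] / this_impl["memory_gb"]
gpu_ratio = original["gpus"] / this_impl["gpus"]

print(f"speedup per iteration: {speedup_low:.1f}x to {speedup_high:.1f}x")
print(f"cost per iteration:    {cost_ratio:.1f}x lower")
print(f"GPU memory:            {memory_ratio:.1f}x lower")
print(f"GPU count:             {gpu_ratio:.1f}x lower")
```

These ratios compare whole setups (a 70B model trained with GRPO on 8 GPUs against a 7B model trained with DPO and LoRA on 2 GPUs), so the gains reflect the combined changes and are not attributable to any single one.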

License

Non-commercial use only (following Dr. Zero license)
