Commit 62d90ce

Merge pull request #10 from anand-testcompare/feature/include-zed-settings
feature/include zed settings + move logging to mlflow + make pretty
2 parents 8dabe32 + bf4b627 commit 62d90ce

26 files changed, +1583 -602 lines changed

.env.example

Lines changed: 13 additions & 5 deletions
@@ -7,7 +7,7 @@
 # OpenRouter Configuration (when DSPY_PROVIDER=openrouter)
 # ============================================================================
 OPENROUTER_API_KEY=your_openrouter_api_key_here
-# DSPY_MODEL_NAME=nvidia/nemotron-3-nano-30b-a3b:free
+DSPY_MODEL_NAME=nvidia/nemotron-3-nano-30b-a3b:free
 
 # Optional headers for OpenRouter
 # DSPY_HTTP_HEADERS={"HTTP-Referer":"https://yourdomain.com","X-Title":"My App"}
@@ -21,9 +21,17 @@ OPENROUTER_API_KEY=your_openrouter_api_key_here
 # DSPY_MODEL_NAME=local-model
 
 # ============================================================================
-# Logging / Observability
+# Training Options
+# ============================================================================
+# DSPY_RUN_ID=run-001 # auto-generated if not set
+
+# ============================================================================
+# Artifact Options
 # ============================================================================
-# DSPY_LOG_LEVEL=INFO
-# DSPY_LOG_FORMAT=pretty
-# DSPY_RUN_ID=run-001
 # DSPY_ARTIFACT_AUTO_UPDATE=false
+
+# ============================================================================
+# MLflow (experiment tracking)
+# ============================================================================
+# MLflow uses SQLite by default: mlflow/mlflow.db
+# No additional configuration required - runs automatically during training
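
Reviewer sketch (not part of the commit): the `.env.example` comment says `DSPY_RUN_ID` is auto-generated if not set. One way such a fallback could look is shown below. The function name `resolve_run_id` and the 8-character hex format are assumptions made for illustration; the diff does not show how the repository actually generates the id.

```python
import os
import uuid


def resolve_run_id(env=None):
    """Return DSPY_RUN_ID when it is set and non-empty, else a generated id.

    Hypothetical helper: the name and the generated format are assumptions.
    """
    env = os.environ if env is None else env
    run_id = env.get("DSPY_RUN_ID", "").strip()
    if run_id:
        return run_id
    # Assumption: a short random hex id; the project's real scheme may differ.
    return uuid.uuid4().hex[:8]


if __name__ == "__main__":
    print(resolve_run_id())
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise without touching the real environment.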

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -207,6 +207,10 @@ cython_debug/
 # DSPy cache
 data/.dspy_cache/
 
+# MLflow experiment tracking
+mlflow/
+mlruns/
+
 # Playwright browser testing artifacts
 .playwright-mcp/

.zed/settings.json

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+"file_scan_exclusions": [
+"**/.git",
+// "**/.venv",
+"**/venv",
+"**/__pycache__",
+"**/.mypy_cache",
+"**/node_modules",
+"**/.ruff_cache",
+"**/.pytest_cache",
+"**/dspy_reference_examples.egg-info",
+"**/.dspy_cache"
+]
+}

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 MIT License
 
-Copyright (c) 2025 Anand Pant
+Copyright (c) 2026 Anand Pant | shpit.dev/contact
 
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 84 additions & 7 deletions
@@ -1,8 +1,8 @@
 # DSPy Reference Examples
 
 [![Python](https://img.shields.io/badge/Python-3.13+-3776AB?logo=python&logoColor=white)](https://docs.python.org/3/)
-[![DSPy](https://img.shields.io/badge/DSPy-3.1.0-1F2937)](https://dspy.ai/)
-[![LiteLLM](https://img.shields.io/badge/LiteLLM-1.72.6-00A67E)](https://docs.litellm.ai/)
+[![DSPy](https://img.shields.io/badge/DSPy-3.1.0-DC2626?logo=dspy&logoColor=white)](https://dspy.ai/)
+[![MLflow](https://img.shields.io/badge/MLflow-3.8.1-0194E2?logo=mlflow&logoColor=white)](https://mlflow.org/)
 [![FastAPI](https://img.shields.io/badge/FastAPI-0.128.0-009688?logo=fastapi&logoColor=white)](https://fastapi.tiangolo.com/)
 [![Pydantic](https://img.shields.io/badge/Pydantic-2.12.5-E92063?logo=pydantic&logoColor=white)](https://docs.pydantic.dev/latest/)
 [![Ruff](https://img.shields.io/badge/Ruff-0.14.10-FCC21B?logo=ruff&logoColor=000000)](https://docs.astral.sh/ruff/)
@@ -12,6 +12,8 @@
 [![Nemotron-3-Nano-30B](https://img.shields.io/badge/Nemotron--3--Nano--30B-HF_(unsloth_quantized)-76B900?logo=nvidia&logoColor=white)](https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF)
 [![GPT-OSS-120B](https://img.shields.io/badge/openai/gpt--oss--120b-OpenRouter_(Cerebras)-F76B1C?logo=openai&logoColor=white)](https://openrouter.ai/openai/gpt-oss-120b?sort=throughput)
 
+![DSPy Reference Examples](assets/dspy-video-thumbnail.png)
+
 Real-world DSPy workflows for pharma/medtech teams. This project provides a flexible multi-classification system for
 Ozempic-related text analysis. Currently supports three classification tasks:
 
@@ -23,6 +25,7 @@ The framework shows how to:
 
 - Programmatically optimize prompts with DSPy
 - Support multiple classification tasks with dynamic signatures
+- Track experiments with MLflow (SQLite backend for easy querying)
 - Persist tuned artifacts to disk (separate from source)
 - Serve classifiers via FastAPI with typed Pydantic contracts
 
@@ -46,9 +49,7 @@ The framework shows how to:
 | `DSPY_LOCAL_BASE` | Base URL for local provider | `http://localhost:8080/v1` |
 | `DSPY_HTTP_HEADERS` | JSON blob for extra HTTP headers | `{}` |
 | `OPENROUTER_HTTP_REFERER`, `OPENROUTER_APP_TITLE` | OpenRouter analytics headers ||
-| `DSPY_LOG_LEVEL` | Log verbosity | `INFO` |
-| `DSPY_LOG_FORMAT` | Log format (`json` or `text`) | `json` |
-| `DSPY_RUN_ID` | Correlation run id | auto-generated |
+| `DSPY_RUN_ID` | Training run identifier | auto-generated |
 | `DSPY_ARTIFACT_AUTO_UPDATE` | Auto-update artifact model metadata on load | `false` |
 
 Copy `.env.example` and fill in whichever keys you need:
@@ -93,6 +94,9 @@ This creates a clean layout:
 │ └── pc-category-classification/ # PC category classification
 │ ├── train.json
 │ └── test.json
+├── mlflow/ # MLflow experiment tracking (auto-created)
+│ ├── mlflow.db # SQLite database for runs/metrics
+│ └── artifacts/ # Logged artifacts
 ├── scripts/
 │ ├── datagen/ # Data generation scripts
 │ └── deploy/ # Deployment scripts
@@ -149,6 +153,28 @@ uv run python -m src.pipeline.main --classification-type ae-category
 uv run python -m src.pipeline.main --classification-type pc-category
 ```
 
+### CLI Options
+
+| Flag | Short | Description |
+|------|-------|-------------|
+| `--classification-type` | `-t` | Classification type: `ae-pc`, `ae-category`, `pc-category` (default: `ae-pc`) |
+| `--verbose` | `-v` | Show detailed output (per-example evaluation, MIPROv2 progress) |
+| `--inspect` | `-i` | Show DSPy prompts/responses after optimization completes |
+
+```bash
+# Quiet output (default) - just key progress messages
+uv run python -m src.pipeline.main -t ae-pc
+
+# Verbose - see evaluation details and optimizer progress
+uv run python -m src.pipeline.main -t ae-pc --verbose
+
+# Inspect prompts after training
+uv run python -m src.pipeline.main -t ae-pc --inspect
+
+# Both verbose and inspect
+uv run python -m src.pipeline.main -t ae-pc -v -i
+```
+
 The run will:
 
 1. Configure DSPy with your provider settings.
@@ -157,7 +183,49 @@ The run will:
 4. Optimize via `MIPROv2` (with `auto="medium"`).
 5. Evaluate the optimized program.
 6. Write the artifact to `artifacts/ozempic_classifier_<type>_optimized.json`.
+7. Log params, metrics, and artifacts to MLflow (`mlflow/mlflow.db`).
+
+### Experiment Tracking with MLflow
+
+Training runs are automatically tracked in a local SQLite database. Query your experiments:
+
+#### List all runs with metrics
+
+```bash
+sqlite3 mlflow/mlflow.db "
+SELECT
+e.name as experiment,
+r.name as run_name,
+r.status,
+m.key,
+m.value
+FROM runs r
+JOIN experiments e ON r.experiment_id = e.experiment_id
+LEFT JOIN metrics m ON r.run_uuid = m.run_uuid
+ORDER BY r.start_time DESC;
+"
+```
+
+#### Compare baseline vs optimized accuracy across runs
+```bash
+sqlite3 mlflow/mlflow.db "
+SELECT
+r.name,
+MAX(CASE WHEN m.key = 'baseline_accuracy' THEN m.value END) as baseline,
+MAX(CASE WHEN m.key = 'optimized_accuracy' THEN m.value END) as optimized,
+MAX(CASE WHEN m.key = 'improvement' THEN m.value END) as improvement
+FROM runs r
+JOIN metrics m ON r.run_uuid = m.run_uuid
+GROUP BY r.run_uuid
+ORDER BY r.start_time DESC;
+"
+```
 
+Or launch the MLflow UI:
+
+```bash
+mlflow ui --backend-store-uri sqlite:///mlflow/mlflow.db
+```
 
 ---
 
@@ -247,8 +315,11 @@ tenant or use-case.
 
 ## Demo Script
 
-`uv run python inference_demo.py` executes a small batch of complaints through the shared interface and prints
-latency/throughput stats. Useful for quick smoke tests after retraining.
+```bash
+uv run python inference_demo.py
+```
+
+Runs a few sample complaints through the classifier and shows the full DSPy prompt/response for each using `dspy.inspect_history()`. Useful for demos and understanding how DSPy translates to actual LLM requests.
 
 ---
 
@@ -331,3 +402,9 @@ export DSPY_MODEL_NAME=local-model
 ## License
 
 MIT – see `LICENSE` for details.
+
+---
+
+## Author
+
+Created by [Anand Pant](https://github.com/anand-testcompare)
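
Reviewer sketch (not part of the commit): the "baseline vs optimized" pivot query that this README diff adds can also be run from Python with the standard `sqlite3` module. The example below runs the same SQL against an in-memory stand-in database so it works without a training run. The two-table schema is reduced to just the columns the README query touches, and the run names and metric values are made-up toy data, not results from this project.

```python
import sqlite3

# Same pivot as the README query: one row per run, one column per metric key.
PIVOT_SQL = """
SELECT
    r.name,
    MAX(CASE WHEN m.key = 'baseline_accuracy' THEN m.value END) AS baseline,
    MAX(CASE WHEN m.key = 'optimized_accuracy' THEN m.value END) AS optimized,
    MAX(CASE WHEN m.key = 'improvement' THEN m.value END) AS improvement
FROM runs r
JOIN metrics m ON r.run_uuid = m.run_uuid
GROUP BY r.run_uuid
ORDER BY r.start_time DESC;
"""


def compare_runs(conn):
    """Return (name, baseline, optimized, improvement) tuples, newest run first."""
    return conn.execute(PIVOT_SQL).fetchall()


def make_toy_db():
    """Build an in-memory stand-in with only the columns the query uses (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE runs (run_uuid TEXT PRIMARY KEY, name TEXT, start_time INTEGER);
        CREATE TABLE metrics (run_uuid TEXT, key TEXT, value REAL);
        """
    )
    conn.executemany(
        "INSERT INTO runs VALUES (?, ?, ?)",
        [("a1", "run-old", 100), ("b2", "run-new", 200)],
    )
    # Toy numbers chosen only to make the pivot visible.
    conn.executemany(
        "INSERT INTO metrics VALUES (?, ?, ?)",
        [
            ("a1", "baseline_accuracy", 0.5),
            ("a1", "optimized_accuracy", 0.75),
            ("a1", "improvement", 0.25),
            ("b2", "baseline_accuracy", 0.5),
            ("b2", "optimized_accuracy", 1.0),
            ("b2", "improvement", 0.5),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in compare_runs(make_toy_db()):
        print(row)
```

To point this at a real tracking store, `sqlite3.connect("mlflow/mlflow.db")` would replace `make_toy_db()`, assuming the database exists at the path given in the README.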

artifacts/ozempic_classifier_ae-category_optimized.json

Lines changed: 4 additions & 28 deletions
@@ -2,32 +2,9 @@
 "classify.predict": {
 "traces": [],
 "train": [],
-"demos": [
-{
-"augmented": true,
-"complaint": "I've had recurrent attacks of gallbladder pain. It feels like a vice grip under my ribs. An ultrasound showed my gallbladder is full of sludge and small stones. The gastroenterologist says the medication caused the rapid weight loss which led to the stones. I'm scheduled for a laparoscopic cholecystectomy next month to remove the gallbladder.",
-"reasoning": "The patient reports recurrent gallbladder pain, ultrasound evidence of sludge and small stones in the gallbladder, and a physician linking the rapid weight loss (likely from medication) to stone formation, culminating in scheduled surgical removal of the gallbladder. These findings pertain directly to pathology of the gallbladder.",
-"classification": "Hepatobiliary (gallbladder) disease",
-"justification": "The complaint describes gallbladder pain, ultrasound findings of sludge/stones, medication\u2011related rapid weight loss leading to stones, and surgical intervention\u2014all characteristic of gallbladder disease."
-},
-{
-"augmented": true,
-"complaint": "My heart feels like it's racing all the time. My resting pulse is over 100 beats per minute. I feel jittery and anxious, like I've had too much caffeine. My doctor did an EKG and confirmed sinus tachycardia. He thinks it's a side effect of the Ozempic stimulating my heart.",
-"reasoning": "The patient describes a racing heart, elevated resting pulse (>100 bpm), jitteriness, and confirmed sinus tachycardia, which the doctor attributes to a side effect of Ozempic stimulating the heart. These symptoms pertain to cardiac function and are directly related to the medication\u2019s cardiovascular effects.",
-"classification": "Cardiovascular signs",
-"justification": "The complaint involves tachycardia and related cardiac symptoms, which fall under the \u201cCardiovascular signs\u201d category."
-},
-{
-"complaint": "My retina is bleeding. I see red streaks in my vision. The doctor says I have proliferative diabetic eye disease. It got worse after I started Ozempic. I need urgent laser surgery to stop the bleeding.",
-"classification": "Eye disorders (Diabetic retinopathy complications)"
-},
-{
-"complaint": "The nausea is bad. I can't eat. I'm losing weight. I feel sick. I want to stop the drug. It's too much.",
-"classification": "Gastrointestinal disorders"
-}
-],
+"demos": [],
 "signature": {
-"instructions": "Given a free\u2011text complaint describing a medication\u2011related adverse event, generate a step\u2011by\u2011step reasoning trace, then output:\n\n1. **Reasoning** \u2013 a brief chain\u2011of\u2011thought explaining how the clinical details point to a specific adverse event type. \n2. **Classification** \u2013 a label selected from the predefined medical categories (e.g., \u201cPancreatitis\u201d, \u201cEye disorders (Diabetic retinopathy complications)\u201d, \u201cHepatobiliary (gallbladder) disease\u201d, etc.). \n3. **Justification** \u2013 a concise explanation linking the reasoning steps to the chosen classification.\n\nThe output should be presented in the order: reasoning, classification, justification, and must be ready for use as a prompt for a language model performing this classification task.",
+"instructions": "Classify adverse events into specific categories.",
 "fields": [
 {
 "prefix": "Complaint:",
@@ -55,11 +32,9 @@
 "dspy": "3.1.0",
 "cloudpickle": "3.1"
 },
-"model": "Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf",
 "classification_type": "ae-category",
 "classification_config": {
 "description": "Classify adverse events into specific categories.",
-"output_desc": "One of: Gastrointestinal disorders, Pancreatitis, Hepatobiliary (gallbladder) disease, Hypoglycemia, Eye disorders (Diabetic retinopathy complications), Renal events (Acute kidney injury), Hypersensitivity, Injection-site reactions, Cardiovascular signs, Peri-procedural aspiration risk, Gastrointestinal disorders (Gastroparesis)",
 "labels": [
 "Gastrointestinal disorders",
 "Pancreatitis",
@@ -73,6 +48,7 @@
 "Peri-procedural aspiration risk",
 "Gastrointestinal disorders (Gastroparesis)"
 ]
-}
+},
+"mlflow_run_id": "ff75ead1"
 }
 }

artifacts/ozempic_classifier_ae-pc_optimized.json

Lines changed: 5 additions & 32 deletions
@@ -2,35 +2,9 @@
 "classify.predict": {
 "traces": [],
 "train": [],
-"demos": [
-{
-"augmented": true,
-"complaint": "I was doing my usual check of the pen before using it and noticed that the lot number printed on the actual pen body is really smudged, like the ink was smeared or something. I can maybe make out a couple digits but most of it is completely unreadable. And then I looked for the expiration date on the device itself and it's just not there at all. Like, it's missing entirely from the pen body. The outer carton has all the information - lot number, expiration date, everything is printed clearly on the box - but the device itself doesn't have readable traceability information. My understanding is that this information is supposed to be on both the carton and the device for safety reasons, so you can always verify what you're using even if you throw away the box. I need to be able to track which specific pen I'm using in case there's ever a recall or quality issue, and right now I can't do that because the pen itself can't be verified or traced. I contacted the pharmacy to let them know, and they're documenting this issue. They confirmed the lot number from the carton for my records, but this seems like a labeling defect that should be reported.",
-"reasoning": "The patient reports a labeling defect on the medication delivery device: the lot number is smudged and the expiration date is missing from the pen body itself, making it impossible to verify product identity for safety and recall purposes. This is a concern about the physical presentation and traceability of the product, not a description of a medical symptom or physiological reaction to semaglutide. Therefore, it reflects an issue with the product's labeling rather than an adverse health reaction.",
-"classification": "Product Complaint",
-"justification": "The complaint pertains to a labeling defect (unreadable lot number and missing expiration date on the device) that impacts product traceability and safety verification, which is characteristic of a product complaint rather than an adverse medical event."
-},
-{
-"augmented": true,
-"complaint": "When I press the button to inject, the plunger doesn't move forward. I can hear the clicking sound, but the dose counter stays at 0.5 mg and no liquid comes out of the needle. The internal mechanism seems to be disconnected.",
-"reasoning": "The complaint describes a malfunction of the injection device (plunger not moving, no liquid expelled), which pertains to the product's mechanical function rather than a physiological side effect.",
-"classification": "Product Complaint",
-"justification": "The issue is a device defect, not a reported adverse medical reaction to semaglutide; it concerns the injector mechanism itself."
-},
-{
-"augmented": true,
-"complaint": "My heart feels like it's doing flip-flops in my chest. I get these sudden episodes where my pulse spikes to 130 and I feel lightheaded and short of breath. It happens when I'm just sitting at my desk. It makes me feel incredibly anxious. My cardiologist ruled out other causes and believes it's a side effect of the medication increasing my sympathetic nervous system tone.",
-"reasoning": "The patient describes specific cardiac symptoms (palpitations, tachycardia, lightheadedness, shortness of breath) that began after dose escalation of semaglutide and are attributed by the cardiologist to a medication\u2011induced increase in sympathetic nervous system activity. These are physiological adverse reactions directly linked to the drug, rather than a general dissatisfaction with the product.",
-"classification": "Adverse Event",
-"justification": "The complaint details a clinically observed, drug\u2011related side effect (cardiovascular symptoms) that aligns with known adverse events of semaglutide, indicating an adverse event rather than a mere product complaint."
-},
-{
-"complaint": "I injected the full dose but I felt nothing. Usually it stings a little. My blood sugar didn't change at all. I wonder if it's just saline in the pen.",
-"classification": "Product Complaint"
-}
-],
+"demos": [],
 "signature": {
-"instructions": "Classify an Ozempic\u2011related complaint as either **Adverse Event** or **Product Complaint**. \n- First, use chain\u2011of\u2011thought reasoning to determine whether the complaint describes a harmful medical reaction (e.g., side effects, injury, hospitalization) \u2013 which should be labeled *Adverse Event* \u2013 or a problem with the product\u2019s performance, potency, packaging, or suspected defect \u2013 which should be labeled *Product Complaint*. \n- Then output two fields: \n 1. **classification** \u2013 the chosen label exactly as \u201cAdverse Event\u201d or \u201cProduct Complaint\u201d. \n 2. **justification** \u2013 a concise explanation (1\u20112 sentences) summarizing why the complaint fits that category. \n\nRespond only with the two fields above.",
+"instructions": "Classify Ozempic-related complaints as Adverse Event or Product Complaint.",
 "fields": [
 {
 "prefix": "Complaint:",
@@ -42,7 +16,7 @@
 },
 {
 "prefix": "Classification:",
-"description": "Either 'Adverse Event' or 'Product Complaint'"
+"description": "One of: Adverse Event, Product Complaint"
 },
 {
 "prefix": "Justification:",
@@ -58,15 +32,14 @@
 "dspy": "3.1.0",
 "cloudpickle": "3.1"
 },
-"model": "Nemotron-3-Nano-30B-A3B-UD-Q3_K_XL.gguf",
 "classification_type": "ae-pc",
 "classification_config": {
 "description": "Classify Ozempic-related complaints as Adverse Event or Product Complaint.",
-"output_desc": "Either 'Adverse Event' or 'Product Complaint'",
 "labels": [
 "Adverse Event",
 "Product Complaint"
 ]
-}
+},
+"mlflow_run_id": "8a4106d1"
 }
 }
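
Reviewer sketch (not part of the commit): both artifact files now carry an `mlflow_run_id` field. Because the diff only shows fragments of the JSON, the helper below searches nested dicts and lists for the key instead of assuming a fixed path. The helper name `find_key` and the `"metadata"` wrapper in the sample are assumptions made for illustration; only the key names and the id value are taken from the diff.

```python
import json


def find_key(node, key):
    """Depth-first search for the first value stored under `key` in nested JSON data."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


# Trimmed stand-in for an artifact file; the "metadata" nesting is an assumption.
sample = json.loads(
    """
    {
      "classify.predict": {"traces": [], "train": [], "demos": []},
      "metadata": {
        "classification_type": "ae-pc",
        "mlflow_run_id": "8a4106d1"
      }
    }
    """
)

run_id = find_key(sample, "mlflow_run_id")
print(run_id)
```

For a real artifact, `json.load(open(path))` on one of the `artifacts/ozempic_classifier_<type>_optimized.json` files would replace the inline sample. How the short id maps to rows in `mlflow/mlflow.db` is not shown in this diff.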
