CoReason-AI · gowthamrao · Jan 23, 2026 · Jan 19, 2026 · Jan 19, 2026 · Jan 20, 2026
diff --git a/.github/workflows/ci-cd.yml b/.github/workflows/ci-cd.yml
@@ -34,7 +34,7 @@ jobs:
     strategy:
       matrix:
         os: [ubuntu-latest, windows-latest, macos-latest]
-        python-version: ["3.12", "3.13", "3.14"]
+        python-version: ["3.12", "3.13"]
     steps:
       - uses: actions/checkout@ff7abcd0c3c05ccf6adc123a8cd1fd4fb30fb493
       - name: Set up Python ${{ matrix.python-version }}

diff --git a/.gitignore b/.gitignore
@@ -146,3 +146,4 @@ cython_debug/
 
 # Runtime Logs
 logs/
+optimized_manifest.json
diff --git a/README.md b/README.md
@@ -1,42 +1,69 @@
 # coreason-optimizer
 
-coreason-optimizer
-
-[![CI/CD](https://github.com/CoReason-AI/coreason_optimizer/actions/workflows/ci-cd.yml/badge.svg)](https://github.com/CoReason-AI/coreason_optimizer/actions/workflows/ci-cd.yml)
-[![PyPI](https://img.shields.io/pypi/v/coreason_optimizer.svg)](https://pypi.org/project/coreason_optimizer/)
-[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/coreason_optimizer.svg)](https://pypi.org/project/coreason_optimizer/)
-[![License](https://img.shields.io/github/license/CoReason-AI/coreason_optimizer)](https://github.com/CoReason-AI/coreason_optimizer/blob/main/LICENSE)
-[![Codecov](https://codecov.io/gh/CoReason-AI/coreason_optimizer/branch/main/graph/badge.svg)](https://codecov.io/gh/CoReason-AI/coreason_optimizer)
-[![Downloads](https://static.pepy.tech/badge/coreason_optimizer)](https://pepy.tech/project/coreason_optimizer)
-[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
-[![Pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
-
-## Getting Started
-
-### Prerequisites
-
-- Python 3.12+
-- Poetry
-
-### Installation
-
-1.  Clone the repository:
-    ```sh
-    git clone https://github.com/CoReason-AI/coreason_optimizer.git
-    cd coreason_optimizer
-    ```
-2.  Install dependencies:
-    ```sh
-    poetry install
-    ```
-
-### Usage
-
--   Run the linter:
-    ```sh
-    poetry run pre-commit run --all-files
-    ```
--   Run the tests:
-    ```sh
-    poetry run pytest
-    ```
+**Automated Prompt Engineering / LLM Compilation / DSPy Integration for CoReason-AI**
+
+[![License: Prosperity 3.0](https://img.shields.io/badge/license-Prosperity%203.0-blue)](https://prosperitylicense.com/versions/3.0.0)
+[![CI Status](https://github.com/CoReason-AI/coreason-optimizer/actions/workflows/ci-cd.yml/badge.svg)](https://github.com/CoReason-AI/coreason-optimizer/actions)
+[![Code Style: Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
+[![Documentation](https://img.shields.io/badge/docs-product_requirements-blue)](docs/product_requirements.md)
+
+**coreason-optimizer** is the "Compiler" for the CoReason Agentic Platform. It automates prompt engineering by treating prompts as trainable weights, optimizing them against ground-truth datasets to maximize performance metrics.
+
+---
+
+## Installation
+
+```bash
+pip install coreason-optimizer
+```
+
+## Features
+
+-   **Automated Optimization:** Rewrites instructions and selects examples to maximize a score, not human intuition.
+-   **Model-Specific Compilation:** Generates optimized prompts specifically tuned for target models (e.g., GPT-4, Claude 3.5).
+-   **Continuous Learning:** Re-runs optimization on recent logs to patch prompts against data drift.
+-   **Mutate-Evaluate Loop:** Systematic cycle of drafting, evaluating, diagnosing, mutating, and selecting prompts.
+-   **Strategies:** Includes BootstrapFewShot (mining successful traces) and MIPRO (Multi-prompt Instruction PRoposal Optimizer).
+-   **Integration:** Works seamlessly with `coreason-construct`, `coreason-archive`, and `coreason-assay`.
+
+For full product requirements, see [docs/product_requirements.md](docs/product_requirements.md).
+
+## Usage
+
+Here is how to initialize and use the library to compile an agent:
+
+```python
+from coreason_optimizer import OptimizerConfig, PromptOptimizer
+from coreason_optimizer.core.interfaces import Construct
+from coreason_optimizer.data import Dataset
+
+# 1. Configuration
+config = OptimizerConfig(
+    target_model="gpt-4o",
+    metric="exact_match",
+    max_rounds=10
+)
+
+# 2. Load Data
+dataset = Dataset.from_csv("data/gold_set.csv")
+train_set, val_set = dataset.split(test_size=0.2)
+
+# 3. Load Agent (Construct)
+# In a real scenario, this would be imported from your agent code
+# from src.agents.analyst import analyst_agent
+class MockAgent(Construct):
+    inputs = ["question"]
+    outputs = ["answer"]
+    system_prompt = "You are a helpful assistant."
+agent = MockAgent()
+
+# 4. Compile
+optimizer = PromptOptimizer(config=config)
+optimized_manifest = optimizer.compile(
+    agent=agent,
+    trainset=train_set,
+    valset=val_set
+)
+
+print(f"Optimization complete. New Score: {optimized_manifest.performance_metric}")
+print(f"Optimized Instruction: {optimized_manifest.optimized_instruction}")
diff --git a/VIGNETTE.md b/VIGNETTE.md
@@ -0,0 +1,69 @@
+# The Architecture and Utility of coreason-optimizer
+
+## 1. The Philosophy (The Why)
+
+The prevailing method of interacting with Large Language Models (LLMs)—manual "prompt engineering"—is an exercise in frustration. It is artisan work: fragile, unscalable, and often relying on "magic words" that break when models update. The author of `coreason-optimizer` recognizes that prompts are not merely text; they are **trainable parameters** of a software system.
+
+This package exists to replace intuition with optimization. Instead of a developer guessing which few-shot examples might help, `coreason-optimizer` empirically selects them. Instead of rewriting instructions hoping for better JSON compliance, it uses a meta-learner to rewrite them for you. It shifts the paradigm from "Prompt Whisperer" to "Prompt Compiler," treating the agent definition as source code and the deployed prompt as a compiled, frozen binary.
+
+## 2. Under the Hood (The Dependencies & Logic)
+
+The engine runs on a focused stack designed for iterative evaluation:
+
+*   **Pydantic** enforces the rigorous schema definitions (`OptimizerConfig`, `OptimizedManifest`) required for a compiler that must output deterministic artifacts.
+*   **OpenAI** & **NumPy/scikit-learn** power the semantic search and generation capabilities. The package doesn't just call LLMs; it uses embeddings to find "nearest neighbor" successful examples to inject into prompts (`SemanticSelector`).
+*   **Loguru** provides the observability backbone. When an optimization run takes 4 hours and spends $10, you need structured, searchable logs to understand *why* a specific mutation was rejected.
+*   **Click** exposes the compiler interface to CI/CD pipelines, allowing optimization to be a step in the build process, not a manual task.
+
+The core logic revolves around the **Mutate-Evaluate Loop**. Inspired by DSPy, the `MiproOptimizer` (Multi-prompt Instruction PRoposal Optimizer) generates candidate instructions using a "Teacher" model. Simultaneously, it selects sets of few-shot examples. It then performs a grid search across these combinations, scoring them against a ground-truth dataset using a defined `Metric` (like `exact_match`). The result is not a provably optimal prompt, but the best-scoring instruction/example combination among the candidates evaluated for that specific dataset and model.
+
+## 3. In Practice (The How)
+
+Here is how `coreason-optimizer` transforms a raw agent definition into a deployed artifact.
+
+### Compiling an Agent
+
+The `compile` method is the heart of the system. It takes your agent logic and training data, runs the optimization strategies (like BootstrapFewShot or MIPRO), and returns a frozen manifest.
+
+```python
+from coreason_optimizer.core.config import OptimizerConfig
+from coreason_optimizer.strategies.mipro import MiproOptimizer
+from coreason_optimizer.core.metrics import MetricFactory
+
+# 1. Configuration: Define the target environment
+config = OptimizerConfig(
+    target_model="gpt-4o",
+    budget_limit_usd=5.00,  # Safety first
+    max_rounds=10,
+)
+
+# 2. Instantiate the optimizer with a specific metric (`client` is an LLM client created elsewhere)
+# "exact_match" counts a prediction as correct only when it equals the reference
+optimizer = MiproOptimizer(
+    llm_client=client, metric=MetricFactory.get("exact_match"), config=config
+)
+
+# 3. The Compilation Step
+# This runs the "Mutate-Evaluate" loop, finding the best instruction/example pair
+manifest = optimizer.compile(
+    agent=my_agent_construct,
+    trainset=training_examples,
+    valset=validation_examples,
+)
+
+print(f"Optimization improved score to: {manifest.performance_metric}")
+```
+
+### The Optimized Artifact
+
+The output is a portable JSON manifest. This file allows the runtime to execute the optimized agent without needing the optimizer or the training data again.
+
+```python
+# The manifest contains the "compiled" prompt logic
+print(manifest.optimized_instruction)
+# > "Extract adverse events from the text. Format as JSON. [Optimized Instructions...]"
+
+# It also holds the empirically selected few-shot examples
+for example in manifest.few_shot_examples:
+    print(f"Input: {example.inputs} -> Output: {example.reference}")
+```
diff --git a/docs/product_requirements.md b/docs/product_requirements.md
@@ -0,0 +1,129 @@
+# Product Requirements Document: coreason-optimizer
+
+**Domain:** Automated Prompt Engineering / LLM Compilation / DSPy Integration
+**Package Name:** coreason-optimizer
+
+---
+
+## 1. Executive Summary
+
+**coreason-optimizer** is the "Compiler" for the CoReason Agentic Platform.
+
+Hand-written static prompts are increasingly treated as technical debt. **coreason-optimizer** automates prompt writing by treating prompts (instructions and few-shot examples) as **trainable weights**. It ingests a "Draft Agent" defined in `coreason-construct` and iterates on it against a ground-truth dataset (validated by `coreason-assay`) to maximize the chosen performance metrics. It outputs a "Frozen Manifest" that is deployed to production, ensuring GxP stability.
+
+## 2. Problem Statement & Rationale
+
+| Problem | Impact | The coreason-optimizer Solution |
+| :---- | :---- | :---- |
+| **The "Prompt Whisperer" Bottleneck** | Engineers spend hours tweaking words ("Please be careful") with unpredictable results. | **Automated Optimization:** A meta-algorithm rewrites instructions and selects examples to maximize a score, not human intuition. |
+| **Brittleness** | A prompt that works for GPT-4 often fails for Claude 3.5 or Llama 3. | **Model-Specific Compilation:** The optimizer can run separate jobs to generate optimized prompts specifically tuned for the target model. |
+| **Drift** | Agents degrade over time as data distributions change (e.g., new medical slang). | **Continuous Learning:** Re-running the optimizer on recent "Gold" logs from `coreason-archive` automatically patches the prompt. |
+
+## 3. Architectural Design
+
+### 3.1 The "Mutate-Evaluate" Loop
+
+The package implements a systematic optimization cycle (inspired by DSPy):
+
+1.  **Draft:** Start with the developer's base intention.
+2.  **Evaluate:** Run the agent on a training set.
+3.  **Diagnose:** Identify failing examples using `coreason-assay` metrics.
+4.  **Mutate:**
+    *   **Bootstrap Few-Shot:** Find historical examples where the agent *succeeded* on similar hard cases and inject them into the prompt.
+    *   **Instruction Induction:** Use a Meta-LLM to rewrite the System Prompt to explicitly address the observed failures.
+5.  **Select:** Keep the mutation that yields the highest metric score.
+
+### 3.2 Integration Map
+
+*   **Input (Schema):** `coreason-construct` defines the Agent structure (Inputs/Outputs).
+*   **Input (Data):** `coreason-archive` provides historical logs to mine for training examples.
+*   **Feedback (Loss Function):** `coreason-assay` provides the scoring function (e.g., accuracy, json_validity, f1_score).
+*   **Output (Artifact):** Produces a versioned `OptimizedManifest.json` used by the runtime.
+
+## 4. Functional Specifications
+
+### 4.1 The Optimization Engine
+
+*   **Strategy: BootstrapFewShot:**
+    *   Automatically mines the "Teacher" model's successful traces to create few-shot examples for the "Student" prompt.
+*   **Strategy: MIPRO (Multi-prompt Instruction PRoposal Optimizer):**
+    *   Generates 10 candidates for the System Instruction and 5 combinations of Few-Shot examples, finding the optimal pair via Bayesian optimization or simple grid search.
+*   **Cost Awareness:**
+    *   Must implement a `BudgetManager` to halt optimization if the token spend exceeds a defined limit (e.g., $10.00).
+
+### 4.2 Data Management
+
+*   **Dataset Loader:** Standardizes inputs from CSV, JSONL, or `coreason-archive` SQL queries into a `TrainingExample` object.
+*   **Splitter:** Automatically creates Train/Dev/Test splits to prevent overfitting the prompt to the training data.
+
+### 4.3 The Manifest Serializer
+
+*   The output must be deterministic and immutable.
+*   **Schema:**
+    ```json
+    {
+      "agent_id": "adverse_event_extractor",
+      "base_model": "gpt-4o",
+      "optimized_instruction": "Extract adverse events... [Modified by Optimizer]",
+      "few_shot_examples": [ ... ],
+      "performance_metric": "0.94",
+      "optimization_run_id": "opt_20250119_xyz"
+    }
+    ```
+
+## 5. Technical Specifications (API)
+
+### 5.1 The Interface
+
+```python
+from abc import ABC, abstractmethod
+from pydantic import BaseModel
+
+class OptimizerConfig(BaseModel):
+    target_model: str = "gpt-4o"
+    metric: str = "exact_match"
+    max_bootstrapped_demos: int = 4
+    max_rounds: int = 10
+
+class PromptOptimizer(ABC):  # Construct, Example, OptimizedManifest: package types
+    @abstractmethod
+    def compile(self, agent: "Construct", trainset: list["Example"],
+                valset: list["Example"]) -> "OptimizedManifest":
+        """Run the optimization loop."""
+```
+
+### 5.2 The CLI (coreason-opt)
+
+The package should expose a command-line interface for CI/CD integration:
+
+*   `coreason-opt tune --agent src/agents/analyst.py --dataset data/gold_set.csv`
+*   `coreason-opt evaluate --manifest dist/analyst_v2.json --dataset data/test_set.csv`
+
+## 6. Implementation Plan: Atomic Units of Change (AUC)
+
+### Phase 1: Foundation
+
+*   **AUC-1: Scaffold & Configuration:** Project structure, `pyproject.toml`, and `OptimizerConfig` Pydantic models.
+*   **AUC-2: Abstract Base Classes:** Define `BaseOptimizer`, `BaseSelector` (for examples), and `BaseMutator` (for instructions).
+
+### Phase 2: Data & Metrics
+
+*   **AUC-3: Dataset Loader:** Implement `Dataset` class that handles loading/splitting from CSV and `coreason-archive`.
+*   **AUC-4: Metric Adapter:** Create a wrapper that adapts `coreason-assay` functions into the format required by the optimization loop.
+
+### Phase 3: The Strategies
+
+*   **AUC-5: Few-Shot Selector:** Implement logic to select examples using Semantic Similarity (via `coreason-foundry` embeddings) or Random Sampling.
+*   **AUC-6: Bootstrap Logic:** Implement the "Teacher-Student" loop where the model generates its own training data from input questions.
+*   **AUC-7: Instruction Mutator:** Implement the Meta-Prompt that analyzes failures and rewrites the system prompt.
+
+### Phase 4: The Loop & Artifacts
+
+*   **AUC-8: The Compile Loop:** Connect the Mutators and Selectors into the main `compile()` orchestration method.
+*   **AUC-9: Manifest Serializer:** Logic to dump the final state to JSON.
+*   **AUC-10: CLI Entrypoint:** Build the `coreason-opt` command line tool.
+
+## 7. Compliance & Safety
+
+*   **Audit Trail:** Every optimization run must log the `trace_id` of the experiments to `coreason-veritas`. We must be able to explain *why* the prompt changed.
+*   **Human-in-the-Loop Gate:** The `OptimizedManifest` is not automatically deployed. It is saved as a "Candidate" that requires a human to review the score improvement before promotion to production.
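
Illustration for sections 3.1 and 4.1 of `docs/product_requirements.md`: a minimal sketch of the grid-search variant of the Mutate-Evaluate loop, combined with a spend cap. This is not the package's implementation. `BudgetManager.charge`, `BudgetExceeded`, `grid_search`, `fake_agent`, and the flat per-call cost are all hypothetical names and assumptions chosen for the example, and the LLM call is replaced by a deterministic stub so the code runs offline.

```python
from dataclasses import dataclass
from itertools import product


class BudgetExceeded(RuntimeError):
    """Raised when the next call would push spend past the limit."""


@dataclass
class BudgetManager:
    # Hypothetical stand-in for the BudgetManager named in section 4.1.
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        # Refuse the charge before spending, so spent_usd never exceeds the limit.
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(f"limit of ${self.limit_usd:.2f} reached")
        self.spent_usd += cost_usd


def exact_match(prediction: str, reference: str) -> float:
    """Score 1.0 only when prediction and reference are identical after stripping."""
    return 1.0 if prediction.strip() == reference.strip() else 0.0


def grid_search(instructions, example_sets, valset, run_agent, budget, cost_per_call_usd):
    """Score every (instruction, few-shot set) pair; return the best fully scored one."""
    best = None  # (score, instruction, examples)
    try:
        for instruction, examples in product(instructions, example_sets):
            total = 0.0
            for question, reference in valset:
                budget.charge(cost_per_call_usd)  # assumed flat cost per call
                total += exact_match(run_agent(instruction, examples, question), reference)
            score = total / len(valset)
            if best is None or score > best[0]:
                best = (score, instruction, examples)
    except BudgetExceeded:
        pass  # halt early; a partially scored candidate is discarded
    return best


def fake_agent(instruction, examples, question):
    # Deterministic stub for an LLM call: only the "terse" instruction
    # produces answers that satisfy exact_match.
    answers = {"2+2": "4", "3+3": "6"}
    if "terse" in instruction:
        return answers[question]
    return "The answer is " + answers[question]


valset = [("2+2", "4"), ("3+3", "6")]
instructions = ["Be verbose.", "Be terse."]
example_sets = [(), (("1+1", "2"),)]

# Generous budget: all four combinations are scored.
best = grid_search(instructions, example_sets, valset, fake_agent,
                   BudgetManager(limit_usd=100.0), cost_per_call_usd=1.0)

# Tight budget: the search halts during the second combination.
tight_budget = BudgetManager(limit_usd=3.0)
capped = grid_search(instructions, example_sets, valset, fake_agent,
                     tight_budget, cost_per_call_usd=1.0)
```

With the tight budget the search returns the only fully evaluated candidate, which is why the PRD's human review gate on the resulting score matters: a halted run can legitimately return a weak prompt.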