Contributing to Mellea

Thank you for your interest in contributing to Mellea! This guide will help you get started with developing and contributing to the project.

Contribution Pathways

There are several ways to contribute to Mellea:

1. Contributing to This Repository

Contribute to the Mellea core, standard library, or fix bugs. This includes:

  • Core features and bug fixes
  • Standard library components (Requirements, Components, Sampling Strategies)
  • Backend improvements and integrations
  • Documentation and examples
  • Tests and CI/CD improvements

Process: See the Pull Request Process section below for detailed steps.

2. Applications & Libraries

Build tools and applications using Mellea. These can be hosted in your own repository. For discoverability, prefix the repository name with mellea-.

Examples:

  • github.com/my-company/mellea-legal-utils
  • github.com/my-username/mellea-swe-agent

3. Community Components

Contribute experimental or specialized components to mellea-contribs.

Note: For general-purpose Components, Requirements, or Sampling Strategies, please open an issue first to discuss whether they should go in the standard library (this repository) or mellea-contribs.

Code of Conduct

This project adheres to the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to melleaadmin@ibm.com.

Getting Started

Prerequisites

Installation with uv (Recommended)

  1. Fork and clone the repository:

    git clone ssh://git@github.com/<your-username>/mellea.git
    cd mellea/
  2. Set up a virtual environment:

    uv venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies:

    # Install all dependencies (recommended for development)
    uv sync --all-extras --all-groups
    
    # Or install just the backend dependencies
    uv sync --extra backends --all-groups
  4. Install pre-commit hooks (Required):

    pre-commit install

    Note: Some hooks require tools in dev dependency groups to be on your PATH. Activate the virtual environment before committing to ensure they are available:

    source .venv/bin/activate

Installation with conda/mamba

  1. Fork and clone the repository:

    git clone ssh://git@github.com/<your-username>/mellea.git
    cd mellea/
  2. Run the installation script:

    conda/install.sh

This script handles environment setup, dependency installation, and pre-commit hook installation.

Verify Installation

# Start Ollama in a separate terminal (required for most tests)
ollama serve

# Run fast tests (skip qualitative tests, ~2 min)
uv run pytest -m "not qualitative"

Directory Structure

| Path | Contents |
|---|---|
| mellea/core | Core abstractions: Backend, Base, Formatter, Requirement, Sampling |
| mellea/stdlib | Standard library: Session, Context, Components, Requirements, Sampling, Intrinsics, Tools |
| mellea/backends | Backend providers: HF, OpenAI, Ollama, Watsonx, LiteLLM |
| mellea/formatters | Output formatters and parsers |
| mellea/helpers | Utilities, logging, model ID tables |
| mellea/templates | Jinja2 templates for prompts |
| cli/ | CLI commands (m serve, m alora, m decompose, m eval) |
| test/ | All tests (run from repo root) |
| docs/ | Documentation, examples, tutorials |

Coding Standards

Type Annotations

Required on all core functions:

def process_text(text: str, max_length: int = 100) -> str:
    """Process text with maximum length."""
    return text[:max_length]

Docstrings

Docstrings are prompts - the LLM reads them, so be specific.

Use Google-style docstrings:

def extract_entities(text: str, entity_types: list[str]) -> dict[str, list[str]]:
    """Extract named entities from text.

    Args:
        text: The input text to analyze.
        entity_types: List of entity types to extract (e.g., ["PERSON", "ORG"]).

    Returns:
        Dictionary mapping entity types to lists of extracted entities.

    Example:
        >>> extract_entities("Alice works at IBM", ["PERSON", "ORG"])
        {"PERSON": ["Alice"], "ORG": ["IBM"]}
    """
    ...

Class and __init__ docstrings

Place Args: on the class docstring only. The __init__ docstring should be a single summary sentence with no Args: section. This keeps hover docs clean in IDEs and ensures the docs pipeline (which skips __init__) publishes the full parameter list.

class MyComponent(Component[str]):
    """A component that does something useful.

    Args:
        name (str): Human-readable label for this component.
        max_tokens (int): Upper bound on generated tokens.
    """

    def __init__(self, name: str, max_tokens: int = 256) -> None:
        """Initialize MyComponent with a name and token budget."""
        self.name = name
        self.max_tokens = max_tokens

Add an Attributes: section on the class docstring only when a stored attribute differs in type or behaviour from the constructor input — for example, when a str argument is wrapped into a CBlock, or when a class-level constant is relevant to callers. Pure-echo entries that repeat Args: verbatim should be omitted.
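
As a minimal sketch of the rule above, the class below documents an attribute only because its stored type differs from the constructor input. TextBlock and Greeting are hypothetical names invented for this illustration; TextBlock merely stands in for a wrapper type such as CBlock and is not part of Mellea.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical stand-in for a wrapper type such as CBlock."""

    value: str


class Greeting:
    """A greeting whose text is stored in wrapped form.

    Args:
        text (str): Raw greeting text supplied by the caller.
        repeat (int): Number of times to repeat the greeting.

    Attributes:
        text (TextBlock): The greeting wrapped in a TextBlock; documented
            because its type differs from the constructor argument.
    """

    def __init__(self, text: str, repeat: int = 1) -> None:
        """Initialize Greeting with its text and repeat count."""
        self.text = TextBlock(text)  # stored type differs from the input type
        self.repeat = repeat  # pure echo of the argument, so not in Attributes


greeting = Greeting("hello")
print(type(greeting.text).__name__)
```

Here repeat is left out of Attributes: because it would only restate the Args: entry.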

TypedDict classes are a special case. Their fields are the entire public contract, so when an Attributes: section is present it must exactly match the declared fields. The audit will flag:

  • typeddict_phantom - Attributes: documents a field that is not declared in the TypedDict
  • typeddict_undocumented - a declared field is absent from the Attributes: section

class ConstraintResult(TypedDict):
    """Result of a constraint check.

    Attributes:
        passed: Whether the constraint was satisfied.
        reason: Human-readable explanation.
    """
    passed: bool
    reason: str
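
To make the two TypedDict checks concrete, here is a rough sketch of how such a comparison could be done with the standard library. It is not the project's audit script: the helper names (documented_attributes, audit_typeddict) and the deliberately faulty DraftResult class are assumptions made up for this example, and the parsing only handles simple Google-style entries.

```python
import inspect
import re
from typing import TypedDict


class DraftResult(TypedDict):
    """Deliberately inconsistent example (hypothetical class).

    Attributes:
        passed: Whether the constraint was satisfied.
        reason: Human-readable explanation.
        score: Documented here but never declared below.
    """

    passed: bool
    reason: str
    details: str  # declared but missing from the Attributes: section


def documented_attributes(cls: type) -> set[str]:
    """Collect the names listed under 'Attributes:' in a class docstring."""
    names: set[str] = set()
    in_section = False
    for line in (inspect.getdoc(cls) or "").splitlines():
        if line.strip() == "Attributes:":
            in_section = True
            continue
        if not in_section:
            continue
        if not line.startswith(" "):
            in_section = False  # blank or unindented line ends the section
            continue
        # Entries sit at one indent level; deeper lines are continuations.
        match = re.match(r" {4}(\w+)(?: \([^)]*\))?:", line)
        if match:
            names.add(match.group(1))
    return names


def audit_typeddict(cls: type) -> dict[str, set[str]]:
    """Compare declared fields with documented ones."""
    declared = set(cls.__annotations__)
    documented = documented_attributes(cls)
    return {
        "typeddict_phantom": documented - declared,
        "typeddict_undocumented": declared - documented,
    }


report = audit_typeddict(DraftResult)
print(report)
```

For DraftResult the sketch reports score as a phantom entry and details as undocumented; a TypedDict whose Attributes: section matches its fields yields two empty sets.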

Validating docstrings

Run the coverage and quality audit to check your changes before committing:

# Build fresh API docs then audit quality (documented symbols only)
uv run python tooling/docs-autogen/generate-ast.py
uv run python tooling/docs-autogen/audit_coverage.py \
    --quality --no-methods --docs-dir docs/docs/api

Key checks the audit enforces:

| Check | Meaning |
|---|---|
| no_class_args | Class has typed __init__ params but no Args: on the class docstring |
| duplicate_init_args | Args: appears in both the class and __init__ docstrings (Option C violation) |
| no_args | Standalone function has params but no Args: section |
| no_returns | Function has a non-trivial return annotation but no Returns: section |
| param_mismatch | Args: documents names not present in the actual signature |
| typeddict_phantom | TypedDict Attributes: documents a field not declared in the class |
| typeddict_undocumented | TypedDict has a declared field absent from its Attributes: section |
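
For intuition about the class-level checks, the sketch below approximates no_class_args and duplicate_init_args using only inspect. It is an illustration under simplifying assumptions, not the logic in audit_coverage.py: the helper names and the three sample classes are hypothetical, and an Args: section is detected by a plain line match.

```python
import inspect


def has_args_section(obj: object) -> bool:
    """Return True if the object's own docstring contains an 'Args:' header."""
    doc = getattr(obj, "__doc__", None) or ""
    return any(line.strip() == "Args:" for line in doc.splitlines())


def typed_init_params(cls: type) -> list[str]:
    """List __init__ parameters (other than self) that carry a type annotation."""
    params = inspect.signature(cls.__init__).parameters.values()
    return [
        p.name
        for p in params
        if p.name != "self" and p.annotation is not inspect.Parameter.empty
    ]


def class_docstring_issues(cls: type) -> list[str]:
    """Return the names of the class-level checks that this class fails."""
    issues = []
    if typed_init_params(cls) and not has_args_section(cls):
        issues.append("no_class_args")
    if has_args_section(cls) and has_args_section(cls.__init__):
        issues.append("duplicate_init_args")
    return issues


class Clean:
    """A class documented as recommended.

    Args:
        name: Label for the instance.
    """

    def __init__(self, name: str) -> None:
        """Initialize Clean with a name."""
        self.name = name


class Duplicated:
    """A class that repeats its parameters in __init__.

    Args:
        name: Label for the instance.
    """

    def __init__(self, name: str) -> None:
        """Initialize Duplicated.

        Args:
            name: Label for the instance.
        """
        self.name = name


class Undocumented:
    """A class that forgot to document its parameters."""

    def __init__(self, name: str) -> None:
        """Initialize Undocumented with a name."""
        self.name = name


for cls in (Clean, Duplicated, Undocumented):
    print(cls.__name__, class_docstring_issues(cls))
```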

IDE hover verification: open an existing class in VS Code and hover over the class name or a constructor call to confirm that the hover card shows Args: once, with no duplication.

Code Style

  • Ruff for linting and formatting
  • Use ... in @generative function bodies
  • Prefer primitives over classes for simplicity
  • Keep functions focused and single-purpose
  • Avoid over-engineering
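
The ... body convention in the list above fits with docstrings acting as prompts: the signature and docstring carry the intent, so the body has nothing left to say. The sketch below shows that idea with a hypothetical helper, describe_as_prompt, which is not Mellea's @generative decorator and makes no claim about how Mellea builds its prompts.

```python
import inspect
from typing import Callable, Literal


def describe_as_prompt(func: Callable[..., object]) -> str:
    """Render a function's signature and docstring as prompt-like text.

    Hypothetical helper for illustration only; Mellea's real prompt
    construction may differ.
    """
    signature = inspect.signature(func)
    docstring = inspect.getdoc(func) or ""
    return f"Function: {func.__name__}{signature}\nInstructions: {docstring}"


def classify_sentiment(text: str) -> Literal["positive", "negative"]:
    """Classify the sentiment of the text as 'positive' or 'negative'."""
    ...  # no implementation: the docstring and type hints carry the intent


prompt = describe_as_prompt(classify_sentiment)
print(prompt)
```

A vague docstring such as "Process the text" would give a model nothing to work with, which is why the guide asks for specific wording.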

Formatting and Linting

# Format code
uv run ruff format .

# Lint code
uv run ruff check .

# Fix auto-fixable issues
uv run ruff check --fix .

# Type check
uv run mypy .

Development Workflow

Commit Messages

Follow Angular commit format:

<type>: <subject>

<body>

<footer>

Types: feat, fix, docs, test, refactor, release

Example:

feat: add support for streaming responses

Implements streaming for all backend types with proper
error handling and timeout management.

Closes #123

Important: Always sign off commits using -s or --signoff:

git commit -s -m "feat: your commit message"
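
If you want to check a message before committing, a small script along these lines could help. It is a sketch rather than a project tool: check_commit_message is a hypothetical name, the pattern follows the <type>: <subject> template shown above (so Angular-style scopes are not handled), and the sign-off check relies on the Signed-off-by: trailer that git commit -s appends.

```python
import re

# Types listed in this guide.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "release")
HEADER_PATTERN = re.compile(r"^(?P<type>[a-z]+): (?P<subject>\S.*)$")


def check_commit_message(message: str) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.strip().splitlines()
    header = lines[0] if lines else ""

    match = HEADER_PATTERN.match(header)
    if match is None:
        problems.append("header must look like '<type>: <subject>'")
    elif match.group("type") not in ALLOWED_TYPES:
        problems.append(f"unknown type '{match.group('type')}'")

    # `git commit -s` adds a trailer of this form.
    if not any(line.startswith("Signed-off-by: ") for line in lines):
        problems.append("missing Signed-off-by trailer (commit with -s)")
    return problems


good = (
    "feat: add support for streaming responses\n\n"
    "Implements streaming for all backend types.\n\n"
    "Closes #123\n"
    "Signed-off-by: Jane Doe <jane@example.com>"
)
print(check_commit_message(good))
print(check_commit_message("added some stuff"))
```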

AI Coding Assistants

AI-assisted development is welcome. You are responsible for reviewing and understanding every change before submitting.

AI coding assistants following project guidelines add an Assisted-by: trailer to commit messages by default, identifying which tool was used:

Assisted-by: Claude Code
Assisted-by: IBM Bob

Add one line per tool used, using its common name (GitHub Copilot, Cursor, etc.).

Pre-commit Hooks

Pre-commit hooks run automatically before each commit and check:

  • Ruff - Linting and formatting
  • mypy - Type checking
  • uv-lock - Dependency lock file sync
  • codespell - Spell checking

Bypass hooks (for intermediate commits):

git commit -n -m "wip: intermediate work"

Run hooks manually:

pre-commit run --all-files

⚠️ Warning: pre-commit run --all-files may take several minutes. Don't cancel mid-run, as doing so can corrupt state.

Pull Request Process

  1. Create an issue describing your change (if one does not already exist)
  2. Fork the repository (if you haven't already)
  3. Create a branch in your fork with a descriptive name
  4. Make your changes following coding standards
  5. Add tests for new functionality
  6. Run the test suite to ensure everything passes
  7. Update documentation as needed
  8. Push to your fork and create a pull request to the main repository
  9. Follow the automated PR workflow instructions

Testing

Quick Reference

# Install all dependencies (required for tests)
uv sync --all-extras --all-groups

# Start Ollama in a separate terminal (required for most tests)
ollama serve

# Default: include qualitative tests, skip slow tests
uv run pytest

# Fast tests only (no qualitative, ~2 min)
uv run pytest -m "not qualitative"

# Unit tests only (self-contained, no services)
uv run pytest -m unit

# Run only slow tests (>1 min)
uv run pytest -m slow

# Run specific backend tests
uv run pytest -m "ollama"
uv run pytest -m "openai"

# CI/CD mode (skips qualitative tests)
CICD=1 uv run pytest

# Lint and format
uv run ruff format .
uv run ruff check .

Required Models

Ollama

HuggingFace and cloud backends download or host models automatically. Ollama models must be pulled locally before running the tests that need them.

CI (unit + integration tests):

  • granite4:micro — default model for start_session() and most examples
  • granite4:micro-h — hybrid variant used by conftest fixtures

Examples (docs/examples/):

  • deepseek-r1:8b — safety / guardian examples
  • granite3-guardian:2b — mini-researcher guardian backend
  • granite3.2-vision — vision (Ollama chat) example
  • granite3.3:8b — m_decompose example
  • granite4:latest — melp examples
  • llama3.2 — repair-with-guardian example
  • llama3.2:3b — tutorial / mify examples (via META_LLAMA_3_2_3B)
  • qwen2.5vl:7b — vision (OpenAI-via-Ollama) example

Additional test models (test/):

  • granite4:small-h — hybrid-small tests
  • llama3.2:1b — lightweight inference tests
  • llama3:8b — legacy Llama 3 tests
  • llava — multimodal tests
  • mistral:7b — Mistral backend tests
  • smollm2:1.7b — SmolLM tests

Pull everything:

for m in granite4:micro granite4:micro-h deepseek-r1:8b \
  granite3-guardian:2b granite3.2-vision granite3.3:8b granite4:latest \
  llama3.2 llama3.2:3b \
  qwen2.5vl:7b granite4:small-h llama3.2:1b llama3:8b llava mistral:7b \
  smollm2:1.7b; do ollama pull "$m"; done
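
Before pulling everything, it can be useful to see which models are actually missing. The sketch below assumes that ollama list prints a header row followed by one model per line with the name in the first column, and that an untagged name such as llama3.2 is shown as llama3.2:latest. The helper names are hypothetical and the sample rows are made up for illustration.

```python
import subprocess

REQUIRED_CI_MODELS = ["granite4:micro", "granite4:micro-h"]


def normalize(name: str) -> str:
    """Add the implicit ':latest' tag so 'llama3.2' matches 'llama3.2:latest'."""
    return name if ":" in name else f"{name}:latest"


def parse_ollama_list(output: str) -> set[str]:
    """Extract model names from the first column of `ollama list` output."""
    rows = output.strip().splitlines()[1:]  # skip the header row
    return {normalize(row.split()[0]) for row in rows if row.strip()}


def missing_models(required: list[str], installed: set[str]) -> list[str]:
    """Return the required models that are not installed, in the given order."""
    return [model for model in required if normalize(model) not in installed]


def installed_models() -> set[str]:
    """Ask the local Ollama CLI which models are present (needs Ollama installed)."""
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return parse_ollama_list(result.stdout)


# Made-up sample output, used here so the sketch runs without Ollama.
SAMPLE_OUTPUT = """\
NAME               ID              SIZE      MODIFIED
granite4:micro     0123456789ab    2.1 GB    2 days ago
llama3.2:latest    ba9876543210    2.0 GB    3 weeks ago
"""

sample_installed = parse_ollama_list(SAMPLE_OUTPUT)
print(missing_models(REQUIRED_CI_MODELS + ["llama3.2"], sample_installed))
```

On a real machine, replace sample_installed with installed_models() and run ollama pull for each name that is reported.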

Test Markers

Tests use a four-tier granularity system (unit, integration, e2e, qualitative) plus backend and resource markers. See test/MARKERS_GUIDE.md for the full marker reference, including tier definitions, backend markers, resource gates, and auto-skip logic.

CI/CD Tests

CI runs the following checks on every pull request:

  1. Pre-commit hooks (pre-commit run --all-files) - Ruff, mypy, uv-lock, codespell
  2. Test suite (CICD=1 uv run pytest) - Skips qualitative tests for speed

To replicate CI locally:

# Run pre-commit checks (same as CI)
pre-commit run --all-files

# Run tests with CICD flag (same as CI, skips qualitative tests)
CICD=1 uv run pytest

Timing Expectations

  • Fast tests (-m "not qualitative"): ~2 minutes
  • Default tests (qualitative, no slow): Several minutes
  • Slow tests (-m slow): >1 minute each
  • Pre-commit hooks: 1-5 minutes

⚠️ Don't cancel mid-run - canceling pytest or pre-commit can corrupt state.

Common Issues & Troubleshooting

| Problem | Fix |
|---|---|
| ComponentParseError | LLM output didn't match expected type. Add examples to docstring. |
| uv.lock out of sync | Run uv sync to update lock file. |
| Ollama refused connection | Run ollama serve to start Ollama server. |
| ConnectionRefusedError (port 11434) | Ollama not running. Start with ollama serve. |
| TypeError: missing positional argument | First argument to @generative function must be session m. |
| Output is wrong/None | Model too small or needs better prompt. Try larger model or add reasoning field. |
| error: can't find Rust compiler | Python 3.13+ requires Rust for outlines. Install Rust or use Python 3.12. |
| Tests fail on Intel Mac | Use conda: conda install 'torchvision>=0.22.0' then uv pip install mellea. |
| Pre-commit hooks fail | Run pre-commit run --all-files to see specific issues. Fix or use git commit -n to bypass. If a tool reports command not found, activate the virtual environment before committing: source .venv/bin/activate. |
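
For the two Ollama connection rows, a quick way to tell "server not running" apart from other failures is to probe the port directly. This is a minimal sketch using only the standard library; the port 11434 comes from the table above, while the host 127.0.0.1 and the function name is_port_open are assumptions for a default local setup.

```python
import socket


def is_port_open(
    host: str = "127.0.0.1", port: int = 11434, timeout: float = 1.0
) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 11434.")
    else:
        print("Nothing is listening on port 11434; try `ollama serve`.")
```

An open port only shows that some process is listening there; it does not confirm that the process is Ollama or that the required models are pulled.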

Debugging Tips

# Enable debug logging
from mellea.core import MelleaLogger
MelleaLogger.get_logger().setLevel("DEBUG")

# See the exact prompt sent to the LLM (m is your active session)
print(m.last_prompt())

Getting Help

Additional Resources

Documentation

Community

Related Repositories


Feedback Loop

Found a bug, workaround, or pattern while contributing?

Help us improve this guide by opening a PR with your additions!


Thank you for contributing to Mellea! 🎉