Granite Switch Tutorials

Granite Switch consolidates multiple LoRA adapters into a single, unified checkpoint, which keeps the architecture modular. The following tutorials cover the underlying mechanics and day-to-day usage: adapter function invocation, multi-step pipelines with guardrails, and checkpoint composition.

Notebooks

Step-by-step walkthroughs covering adapter function invocation, pipeline construction, and model composition.

Notebook	Topics	Duration
hello_mellea.ipynb	Mellea adapter functions intro with vLLM	5 min
rag_101.ipynb	RAG 101: build a vector corpus and run a basic answerability check	15 min
rag_flow.ipynb	Full RAG flow with guardian checks (harm + scope)	30 min
compose_granite_switch.ipynb	Compose a checkpoint from adapter libraries	15 min
alora_vs_lora_race.ipynb	aLoRA vs LoRA race: side-by-side throughput comparison on a multi-step RAG pipeline	20 min
hello_adapter.ipynb	Minimal adapter function invocation with HuggingFace	5 min
granite_switch_with_hf.ipynb	Compose + HuggingFace backend, `adapter_name=` invocation, Core + Guardian adapter functions in a multi-turn conversation	10 min
granite_speech_demo.ipynb	Real-time voice assistant: Granite Speech STT + Granite Switch LLM + Granite Libraries validation, orchestrated by Mellea over WebRTC	10 min

Guides

Guide	Description
Using Mellea with Granite Switch	Connect Mellea to a Granite Switch model
Bring Your Own Adapter	Train, compose, and use custom adapters
Compare Inference Throughput	Compare LoRA-based and aLoRA-based models in a side-by-side inference race

Learning Paths

Composing Models

The following learning paths use prebuilt models from HuggingFace. You can also compose your own model; see Path 3.

Path 1: Inference with Mellea (Recommended)

Best for: All inference use cases — development through production

Mellea is the recommended way to invoke Granite Switch capabilities. It handles constrained decoding, prompt rewriting, and input/output processing automatically. It currently supports vLLM; HuggingFace support is coming soon.

Path 2: Real-World Pipelines (Usability)

Best for: Seeing how adapter functions compose into multi-step applications

RAG 101 - corpus build + answerability check, the smallest end-to-end RAG demo
Full RAG Flow with Guardians - rewrite, answerability, citations, harm + scope checks

Path 3: Bring Your Own Adapter

Best for: Custom adapter function development

Path 4: Low-Level Understanding (HuggingFace)

Best for: Understanding how Granite Switch works at the control-token level

The HuggingFace inference examples show how adapter functions are activated via control tokens, giving insight into the underlying mechanics. For most applications, we recommend running inference with Mellea (Path 1).

Prerequisites
Hello Adapter — see control tokens in action
Granite Switch with HuggingFace — detailed walkthrough

Reference Scripts

Runnable scripts in scripts/ for common tasks:

Script	Description
run_adapter_generation_direct.py	Direct adapter function invocation via control tokens
run_adapter_generation_mellea.py	Adapter function invocation through Mellea

Adapter Libraries

Granite Switch checkpoints embed adapters drawn from IBM's granitelib libraries. The three libraries below are featured throughout these tutorials:

Library	Purpose	Where used in tutorials	HF repo
Core	Foundational post-generation adapter functions: certainty scoring, requirement checking, and response attribution.	granite_switch_with_hf, compose_granite_switch	ibm-granite/granitelib-core-r1.0
RAG	Retrieval-augmented generation adapter functions: query rewrite, answerability, hallucination detection, and citation generation.	hello_mellea, rag_101, rag_flow, compose_granite_switch	ibm-granite/granitelib-rag-r1.0
Guardian	Safety and risk detection: harm, social bias, jailbreaking, factuality, and policy compliance checks.	hello_adapter, hello_mellea, granite_switch_with_hf, rag_flow, compose_granite_switch	ibm-granite/granitelib-guardian-r1.0
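
As a convenience, the three libraries in the table above can be looked up by name and fetched from HuggingFace. The following is a minimal sketch, not part of the tutorials themselves: the repo IDs come from the table, while the helper names `repo_for` and `download_library` are hypothetical. It assumes the `huggingface_hub` package is installed and that the repos are accessible from your account and network.

```python
from typing import Optional

# Repo IDs copied from the Adapter Libraries table; the dictionary itself
# and the helper names below are illustrative, not part of Granite Switch.
ADAPTER_LIBRARIES = {
    "core": "ibm-granite/granitelib-core-r1.0",
    "rag": "ibm-granite/granitelib-rag-r1.0",
    "guardian": "ibm-granite/granitelib-guardian-r1.0",
}


def repo_for(library: str) -> str:
    """Return the HuggingFace repo ID for a library name (case-insensitive)."""
    key = library.strip().lower()
    if key not in ADAPTER_LIBRARIES:
        known = ", ".join(sorted(ADAPTER_LIBRARIES))
        raise KeyError(f"unknown library {library!r}; expected one of: {known}")
    return ADAPTER_LIBRARIES[key]


def download_library(library: str, cache_dir: Optional[str] = None) -> str:
    """Download a snapshot of the library repo and return its local path.

    Requires network access and the huggingface_hub package, which is
    imported lazily so the lookup helper works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_for(library), cache_dir=cache_dir)
```

Calling `download_library("guardian")` would fetch the Guardian library into the local HuggingFace cache. How the downloaded adapters are then composed into a checkpoint is covered in compose_granite_switch.ipynb.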

External Resources

Resource	Description
Mellea	IBM's library for writing Generative Programs
Granite aLoRA Adapters	Official adapter libraries on HuggingFace
vLLM Documentation	High-performance inference
Granite Models	Base Granite models

Reference Documentation

For technical details, see docs/:

Supported Models — Model compatibility
Git Workflow — Contribution guidelines
