Granite Switch Tutorials

Granite Switch facilitates a modular architecture by consolidating multiple LoRA adapters into a single, unified checkpoint. The following tutorials explore the underlying mechanics and usability, detailing adapter function invocation, multi-step pipelines with guardrails, and checkpoint composition.

Notebooks

Step-by-step walkthroughs covering adapter function invocation, pipeline construction, and model composition.

| Notebook | Topics | Duration | Colab |
| --- | --- | --- | --- |
| hello_mellea.ipynb | Mellea adapter functions intro with vLLM | 5 min | Open In Colab |
| rag_101.ipynb | RAG 101: build a vector corpus and run a basic answerability check | 15 min | Open In Colab |
| rag_flow.ipynb | Full RAG flow with guardian checks (harm + scope) | 30 min | Open In Colab |
| compose_granite_switch.ipynb | Compose a checkpoint from adapter libraries | 15 min | |
| alora_vs_lora_race.ipynb | aLoRA vs LoRA race: side-by-side throughput comparison on a multi-step RAG pipeline | 20 min | Open In Colab |
| hello_adapter.ipynb | Minimal adapter function invocation with HuggingFace | 5 min | Open In Colab |
| granite_switch_with_hf.ipynb | Compose + HuggingFace backend, adapter_name= invocation, Core + Guardian adapter functions in a multi-turn conversation | 10 min | Open In Colab |
| granite_speech_demo.ipynb | Real-time voice assistant: Granite Speech STT + Granite Switch LLM + Granite Libraries validation, orchestrated by Mellea over WebRTC | 10 min | Open In Colab |

Guides

| Guide | Description |
| --- | --- |
| Using Mellea with Granite Switch | Connect Mellea to a Granite Switch model |
| Bring Your Own Adapter | Train, compose, and use custom adapters |
| Compare Inference Throughput | Compare LoRA-based and aLoRA-based models in an inference race setup |
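To give a feel for what an inference race measures, here is a minimal timing harness. It is a sketch under stated assumptions, not the method used in the guide or the notebook: `measure_throughput`, `backend_a_generate`, and `backend_b_generate` are hypothetical names, and the two backends are stubs that only sleep and split the prompt. The sleep durations are arbitrary and say nothing about real LoRA or aLoRA performance.

```python
import time
from typing import Callable, Dict, List


def measure_throughput(generate: Callable[[str], List[str]], prompts: List[str]) -> float:
    """Return generated tokens per second, summed over all prompts."""
    total_tokens = 0
    start = time.perf_counter()
    for prompt in prompts:
        total_tokens += len(generate(prompt))
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed if elapsed > 0 else float("inf")


# Hypothetical stand-ins for two model backends. Replace these stubs with
# real generation calls; the sleeps are arbitrary placeholders.
def backend_a_generate(prompt: str) -> List[str]:
    time.sleep(0.002)
    return prompt.split()


def backend_b_generate(prompt: str) -> List[str]:
    time.sleep(0.001)
    return prompt.split()


prompts = ["what is the capital of France"] * 20
results: Dict[str, float] = {
    "backend_a": measure_throughput(backend_a_generate, prompts),
    "backend_b": measure_throughput(backend_b_generate, prompts),
}
for name, tokens_per_second in results.items():
    print(f"{name}: {tokens_per_second:.1f} tokens/s")
```

Running both backends over the same prompt list keeps the comparison like-for-like. A real race would also need warm-up runs and repeated trials before the numbers mean anything.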

Learning Paths

Composing Models

The following learning paths use pre-built models from HuggingFace. To compose your own model, see Path 3.

Path 1: Inference with Mellea (Recommended)

Best for: All inference use cases — development through production

Mellea is the recommended way to invoke Granite Switch capabilities. It handles constrained decoding, prompt rewriting, and input/output processing automatically. Mellea currently supports vLLM; HuggingFace support is coming soon.

  1. Prerequisites
  2. Hello Mellea Open In Colab

Path 2: Real-World Pipelines (Usability)

Best for: Seeing how adapter functions compose into multi-step applications

  1. RAG 101 - corpus build + answerability check, the smallest end-to-end RAG demo Open In Colab
  2. Full RAG Flow with Guardians - rewrite, answerability, citations, harm + scope checks Open In Colab
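The stages listed above (rewrite, retrieval, answerability, generation, harm check) can be sketched as a chain of callables in which any check may stop the flow early. This is an illustrative toy, not the notebooks' code: every function name here is hypothetical, the corpus is two sentences, and keyword overlap stands in for both the vector search and the adapter functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class PipelineResult:
    answer: Optional[str]
    stopped_at: Optional[str] = None  # name of the check that blocked, if any
    trace: List[str] = field(default_factory=list)


def run_guarded_rag(
    query: str,
    rewrite: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    is_answerable: Callable[[str, List[str]], bool],
    generate: Callable[[str, List[str]], str],
    is_harmful: Callable[[str], bool],
) -> PipelineResult:
    """Run each stage in order; return early when a check fails."""
    trace = ["rewrite", "retrieve", "answerability"]
    rewritten = rewrite(query)
    docs = retrieve(rewritten)
    if not is_answerable(rewritten, docs):
        return PipelineResult(None, "answerability", trace)
    answer = generate(rewritten, docs)
    trace += ["generate", "harm"]
    if is_harmful(answer):
        return PipelineResult(None, "harm", trace)
    return PipelineResult(answer, None, trace)


# Toy stand-ins (all hypothetical) for the real adapter functions.
CORPUS = [
    "Granite Switch consolidates multiple LoRA adapters into one checkpoint.",
    "Mellea is a library for writing generative programs.",
]


def keywords(text: str) -> Set[str]:
    """Lowercased words longer than three characters, punctuation stripped."""
    words = {w.strip(".,?!").lower() for w in text.split()}
    return {w for w in words if len(w) > 3}


def toy_retrieve(query: str) -> List[str]:
    return [doc for doc in CORPUS if keywords(query) & keywords(doc)]


def ask(query: str) -> PipelineResult:
    return run_guarded_rag(
        query,
        rewrite=lambda q: q.strip(),
        retrieve=toy_retrieve,
        is_answerable=lambda q, docs: bool(docs),
        generate=lambda q, docs: docs[0],
        is_harmful=lambda text: "attack" in text.lower(),
    )


print(ask("What does Mellea do?"))
print(ask("Who won the 1998 final?"))
```

Passing each stage in as a callable keeps the control flow separate from the model calls, so a stub can later be swapped for a real adapter function without touching the pipeline itself.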

Path 3: Bring Your Own Adapter

Best for: Custom adapter function development

  1. Bring Your Own Adapter Guide
  2. Configure Your Own Adapter Guide
  3. Compose Your Checkpoint

Path 4: Low-Level Understanding (HuggingFace)

Best for: Understanding how Granite Switch works at the control-token level

HuggingFace inference examples demonstrate how adapter functions are activated via control tokens, providing insight into the underlying mechanics. For most applications, we recommend running inference with Mellea (Path 1).

  1. Prerequisites
  2. Hello Adapter — see control tokens in action Open In Colab
  3. Granite Switch with HuggingFace — detailed walkthrough Open In Colab
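As a rough mental model of control-token activation, the toy below routes a prompt to a handler based on a marker embedded in the text. This README does not describe the actual control-token format used by Granite Switch, so the `<|adapter:NAME|>` syntax, the `dispatch` helper, and the handler names are all invented for this sketch; in the real model the selection happens inside the network rather than in Python string handling.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Hypothetical control-token syntax, invented for this sketch only.
CONTROL_TOKEN = re.compile(r"<\|adapter:(\w+)\|>")


def split_control_token(prompt: str) -> Tuple[Optional[str], str]:
    """Return (adapter_name, prompt_without_token); name is None if absent."""
    match = CONTROL_TOKEN.search(prompt)
    if match is None:
        return None, prompt
    cleaned = (prompt[: match.start()] + prompt[match.end():]).strip()
    return match.group(1), cleaned


def dispatch(
    prompt: str,
    adapters: Dict[str, Callable[[str], str]],
    base: Callable[[str], str],
) -> str:
    """Route to the named adapter handler, or to the base handler if no token."""
    name, text = split_control_token(prompt)
    if name is None:
        return base(text)
    if name not in adapters:
        raise KeyError(f"unknown adapter: {name}")
    return adapters[name](text)


# Hypothetical handlers standing in for adapter functions.
ADAPTERS: Dict[str, Callable[[str], str]] = {
    "certainty": lambda text: f"[certainty] {text}",
    "harm": lambda text: f"[harm] {text}",
}


def base_model(text: str) -> str:
    return f"[base] {text}"


print(dispatch("Is this safe? <|adapter:harm|>", ADAPTERS, base_model))
print(dispatch("Hello there", ADAPTERS, base_model))
```

The point of the sketch is that one entry point serves several behaviours, and the marker in the input decides which one runs. The notebooks above show the real tokens and how they are placed.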

Reference Scripts

Runnable scripts in scripts/ for common tasks:

| Script | Description |
| --- | --- |
| run_adapter_generation_direct.py | Direct adapter function invocation via control tokens |
| run_adapter_generation_mellea.py | Adapter function invocation through Mellea |

Adapter Libraries

Granite Switch checkpoints embed adapters drawn from IBM's granitelib libraries. The three libraries below are featured throughout these tutorials:

| Adapter | Purpose | Where used in tutorials | HF repo |
| --- | --- | --- | --- |
| Core | Foundational post-generation adapter functions: certainty scoring, requirement checking, and response attribution. | granite_switch_with_hf, compose_granite_switch | ibm-granite/granitelib-core-r1.0 |
| RAG | Retrieval-augmented generation adapter functions: query rewrite, answerability, hallucination detection, and citation generation. | hello_mellea, rag_101, rag_flow, compose_granite_switch | ibm-granite/granitelib-rag-r1.0 |
| Guardian | Safety and risk detection: harm, social bias, jailbreaking, factuality, and policy compliance checks. | hello_adapter, hello_mellea, granite_switch_with_hf, rag_flow, compose_granite_switch | ibm-granite/granitelib-guardian-r1.0 |

External Resources

| Resource | Description |
| --- | --- |
| Mellea | IBM's library for writing Generative Programs |
| Granite aLoRA Adapters | Official adapter libraries on HuggingFace |
| vLLM Documentation | High-performance inference |
| Granite Models | Base Granite models |

Reference Documentation

For technical details, see docs/: