Music Store Assistant - Algorhythm Customer Support Bot


A demonstration application showcasing LLM observability with OpenTelemetry and Grafana Cloud.

This project is an SE "art of the possible" demo that shows how to instrument an AI agent application with full observability - tracking costs, performance, and execution flow. Not intended for production use. The demo implements a customer support chatbot with:

  • Multi-agent architecture (Supervisor/Router pattern)
  • Human-in-the-Loop approval for sensitive operations
  • Full OpenTelemetry instrumentation for LLM observability
  • Grafana Cloud integration with pre-built dashboard
  • Multi-provider LLM support (OpenAI, Anthropic, Google, DeepSeek)

Architecture

flowchart TD
    Start((Start)) --> supervisor

    subgraph Routing["🎯 Supervisor Router"]
        supervisor{{"Supervisor<br/>(GPT-4o-mini)"}}
    end

    supervisor -->|"music query"| music_expert
    supervisor -->|"support query"| support_rep

    subgraph Music["🎵 Music Expert"]
        music_expert["Music Expert<br/>(GPT-4o-mini)"]
        music_tools[["🔧 Music Tools<br/>• get_albums_by_artist<br/>• get_tracks_by_artist<br/>• check_for_songs<br/>• get_artists_by_genre<br/>• list_genres"]]
        music_expert -->|"needs data"| music_tools
        music_tools --> music_expert
    end

    subgraph Support["💼 Support Rep"]
        support_rep["Support Rep<br/>(GPT-4o-mini)"]
        support_tools[["🔧 Safe Tools<br/>• get_invoice<br/>• get_customer_profile"]]
        refund_tools[["⚠️ HITL Tools<br/>• process_refund"]]
        
        support_rep -->|"safe operation"| support_tools
        support_rep -->|"refund request"| hitl
        support_tools --> support_rep
        
        subgraph HITL["🛑 Human-in-the-Loop"]
            hitl{{"Interrupt<br/>for Approval"}}
            hitl -->|"approved"| refund_tools
        end
        
        refund_tools --> support_rep
    end

    music_expert -->|"done"| End((End))
    support_rep -->|"done"| End

    style supervisor fill:#4a90d9,stroke:#2d5a87,color:#fff
    style music_expert fill:#50c878,stroke:#2d7a4a,color:#fff
    style support_rep fill:#f5a623,stroke:#c77d0a,color:#fff
    style hitl fill:#e74c3c,stroke:#a93226,color:#fff
    style music_tools fill:#e8f5e9,stroke:#81c784,color:#1a3d1a
    style support_tools fill:#fff3e0,stroke:#ffb74d,color:#5d4e37
    style refund_tools fill:#ffebee,stroke:#ef5350,color:#7a1f1f

Flow Summary

| Component | Model | Purpose |
|-----------|-------|---------|
| Supervisor | GPT-4o-mini | Routes requests to the Music Expert or Support Rep |
| Music Expert | GPT-4o-mini | Catalog queries - albums, tracks, artists, genres |
| Support Rep | GPT-4o-mini | Account info, invoices, refunds |
| HITL Gate | N/A | Requires human approval for refunds |
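
The routing step is the core of this pattern. Below is a minimal sketch of how such a supervisor/router can be wired with LangGraph; it is illustrative only, with a keyword check standing in for the GPT-4o-mini classifier, and is not the project's actual graph.py.

from typing import Literal

from langgraph.graph import END, START, MessagesState, StateGraph

def supervisor(state: MessagesState) -> dict:
    # In the real app an LLM classifies the request; this node is a placeholder.
    return {}

def route(state: MessagesState) -> Literal["music_expert", "support_rep"]:
    # Hypothetical keyword routing standing in for the GPT-4o-mini classifier.
    text = state["messages"][-1].content.lower()
    return "music_expert" if ("album" in text or "track" in text) else "support_rep"

def music_expert(state: MessagesState) -> dict:
    return {"messages": [("assistant", "Here is what I found in the catalog...")]}

def support_rep(state: MessagesState) -> dict:
    return {"messages": [("assistant", "Let me look up your account...")]}

builder = StateGraph(MessagesState)
builder.add_node("supervisor", supervisor)
builder.add_node("music_expert", music_expert)
builder.add_node("support_rep", support_rep)
builder.add_edge(START, "supervisor")
builder.add_conditional_edges("supervisor", route)
builder.add_edge("music_expert", END)
builder.add_edge("support_rep", END)
graph = builder.compile()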

Quick Start

1. Prerequisites

  • Python 3.12+ (or use uv python install 3.12)
  • uv package manager
  • OpenAI API key (or Anthropic/Google)
  • Grafana Cloud account (free tier works great)

2. Setup

Clone and install:

git clone https://github.com/scarolan/music_store_assistant
cd music_store_assistant
uv sync

Download the Chinook database:

curl -o Chinook.db https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite
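
To confirm the download worked, a quick sanity check with Python's built-in sqlite3 module is enough (the table names below follow the standard Chinook schema):

import sqlite3

# List the tables in the freshly downloaded catalog database.
conn = sqlite3.connect("Chinook.db")
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # expect Album, Artist, Customer, Genre, Invoice, Track, ...
conn.close()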

3. Configure Observability

Create a .env file with your configuration (see .env.example for full options):

# Required: LLM Provider
OPENAI_API_KEY=your-openai-key-here

# Required: OTEL Tracing to Grafana Cloud
OTEL_EXPORTER_OTLP_ENDPOINT=https://otlp-gateway-prod-us-central-0.grafana.net/otlp
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic%20<your-base64-credentials>
OTEL_SERVICE_NAME=music-store-assistant

Getting OTEL credentials:

  1. Go to Grafana Cloud → Connections → OpenTelemetry
  2. Copy Instance ID and generate API token
  3. Base64 encode: echo -n "instance_id:api_token" | base64
  4. Use format: Authorization=Basic%20<result>
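
Under the hood the exporter setup is standard OpenTelemetry. The sketch below shows minimal OTLP/HTTP wiring driven by those environment variables; it is a simplified assumption of what src/otel.py does, which additionally handles auto-instrumentation and attribute filtering.

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# OTEL_EXPORTER_OTLP_ENDPOINT and OTEL_EXPORTER_OTLP_HEADERS are picked up from
# the environment by the exporter, so only the service name is set explicitly here.
provider = TracerProvider(
    resource=Resource.create({"service.name": "music-store-assistant"})
)
provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)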

4. Run the Application

uv run uvicorn src.api:app --host 0.0.0.0 --port 8080

Then open http://localhost:8080 for the customer chat interface, or http://localhost:8080/admin for the HITL approval dashboard.

5. Install the Dashboard

Import the pre-built dashboard to visualize your LLM application metrics:

  1. In Grafana Cloud, go to Dashboards → New → Import
  2. Click Upload dashboard JSON file
  3. Select llm_o11y_dashboard.json from this repository
  4. Choose your Prometheus and Tempo data sources
  5. Click Import

The dashboard provides:

  • Token usage and costs by agent, model, and conversation
  • Performance metrics (latency P50/P95/P99, request rates)
  • Error tracking with failure rates and types
  • Model distribution showing which models handle requests

6. View Traces in Grafana

Explore individual conversation traces:

  1. Go to your Grafana Cloud instance → Explore → Tempo
  2. Query: {service.name="music-store-assistant"}
  3. Click on any trace to see the full execution flow:
    • Supervisor routing decisions
    • Agent selection (Music Expert vs Support Rep)
    • Tool executions with inputs/outputs
    • LLM calls with token counts
    • Complete conversation hierarchy

Model Configuration

Models are configured via environment variables with provider auto-detection:

| Agent | Default Model | Why |
|-------|---------------|-----|
| Supervisor | gpt-4o-mini | Fast routing decisions |
| Music Expert | gpt-4o-mini | Consistent, reliable responses |
| Support Rep | gpt-4o-mini | Reliable for account operations |

# Override any model:
export MUSIC_EXPERT_MODEL=claude-3-5-haiku-20241022  # Anthropic
export MUSIC_EXPERT_MODEL=gpt-4o                      # OpenAI
export MUSIC_EXPERT_MODEL=deepseek-chat               # Budget option

Available env vars: SUPERVISOR_MODEL, MUSIC_EXPERT_MODEL, SUPPORT_REP_MODEL

The provider is auto-detected from the model name prefix (gpt-*, claude-*, gemini-*, deepseek-*).
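
As a rough illustration, prefix-based detection can be as simple as the sketch below; the function name and mapping are assumptions, not the project's actual model factory.

# Hypothetical provider lookup keyed on the model name prefix.
PROVIDER_PREFIXES = {
    "gpt-": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "deepseek-": "deepseek",
}

def detect_provider(model_name: str) -> str:
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model_name.startswith(prefix):
            return provider
    raise ValueError(f"Unrecognized model name: {model_name}")

print(detect_provider("claude-3-5-haiku-20241022"))  # -> anthropic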

Usage

Web UI (Recommended for Demos)

Quick Start (launches server + continuous traffic generation):

demo/start_demo.sh   # Starts server + generates traffic for 30 minutes
demo/stop_demo.sh    # Stops everything cleanly

Manual Start:

uv run uvicorn src.api:app --reload --host 0.0.0.0 --port 8000

Then open http://localhost:8000 for the customer chat interface, or http://localhost:8000/admin for the HITL approval dashboard.

Validate Setup:

demo/preflight_check.sh  # Checks all prerequisites (doesn't start anything)

Monitor Traffic:

tail -f /tmp/continuous-traffic.log  # Watch continuous traffic generation

Python API

from src.graph import create_graph

graph = create_graph()

# customer_id is passed via context= (secure, not in state)
result = graph.invoke(
    {"messages": [("user", "What albums does AC/DC have?")]},
    config={},
    context={"customer_id": 16}
)
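
For refund requests, the graph pauses at the HITL interrupt. A hedged sketch of resuming that paused run from Python follows; it assumes the compiled graph has a checkpointer configured and that a simple approval payload is accepted, neither of which is shown in the snippet above.

from langgraph.types import Command

config = {"configurable": {"thread_id": "demo-thread-1"}}

# 1. The refund request hits the HITL interrupt and the run pauses.
paused = graph.invoke(
    {"messages": [("user", "I want a refund for invoice 98")]},
    config=config,
    context={"customer_id": 16},
)

# 2. After a human approves (e.g. via the /admin dashboard), resume the run.
resumed = graph.invoke(Command(resume={"approved": True}), config=config)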

CLI

uv run python -m src.cli

Testing

Run the full test suite:

uv run pytest

Run specific tests:

uv run pytest -v -k test_refund  # HITL flow tests
uv run pytest -v -k test_music   # Music expert tests

Contributing

This is a demonstration project for Grafana Labs. Issues and pull requests are welcome!


License

MIT License - see LICENSE file for details.

Questions?

What You'll See: Observability in Action

Once running, you'll have full visibility into your LLM application:

📊 In Grafana Cloud

  • Token usage and costs - Track spend per conversation, agent, and model
  • Performance metrics - P50/P95/P99 latency for each operation
  • Trace hierarchy - See every LLM call, tool execution, and state transition
  • Error tracking - Identify and debug failures with full context
  • Custom dashboard - Pre-built panels for key metrics

🔍 Example Queries

Try these in the chat interface:

  • "What albums does AC/DC have?" (routes to Music Expert)
  • "Show me my recent orders" (routes to Support Rep)
  • "I want a refund for invoice 98" (triggers HITL approval)

Watch the traces appear in Grafana in real-time!

Project Structure

├── src/
│   ├── graph.py        # LangGraph definition + model factory
│   ├── state.py        # State schemas
│   ├── api.py          # FastAPI backend + HITL management
│   ├── cli.py          # Interactive CLI
│   ├── otel.py         # 🔭 OpenTelemetry configuration + filtering
│   ├── utils.py        # Database utilities
│   └── tools/
│       ├── music.py    # Read-only catalog tools
│       └── support.py  # Sensitive write tools (HITL)
├── static/
│   ├── index.html      # Customer chat interface
│   └── admin.html      # HITL approval dashboard
├── tests/              # Pytest suite (80+ tests)
├── llm_o11y_dashboard.json  # 📈 Grafana dashboard (import me!)
├── CLAUDE.md                # AI assistant context guide
├── Chinook.db               # SQLite music catalog
└── .env.example             # Configuration template

Documentation

  • CLAUDE.md - Comprehensive codebase guide for AI assistants
  • ARCHITECTURE.md - Detailed system architecture and patterns

Key Features

🎯 Agentic Patterns

  • Supervisor/Router: Intent classification and routing
  • Specialized agents: Music Expert (read-only) and Support Rep (with HITL)
  • Tool calling: Database queries and business logic
  • Human-in-the-Loop: Approval workflow for sensitive operations

🔭 Observability Features

  • OpenTelemetry instrumentation: Auto-instrumentation with OpenInference
  • Attribute filtering: Keeps traces lean (5-10KB vs 50-100KB raw)
  • Grafana Cloud export: OTLP/HTTP to Tempo
  • Pre-built dashboard: Token costs, latency, errors, model distribution
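
The auto-instrumentation bullet above usually amounts to a one-line call at startup. A minimal sketch, assuming the LangChain instrumentor from the openinference-instrumentation-langchain package is the one in use (the actual wiring, including the attribute filtering, lives in src/otel.py):

from opentelemetry import trace
from openinference.instrumentation.langchain import LangChainInstrumentor

# Instrument LangChain/LangGraph so every LLM call and tool execution emits spans
# to whatever tracer provider is already configured.
LangChainInstrumentor().instrument(tracer_provider=trace.get_tracer_provider())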

🔒 Security Patterns

  • Context schema: Customer ID passed securely (not in LLM-accessible state)
  • Scoped queries: Tools automatically filter by authenticated customer
  • HITL gate: Only sensitive operations require approval
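
A minimal sketch of the scoped-query idea, using the standard Chinook schema; the function is hypothetical and not one of the project's actual tools. The key point is that customer_id comes from the authenticated context, never from the model's arguments.

import sqlite3

def get_invoices_for_customer(customer_id: int, db_path: str = "Chinook.db") -> list[tuple]:
    # customer_id is supplied by the trusted runtime context, so the LLM cannot
    # pull another customer's invoices by manipulating its tool arguments.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT InvoiceId, InvoiceDate, Total FROM Invoice WHERE CustomerId = ?",
            (customer_id,),
        ).fetchall()
    finally:
        conn.close()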
