
πŸ’¬ Open WebUI Chat Analyzer

A local-first analytics platform for exploring your Open WebUI conversations

Transform your Open WebUI chat history into actionable insights with this comprehensive analytics stack. Featuring a FastAPI backend paired with a modern Next.js dashboard, your conversation data never leaves your environmentβ€”making it perfect for privacy-conscious teams and individual power users.

License: MIT


✨ Key Features

πŸ”’ Privacy-First Architecture

  • 100% Local Processing – All data stays on your machine
  • No External Services – Dashboard communicates only with your local backend
  • Self-Hosted by Design – Complete control over your conversation analytics
  • Adaptive Alias System – Stable pseudonyms are stored in the database with an optional real-name override

πŸ“Š Comprehensive Analytics

  • πŸ“ˆ Time Analysis – Daily trends, conversation patterns, hour-by-day heatmaps
  • πŸ“ Content Analysis – Word clouds, message length distributions, sentiment breakdown
  • πŸ’¬ Chat Browser – Full-text search, filters, and detailed conversation views
  • πŸ” Advanced Search – Query across all messages with powerful filtering options

πŸš€ Intelligent Data Loading

  • Direct Connect – Sync live from your Open WebUI instance with one click
  • File Import – Drop exports into data/ or upload through the UI
  • Instant Metrics – Dashboard updates immediately while summaries process in background
  • Incremental Sync – Smart updates that only fetch new conversations

πŸ€– AI-Powered Summaries

  • Local LLM Integration – Uses Ollama for automatic chat summarization
  • Incremental Persistence – Summaries saved as each chat completes (no data loss)
  • Smart Context – Sentence transformers identify salient utterances for better summaries
  • Fallback Support – Can use Open WebUI completions endpoint if needed

🎨 Modern UI/UX

  • Next.js 14 App Router – Fast, responsive single-page application
  • Tailwind + shadcn/ui – Beautiful, accessible component library
  • Real-Time Updates – Live processing logs and progress tracking
  • Multi-User Support – Auth.js with credentials and GitHub OAuth

🎯 Quick Start

Option A: Docker (Recommended)

git clone https://github.com/davidlarrimore/openwebui-chat-analyzer.git
cd openwebui-chat-analyzer
cp .env.example .env
make up

Access Points:

  • Dashboard: http://localhost:8503
  • Backend API: http://localhost:8502
  • Interactive API docs: http://localhost:8502/docs

Useful Commands:

make logs    # View combined logs
make down    # Stop all services
make restart # Restart services
make help    # See all available commands

Option B: Local Development

Backend:

git clone https://github.com/davidlarrimore/openwebui-chat-analyzer.git
cd openwebui-chat-analyzer
python3 -m venv venv && source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
cp .env.example .env

# First run only
python -m textblob.download_corpora

# Start backend
uvicorn backend.app:app --reload --port 8502

Frontend (in a new terminal):

cd openwebui-chat-analyzer/frontend-next
pnpm install
pnpm dev  # Runs on http://localhost:3000

Option C: Guided Setup

scripts/setup.sh  # Interactive wizard for Docker or local setup

πŸ“– Dashboard Overview

πŸ“Š Overview

  • Total conversations, messages, and user activity
  • Model usage statistics and file upload tracking
  • Approximate token volume (derived from character counts)
  • User and model breakdowns with visual charts
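The token volume shown on the Overview page is derived from character counts rather than a real tokenizer. A minimal sketch of that kind of heuristic is below; the divisor of 4 characters per token is a common rule of thumb and an assumption here, not necessarily the analyzer's exact constant.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate token count from character length.

    The 4-chars-per-token ratio is an illustrative assumption; the
    analyzer's actual heuristic may use a different constant.
    """
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))
```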

πŸ“ˆ Time Analysis

  • Daily Trends – Message volume over time
  • Conversation Length – Distribution of chat durations
  • Heatmaps – Activity by hour and day of week
  • Filters – Segment by user and model

πŸ“ Content Analysis

  • Word Clouds – Most frequently used terms
  • Message Length – Histograms by role and model
  • Sentiment Breakdown – Positive, neutral, negative classification
  • Per-User Insights – Individual communication patterns

πŸ” Search

  • Full-Text Search – Query across all messages
  • Advanced Filters – By user, model, date range, sentiment
  • Export Results – Download filtered data as CSV or JSON

πŸ’¬ Browse Chats

  • Paginated View – Browse all conversations
  • Rich Metadata – Timestamps, participants, model info
  • AI Summaries – One-line headlines for each chat
  • Quick Actions – Download individual threads as JSON

βš™οΈ Configuration

  • Data Source Management – Connect to Open WebUI or upload exports
  • Sync Settings – Configure full vs incremental sync modes
  • Automated Scheduler – Set up periodic data refreshes
  • Summarizer Settings – Choose Ollama model for AI summaries
  • Identity Privacy – Toggle between pseudonyms and real names on user-facing charts
  • Real-Time Logs – Monitor sync and processing operations
  • System Status – View connection health and data freshness

πŸ”§ Configuration

Environment Setup

Copy .env.example to .env and configure:

Backend Connectivity

OWUI_API_BASE_URL=http://localhost:8502       # Backend URL for dashboard
OWUI_API_ALLOWED_ORIGINS=http://localhost:3000 # CORS origins
OWUI_DATA_DIR=./data                           # Default export directory

Direct Connect Defaults

OWUI_DIRECT_HOST=http://localhost:3000         # Open WebUI base URL
OWUI_DIRECT_API_KEY=                          # Optional prefill API key
OWUI_EXPOSE_REAL_NAMES=false                  # Set true to expose real names by default

AI & Summarization

# Sentence Transformers
EMB_MODEL=sentence-transformers/all-MiniLM-L6-v2
SALIENT_K=10                                   # Number of salient utterances

# Ollama Configuration
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_TIMEOUT=180
OLLAMA_DEFAULT_MODEL=llama3.2:latest
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:latest

Frontend Settings

FRONTEND_NEXT_PORT=8503                        # Published dashboard port
FRONTEND_NEXT_PUBLIC_URL=http://localhost:8503 # External URL
FRONTEND_NEXT_BACKEND_BASE_URL=http://backend:8502 # Internal backend URL

# Auth.js
NEXTAUTH_SECRET=your-secret-here
NEXTAUTH_URL=http://localhost:8503

# Optional GitHub OAuth
GITHUB_OAUTH_ENABLED=false
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=

Identity Privacy

  • Pseudonym Catalog – backend/data/pseudonyms.json contains the canonical alias list used when ingesting users.
  • Stable Assignments – Pseudonyms are persisted in the database and refreshed automatically on every sync.
  • Configurable Exposure – Toggle between pseudonyms and real names from the βš™οΈ Configuration page or set OWUI_EXPOSE_REAL_NAMES=true to default to real names.
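Stable pseudonym assignment can be sketched as a deterministic pick from the alias catalog keyed by user ID, so the same user always maps to the same alias across syncs. The catalog entries and function names below are illustrative; the real catalog lives in backend/data/pseudonyms.json and the actual assignment logic may differ.

```python
import hashlib

# Illustrative aliases; the shipped catalog is backend/data/pseudonyms.json.
CATALOG = ["Aquamarine Falcon", "Copper Heron", "Jade Otter"]

def assign_alias(user_id: str, catalog: list[str]) -> str:
    """Deterministically map a user ID to an alias from the catalog.

    Hashing the ID keeps assignments stable across syncs without storing
    any mapping from alias back to the real identity.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return catalog[int(digest, 16) % len(catalog)]
```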

πŸ“₯ Loading Data

Method 1: Direct Connect (Recommended)

  1. Navigate to βš™οΈ Configuration in the dashboard
  2. Click Edit Credentials and enter:
    • Your Open WebUI base URL (e.g., http://localhost:3000)
    • An API key with read permissions
  3. Click Test Connection to verify
  4. Click Sync Data Now to import

Benefits:

  • βœ… Automatic incremental updates
  • βœ… Always in sync with your Open WebUI instance
  • βœ… No manual export/import workflow
  • βœ… Scheduler support for automated syncs

Method 2: File Upload

  1. Export from Open WebUI:

    • Settings β†’ Data & Privacy β†’ Export All Chats (all-chats-export-*.json)
    • Settings β†’ Database β†’ Export Users (users.csv, optional)
    • Capture /api/v1/models as models.json (optional, for friendly names)
  2. Import options:

    • Drop files in the data/ directory (auto-loaded on startup)
    • Upload through βš™οΈ Configuration page
    • Files stored in uploads/ directory

πŸ€– AI Summaries

How It Works

The analyzer automatically generates one-line summaries for each conversation:

  1. Salient Extraction – Uses sentence-transformers to identify key utterances
  2. LLM Summarization – Feeds context to your configured Ollama model
  3. Incremental Persistence – Saves each summary immediately (no data loss)
  4. Background Processing – Metrics update instantly; summaries generate async
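The pipeline above can be sketched as a prompt built from salient utterances and a single call to Ollama's generate endpoint. The prompt wording and function names are assumptions for illustration; the endpoint shape (`/api/generate` with `model`, `prompt`, `stream`) is the standard Ollama REST API.

```python
def build_summary_prompt(salient: list[str]) -> str:
    """Assemble a one-line-summary prompt from salient utterances.

    The wording is illustrative; the analyzer's real template may differ.
    """
    context = "\n".join(f"- {u}" for u in salient)
    return (
        "Summarize this conversation in one line.\n"
        f"Key utterances:\n{context}"
    )

def summarize(salient: list[str],
              base_url: str = "http://localhost:11434",
              model: str = "llama3.2:latest") -> str:
    """Send the prompt to a local Ollama instance (requires Ollama running)."""
    import requests  # imported here so the pure helper above has no deps
    resp = requests.post(
        f"{base_url}/api/generate",
        json={"model": model, "prompt": build_summary_prompt(salient),
              "stream": False},
        timeout=180,
    )
    resp.raise_for_status()
    return resp.json()["response"].strip()
```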

Configuration

Choose your summarization model in βš™οΈ Configuration β†’ Summarizer Settings:

  • Select from available Ollama models
  • Settings persist to database
  • Changes apply to future summarization jobs

Rebuilding Summaries

Regenerate all summaries anytime:

  • Click βš™οΈ Configuration β†’ Quick Actions β†’ Rebuild Summaries
  • Or via API: POST /api/v1/summaries/rebuild
  • Monitor progress in the Processing Log

πŸ”„ Sync Modes

Full Sync

  • When to Use: First sync, changing data sources, or recovering from issues
  • What It Does: Replaces all local data with fresh import from Open WebUI
  • Recommended: When has_data: false or hostname changes

Incremental Sync

  • When to Use: Regular updates from the same Open WebUI instance
  • What It Does: Fetches only new conversations since last sync
  • Recommended: When has_data: true and source matches
  • Benefits: Faster, preserves local summaries, efficient

The dashboard automatically recommends the appropriate mode based on your current dataset state.


πŸ”Œ Backend API

Key endpoints for integration and automation:

Dataset & Metadata

GET  /api/v1/datasets/meta          # Current dataset stats
GET  /api/v1/chats                  # Chat metadata
GET  /api/v1/messages               # Message content
GET  /api/v1/users                  # User directory
POST /api/v1/datasets/reset         # Delete all data

Direct Connect

POST /api/v1/openwebui/sync         # Sync from Open WebUI
POST /api/v1/openwebui/test         # Test connection
GET  /api/v1/sync/status            # Sync status & freshness

File Uploads

POST /api/v1/uploads/chat-export    # Upload all-chats-export.json
POST /api/v1/uploads/users          # Upload users.csv
POST /api/v1/uploads/models         # Upload models.json

Summaries

GET  /api/v1/summaries/status       # Current summarizer status
POST /api/v1/summaries/rebuild      # Regenerate all summaries
GET  /api/v1/summaries/events       # Stream summary events

Admin Settings

GET  /api/v1/admin/settings/direct-connect     # Get Direct Connect settings
PUT  /api/v1/admin/settings/direct-connect     # Update settings
GET  /api/v1/sync/scheduler                    # Get scheduler config
POST /api/v1/sync/scheduler                    # Update scheduler

Interactive API Docs: Visit http://localhost:8502/docs when backend is running
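For automation, the sync endpoints above can be called from any HTTP client. A stdlib-only sketch, assuming the default backend port from the configuration section:

```python
import json
import urllib.request

def endpoint(base: str, path: str) -> str:
    """Join the backend base URL and an API path."""
    return base.rstrip("/") + path

def sync_now(base: str = "http://localhost:8502") -> dict:
    """Trigger a sync, then return the freshness status.

    Requires the backend to be running and a Direct Connect source
    to be configured.
    """
    req = urllib.request.Request(
        endpoint(base, "/api/v1/openwebui/sync"), method="POST"
    )
    urllib.request.urlopen(req, timeout=30)
    with urllib.request.urlopen(
        endpoint(base, "/api/v1/sync/status"), timeout=30
    ) as resp:
        return json.loads(resp.read())
```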


πŸ“Š Data Export

CSV Downloads

  • What's Included: Same columns shown in dashboard tables
  • Use Cases: Analysis in Excel, pandas, Tableau, Power BI
  • Fields: Timestamps, participants, sentiment scores, token estimates

JSON Downloads

  • What's Included: Complete conversation metadata and messages
  • Format: ISO timestamps, attachments, role information
  • Use Cases: Backup, data migration, custom processing

Notes

  • Token estimates are heuristic (based on character counts)
  • Sentiment scores use TextBlob polarity scale (βˆ’1 to 1)
  • Exports reflect current filter/search state
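The sentiment labels map TextBlob's polarity score (a float in [-1, 1], from `TextBlob(text).sentiment.polarity`) onto three classes. A sketch of that thresholding follows; the ±0.1 cutoff is an assumption, not necessarily the analyzer's exact boundary.

```python
def classify_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a TextBlob polarity score (-1..1) to a sentiment label.

    The ±0.1 neutral band is illustrative; the analyzer may use a
    different cutoff.
    """
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"
```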

πŸ§ͺ Sample Data

Explore the dashboard instantly with sample data:

cp sample_data/sample_data_extract.json data/
cp sample_data/sample_users.csv data/
# Restart backend to auto-load, or upload via Configuration page

πŸ—οΈ Project Structure

openwebui-chat-analyzer/
β”œβ”€β”€ backend/                 # FastAPI application
β”‚   β”œβ”€β”€ app.py              # Main application entry point
β”‚   β”œβ”€β”€ routes.py           # API endpoint definitions
β”‚   β”œβ”€β”€ services.py         # Business logic & data processing
β”‚   β”œβ”€β”€ db.py               # SQLite database layer
β”‚   β”œβ”€β”€ models.py           # Pydantic models
β”‚   β”œβ”€β”€ summarizer.py       # AI summarization pipeline
β”‚   └── tests/              # Backend test suite
β”‚
β”œβ”€β”€ frontend-next/          # Next.js 14 dashboard
β”‚   β”œβ”€β”€ app/                # App Router pages & layouts
β”‚   β”œβ”€β”€ components/         # React components
β”‚   β”œβ”€β”€ lib/                # Utilities, types, API client
β”‚   └── tests/              # Frontend test suite
β”‚
β”œβ”€β”€ data/                   # Default export directory
β”œβ”€β”€ uploads/                # User-uploaded files
β”œβ”€β”€ scripts/                # Setup & utility scripts
β”œβ”€β”€ sample_data/            # Example datasets
β”œβ”€β”€ .env.example            # Environment template
└── docker-compose.yml      # Container orchestration

πŸ” Privacy & Security

Data Handling

  • βœ… 100% Local – All processing happens on your machine
  • βœ… No External Calls – Dashboard only talks to local backend
  • βœ… No Telemetry – Zero tracking or analytics collection
  • βœ… File-Based Storage – SQLite database in your project directory

Credential Management

  • πŸ”’ API keys stored in database with quote-safe normalization
  • πŸ”’ Password fields in UI (type="password")
  • πŸ”’ Redacted logging (shows supe...2345 instead of full key)
  • πŸ”’ Keys never appear in processing logs or responses

Authentication

  • FastAPI-managed local + OIDC sessions (AUTH_MODE = DEFAULT, HYBRID, or OAUTH)
  • Secure HttpOnly cookies with automatic refresh rotation and admin-controlled revocation
  • Microsoft Entra ID support via the /api/backend/auth/oidc/* flows
  • Middleware-protected dashboard routes that preserve the original callback URL

🧩 Advanced Features

Automatic Sync Scheduler

  • Configure periodic incremental syncs (5 min to 24 hours)
  • Enable/disable via βš™οΈ Configuration β†’ Scheduler
  • Settings persist across restarts
  • Runs in background without blocking dashboard

Data Freshness Indicators

  • Staleness Threshold: Configurable via SYNC_STALENESS_THRESHOLD_HOURS (default: 6 hours)
  • Visual Pills: Green "Current" / Amber "Stale" indicators
  • Last Sync Display: Human-readable timestamps with relative time
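The staleness check behind those pills reduces to comparing the last sync time against the configured threshold. A sketch, assuming UTC timestamps:

```python
from datetime import datetime, timedelta, timezone

def is_stale(last_sync: datetime, threshold_hours: float = 6.0) -> bool:
    """True when the last sync is older than the staleness threshold.

    threshold_hours corresponds to SYNC_STALENESS_THRESHOLD_HOURS
    (default 6); last_sync must be timezone-aware UTC.
    """
    return datetime.now(timezone.utc) - last_sync > timedelta(hours=threshold_hours)
```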

Processing Log Viewer

  • Real-Time Streaming: Polls /api/v1/logs every 2 seconds
  • Auto-Scroll: Follows new entries (disable by scrolling up)
  • Structured Logs: Timestamp, level, phase, job ID, message, details
  • Circular Buffer: Retains last 200 events
  • Color-Coded Levels: Debug, info, warning, error
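A 200-event circular buffer of structured log entries can be sketched with `collections.deque`, which discards the oldest entry once full. Class and field names here are illustrative, not the backend's actual implementation.

```python
from collections import deque
from datetime import datetime, timezone

class ProcessingLog:
    """Sketch of a fixed-size circular log buffer (names illustrative)."""

    def __init__(self, maxlen: int = 200):
        # deque with maxlen silently drops the oldest event when full
        self.events: deque = deque(maxlen=maxlen)

    def emit(self, level: str, phase: str, message: str, **details) -> None:
        self.events.append({
            "ts": datetime.now(timezone.utc).isoformat(),
            "level": level,
            "phase": phase,
            "message": message,
            "details": details,
        })
```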

WAL Mode & Performance

  • SQLite Write-Ahead Logging for better concurrency
  • Foreign key enforcement for data integrity
  • Normal synchronous mode for speed with safety
  • Prevents long locks during large syncs
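Those three settings are plain SQLite PRAGMAs. A sketch of opening a connection with them applied (the function name is illustrative; the PRAGMA statements themselves are standard SQLite):

```python
import sqlite3

def connect(path: str) -> sqlite3.Connection:
    """Open a SQLite database with WAL, foreign keys, and NORMAL sync."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")    # readers don't block writers
    conn.execute("PRAGMA foreign_keys=ON")     # enforce referential integrity
    conn.execute("PRAGMA synchronous=NORMAL")  # fewer fsyncs, safe with WAL
    return conn
```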

πŸ› οΈ Development

Frontend Development

cd frontend-next
pnpm dev        # Start dev server (http://localhost:3000)
pnpm build      # Production build
pnpm test       # Run test suite
pnpm lint       # ESLint check

Backend Development

source venv/bin/activate
uvicorn backend.app:app --reload --port 8502  # Auto-reload on changes
pytest backend/tests/                          # Run tests

Docker Development

make dev        # Hot-reload for both frontend and backend
make logs       # Tail all container logs
make shell      # Access backend container shell

Testing

# Backend
pytest backend/tests/ -v

# Frontend
cd frontend-next && pnpm test

πŸ› Troubleshooting

Dashboard Won't Start

  • βœ… Verify Node.js 20+ is installed: node --version
  • βœ… Verify pnpm is installed: pnpm --version
  • βœ… Clear cache: pnpm store prune

Backend Connection Issues

  • βœ… Confirm backend is running: curl http://localhost:8502/health
  • βœ… Check OWUI_API_BASE_URL in .env
  • βœ… Verify no port conflicts: lsof -i :8502

Summarizer Failing

  • βœ… Confirm Ollama is running: ollama list
  • βœ… Verify model is available: ollama run llama3.2:latest
  • βœ… Check OLLAMA_BASE_URL in .env
  • βœ… Increase timeout: OLLAMA_TIMEOUT=300

Word Clouds Not Rendering

  • βœ… Install system fonts: sudo apt-get install fonts-dejavu
  • βœ… Restart backend after font installation

Database Locked Errors

  • βœ… Ensure only one backend instance is running
  • βœ… Check for stale lock files in database directory
  • βœ… Verify WAL mode is enabled (automatic in recent versions)

Sync Shows "Stale" Data

  • βœ… Run manual sync from βš™οΈ Configuration
  • βœ… Adjust SYNC_STALENESS_THRESHOLD_HOURS if needed
  • βœ… Enable automatic scheduler for regular updates

🀝 Contributing

Contributions are welcome! Please review AGENTS.md for:

  • Coding standards and conventions
  • Development workflow guidelines
  • Testing requirements
  • Release procedures

πŸ“„ License

This project is licensed under the MIT License; see the LICENSE file for details.


πŸ™ Acknowledgments

Built with FastAPI, Next.js 14, Tailwind CSS, shadcn/ui, SQLite, Ollama, sentence-transformers, TextBlob, and Auth.js.


πŸ“¬ Support

Questions, bug reports, or feature requests? Open an issue on the GitHub repository.

Made with ❀️ for the Open WebUI community
