LiteLLM-X-Server-Config

A self-hosted LLM proxy stack built around LiteLLM, CLIProxyAPI, PostgreSQL, Netdata, Traefik, and optional Cloudflare Tunnel. It provides centralized API key management, model routing, access-group control, Claude Code request validation, and optional monitoring for a Docker Swarm deployment managed through Portainer.

Architecture

This repository is deployed as multiple Docker Swarm stacks:

Infrastructure stack (portainer/portainer.yaml)

  • traefik: Shared ingress proxy for Portainer, LiteLLM, CLIProxyAPI, and Netdata
  • cloudflared: Optional Cloudflare Tunnel connector that forwards public hostnames to Traefik
  • portainer: Docker Swarm management UI
  • agent: Portainer agent for Swarm node access

Application data stack (llmproxy-data.yaml)

  • db: PostgreSQL database for LiteLLM state, usage logs, and model configuration

Application stack (llmproxy.yaml)

  • cli-proxy-api: Anthropic-compatible proxy and auth service
  • litellm: Core routing layer and LiteLLM admin UI

Monitoring stack (monitoring/netdata.yaml)

  • netdata: Host and container monitoring dashboard
  • config-generator: Sidecar that watches Docker labels and generates Netdata collector configs

Networks

  • internal: private overlay network between application services and PostgreSQL
  • public: shared overlay network for Traefik, optional cloudflared, and routed services
  • monitoring: shared overlay network used by Netdata auto-discovery

Routing modes

The stacks support two ingress modes:

  • Default: Cloudflare Tunnel forwards requests to Traefik on http://traefik:80, so Let's Encrypt is not required.
  • Optional: Traefik can terminate HTTPS itself with Let's Encrypt by deploying portainer/portainer.letsencrypt.yaml and switching router env vars from web/false to websecure/true.

Shared routing variables

Use the same routing variables in portainer/.env.example, monitoring/.env.example, and .env.example:

  • TRAEFIK_ROUTER_ENTRYPOINTS (default: web): set to websecure when Traefik terminates HTTPS
  • TRAEFIK_ROUTER_TLS (default: false): set to true when Traefik terminates HTTPS
  • LETSENCRYPT_RESOLVER (default: le): change only if you use a different Traefik resolver name
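For example, when Traefik terminates HTTPS itself, each of the three .env files would carry the same values:

```shell
# Shared routing variables for Traefik-managed HTTPS; keep these identical
# in portainer/.env, monitoring/.env, and .env
TRAEFIK_ROUTER_ENTRYPOINTS=websecure
TRAEFIK_ROUTER_TLS=true
LETSENCRYPT_RESOLVER=le
```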

Prerequisites

# Install ptctools
uv tool install ptctools --from git+https://github.com/tamntlib/ptctools.git

Installation

1. Install Portainer CE, Traefik, and optional Cloudflare Tunnel

Choose how hostnames reach Traefik

Default Cloudflare Tunnel flow:

  • Create a remotely managed tunnel in Cloudflare Zero Trust.
  • Add public hostnames such as portainer.example.com, netdata.example.com, llm.example.com, and cli-proxy-api.llm.example.com.
  • Point each public hostname at http://traefik:80.

Optional direct HTTPS with Let's Encrypt:

  • Create DNS A/AAAA records for the same hostnames and point them to your server IP.

Create the Portainer config directory on the server

ssh root@<ip> 'mkdir -p /opt/portainer'

Copy the Portainer stack files to the server

scp portainer/portainer.yaml portainer/portainer.letsencrypt.yaml root@<ip>:/opt/portainer/
scp portainer/.env.example root@<ip>:/opt/portainer/.env

SSH to the server

Install Docker

https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository

Initialize Docker Swarm and configure the Portainer stack

docker swarm init

Edit /opt/portainer/.env and set at least:

  • PORTAINER_HOST
  • CLOUDFLARE_TUNNEL_REPLICAS=1 and CLOUDFLARE_TUNNEL_TOKEN=<token> if you want the built-in cloudflared service enabled
  • the shared routing variables from Routing modes if you want values other than the defaults

Deploy Portainer with the default non-Let's Encrypt setup

docker stack deploy -c /opt/portainer/portainer.yaml portainer
Optional: enable Let's Encrypt on Traefik

If you want Traefik to terminate HTTPS itself:

  1. Set TRAEFIK_ROUTER_ENTRYPOINTS=websecure and TRAEFIK_ROUTER_TLS=true in /opt/portainer/.env.
  2. Set LETSENCRYPT_EMAIL=<email> in /opt/portainer/.env.
  3. Set the same TRAEFIK_ROUTER_ENTRYPOINTS=websecure and TRAEFIK_ROUTER_TLS=true values in monitoring/.env and .env before deploying those stacks.
  4. Deploy with the override file:

docker stack deploy -c /opt/portainer/portainer.yaml -c /opt/portainer/portainer.letsencrypt.yaml portainer

2. Deploy the monitoring stack

Deploy this first so the shared monitoring overlay network exists before the application stacks join it.

Expose the Netdata hostname

  • Cloudflare Tunnel: add netdata.example.com as a public hostname to http://traefik:80
  • Direct Let's Encrypt: add netdata.example.com as a DNS record to your server IP

Set environment variables

Copy monitoring/.env.example to monitoring/.env and fill in the values:

cp monitoring/.env.example monitoring/.env

Required environment variables:

  • NETDATA_HOST: Hostname for the Netdata dashboard
  • NETDATA_BASIC_AUTH: Basic auth credentials for Traefik

If you use Traefik-managed HTTPS, also set the shared routing variables from Routing modes.

Create configs and deploy

ptctools docker config set -n monitoring_netdata-conf -f 'monitoring/configs/netdata.conf'
ptctools docker config set -n monitoring_config-generator-script -f 'monitoring/scripts/netdata-config-generator.sh'
ptctools docker stack deploy -n monitoring -f 'monitoring/netdata.yaml' --ownership team

3. Deploy the application stacks from your local machine

Expose the application hostnames

  • Cloudflare Tunnel: add llm.example.com and cli-proxy-api.llm.example.com as public hostnames to http://traefik:80
  • Direct Let's Encrypt: add the same hostnames as DNS records to your server IP

Set environment variables

Copy .env.example to .env and fill in the values:

cp .env.example .env

Required environment variables:

  • DB_USER, DB_PASSWORD, DB_NAME: PostgreSQL credentials
  • LITELLM_HOST, LITELLM_MASTER_KEY, LITELLM_SALT_KEY: LiteLLM configuration
  • CLI_PROXY_API_HOST: CLIProxyAPI hostname
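LITELLM_MASTER_KEY and LITELLM_SALT_KEY should be long random secrets. A minimal sketch for generating them (the sk- prefix follows LiteLLM's usual key convention; the exact format is otherwise up to you):

```python
# Generate random secrets suitable for LITELLM_MASTER_KEY and LITELLM_SALT_KEY.
import secrets

master_key = "sk-" + secrets.token_urlsafe(32)  # LiteLLM keys conventionally start with "sk-"
salt_key = secrets.token_urlsafe(32)            # used by LiteLLM to encrypt stored credentials

print(f"LITELLM_MASTER_KEY={master_key}")
print(f"LITELLM_SALT_KEY={salt_key}")
```

Note that LITELLM_SALT_KEY should not change after the first deployment, or previously encrypted credentials in PostgreSQL become unreadable.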

Optional environment variables used by the stack:

  • the shared routing variables from Routing modes when using Traefik-managed HTTPS
  • CLAUDE_CODE_MODELS: Comma-separated model names that should enforce Claude Code checks
  • CLAUDE_CODE_MIN_VERSION: Minimum allowed Claude Code version for those models
  • SLACK_WEBHOOK_URL: LiteLLM Slack webhook
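The version gate driven by CLAUDE_CODE_MODELS and CLAUDE_CODE_MIN_VERSION can be pictured as a User-Agent check along these lines (a simplified sketch, not the actual configs/claude_code_hook.py; the claude-cli/<version> User-Agent shape is an assumption for illustration):

```python
# Simplified sketch of a Claude Code version gate: extract a version from the
# User-Agent header and reject requests below the configured minimum.
import re

def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) for tuple comparison."""
    return tuple(int(part) for part in text.split("."))

def is_allowed(user_agent: str, min_version: str = "1.0.0") -> bool:
    match = re.search(r"claude-cli/(\d+(?:\.\d+)*)", user_agent)
    if match is None:
        return False  # not a Claude Code client at all
    return parse_version(match.group(1)) >= parse_version(min_version)

print(is_allowed("claude-cli/1.2.3 (external)"))  # at or above the minimum
print(is_allowed("claude-cli/0.9.0", "1.0.0"))    # too old, rejected
```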

Upload configs and deploy

export PORTAINER_URL=https://portainer.example.com
export PORTAINER_ACCESS_TOKEN=<token>

ptctools docker config set -n llmproxy_litellm-config-yaml -f 'configs/litellm.yaml' --ownership team
ptctools docker config set -n llmproxy_litellm-claude-code-hook-py -f 'configs/claude_code_hook.py' --ownership team
ptctools docker config set -n llmproxy_cli-proxy-api-config-yaml -f 'configs/cli-proxy-api.yaml' --ownership team

ptctools docker stack deploy -n llmproxy-data -f 'llmproxy-data.yaml' --ownership team
ptctools docker stack deploy -n llmproxy -f 'llmproxy.yaml' --ownership team

LiteLLM management

cd litellm_scripts

# Generate a resolved config from config.json + config.local.json
python3 gen_config.py

# Full sync of credentials, models, aliases, fallbacks, and public model hub
python3 config.py --only credentials,models,aliases,fallbacks,public_model_hub --force --prune

# Sync specific components
python3 config.py --only models --force
python3 config.py --only aliases,fallbacks,public_model_hub
python3 config.py --only public_model_hub

# Create a LiteLLM user and API key
python3 create_api_key.py user@example.com
python3 create_api_key.py user@example.com --alias my-key

Required environment variables in litellm_scripts/.env:

  • LITELLM_API_KEY
  • LITELLM_BASE_URL

Configuration files

  • portainer/portainer.yaml: Infrastructure stack with Traefik, Portainer, and optional Cloudflare Tunnel
  • portainer/portainer.letsencrypt.yaml: Optional Traefik override that enables Let's Encrypt ACME and the HTTP-to-HTTPS redirect
  • portainer/.env: Environment variables for the Portainer/Traefik stack
  • llmproxy-data.yaml: PostgreSQL Docker Swarm stack
  • llmproxy.yaml: Application Docker Swarm stack for LiteLLM and CLIProxyAPI
  • monitoring/netdata.yaml: Monitoring stack with Netdata and the label-watching config generator
  • configs/litellm.yaml: LiteLLM runtime config (callbacks, DB batching, connection pool settings)
  • configs/cli-proxy-api.yaml: CLIProxyAPI runtime config
  • configs/claude_code_hook.py: LiteLLM callback that enforces Claude Code User-Agent and minimum-version rules
  • litellm_scripts/config.json: Base provider/model/alias/fallback/public-model-hub config
  • litellm_scripts/config.local.json: Local overrides including API keys (gitignored, deep-merged with config.json)
  • litellm_scripts/config.gen.json: Resolved config generated by gen_config.py, with LiteLLM-ready credential and model request bodies
  • .env: Environment variables for the application stacks
  • monitoring/.env: Environment variables for the monitoring stack

Local configuration (config.local.json)

Create litellm_scripts/config.local.json to add API keys and local overrides:

{
  "providers": {
    "my-provider": {
      "api_key": "sk-your-api-key-here"
    },
    "another-provider": {
      "api_key": "sk-another-key"
    }
  }
}

This file is deep-merged with config.json, so you only need to specify overrides. Provider configs can also use $extend in config.json and override or disable inheritance in config.local.json.
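The deep-merge behaviour can be sketched as follows (an illustrative reimplementation, not the script's actual code). Note that non-dict values are replaced outright rather than merged, which matches how the public_model_hub array behaves:

```python
# Illustrative deep merge: config.local.json values override config.json
# key-by-key, recursing into nested dicts instead of replacing them wholesale.
def deep_merge(base: dict, override: dict) -> dict:
    merged = dict(base)
    for key, value in override.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)  # recurse into nested dicts
        else:
            merged[key] = value  # scalars, lists, and new keys replace outright
    return merged

base = {"providers": {"my-provider": {"api_base": "https://api.example.com",
                                      "models": {"model-a": None}}}}
local = {"providers": {"my-provider": {"api_key": "sk-your-api-key-here"}}}
merged = deep_merge(base, local)
```

After merging, my-provider keeps its api_base and models from config.json and gains the api_key from config.local.json.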

Interface-level api_base

Each interface may override the provider-level api_base. This is useful when a single provider exposes different OpenAI-compatible and Anthropic-compatible endpoints.

{
  "providers": {
    "my-provider": {
      "api_base": "https://shared-gateway.example.com",
      "interfaces": {
        "anthropic": {
          "api_base": "https://custom-anthropic.example.com"
        },
        "openai": {
          "api_base": "https://custom-openai.example.com/v1"
        }
      }
    }
  }
}

Rules:

  • interface-level api_base overrides the provider-level api_base for credential generation and interface-specific model discovery
  • interface-level models_api_base may be set separately when the /models endpoint lives on a different base URL
  • if interface models_api_base is omitted, model discovery falls back to interface api_base, then provider-level models_api_base, then provider-level api_base

public_model_hub and is_public_model_hub

Use public_model_hub to add explicit model groups or aliases to LiteLLM's public model hub:

{
  "public_model_hub": [
    "claude-opus-4-7"
  ]
}

Use is_public_model_hub to derive public model hub entries from config defaults:

{
  "providers": {
    "my-provider": {
      "is_public_model_hub": true,
      "interfaces": {
        "openai": {
          "models": {
            "model-a": null,
            "model-b": {
              "is_public_model_hub": false
            }
          }
        }
      }
    }
  }
}

Rules:

  • provider-level is_public_model_hub is the default for all models under that provider
  • model-level is_public_model_hub overrides the provider default
  • if is_public_model_hub is omitted, it is treated as false
  • public_model_hub entries are combined from three sources by default: derived model entries, alias names, and the explicit public_model_hub array
  • set public_model_hub_autofill_disabled: true to disable derived model entry autofill
  • set public_model_hub_aliases_autofill_disabled: true to disable alias-name autofill
  • in config.local.json, the public_model_hub array replaces the base list instead of merging element-by-element
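The assembly of the three sources can be sketched as follows (an illustrative reimplementation; the aliases key used for the alias-name source is an assumption about the config layout):

```python
# Sketch of how public model hub entries are assembled from three sources:
# derived model entries, alias names, and the explicit public_model_hub array.
def build_public_model_hub(config: dict) -> list[str]:
    entries: list[str] = []
    if not config.get("public_model_hub_autofill_disabled"):
        for provider in config.get("providers", {}).values():
            provider_default = provider.get("is_public_model_hub", False)
            for interface_name, interface in provider.get("interfaces", {}).items():
                prefix = interface.get("model_name_prefix", interface_name)
                for model_id, model in (interface.get("models") or {}).items():
                    model = model or {}  # "model-id": null means no overrides
                    if model.get("is_public_model_hub", provider_default):
                        entries.append(f"{prefix}/{model_id}")
    if not config.get("public_model_hub_aliases_autofill_disabled"):
        entries.extend(config.get("aliases", {}).keys())
    entries.extend(config.get("public_model_hub", []))
    return sorted(set(entries))
```

With the example config above plus the explicit ["claude-opus-4-7"] array, this yields claude-opus-4-7 and openai/model-a, while model-b stays private.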

model_name_prefix

Each interface may define model_name_prefix to control derived model group names. When omitted, it defaults to the interface name.

Default examples:

  • interfaces.anthropic.models.claude-sonnet-4-6 resolves to anthropic/claude-sonnet-4-6
  • interfaces.openai.models.gpt-5.4 resolves to openai/gpt-5.4
  • interfaces.gemini.models.gemini-2.5-pro resolves to gemini/gemini-2.5-pro
{
  "providers": {
    "my-provider": {
      "interfaces": {
        "anthropic": {
          "model_name_prefix": "anthropic",
          "models": {
            "claude-sonnet-4-6": null
          }
        }
      }
    }
  }
}

With no explicit model_name, the generated model group name becomes <model_name_prefix>/<model-id>. In the example above, claude-sonnet-4-6 resolves to anthropic/claude-sonnet-4-6.

If model_name is set on a model, it still wins and fully overrides the derived prefix-based name.

These resolved prefixed names are the ones used by generated models and should be the names you reference in:

  • aliases targets
  • fallbacks
  • public_model_hub
  • model_name_base_model_map entries when you want to key by resolved model name instead of raw provider model ID

Model-level access_groups

Individual models can override the provider-level access_groups by specifying access_groups in their model config:

{
  "providers": {
    "my-provider": {
      "access_groups": ["General"],
      "models": {
        "model-a": null,
        "model-b": {
          "access_groups": ["Premium"]
        }
      }
    }
  }
}

  • model-a inherits the provider-level access_groups: ["General"]
  • model-b uses its own access_groups: ["Premium"]

Backup and restore

# Volume backup/restore (uses Duplicati)
ptctools docker volume backup -v vol1,vol2 -o s3://mybucket
ptctools docker volume restore -i s3://mybucket/vol1
ptctools docker volume restore -v vol1 -i s3://mybucket/vol1

# Database backup/restore (uses minio/mc for S3)
ptctools docker db backup -c container_id -v db_data \
  --db-user postgres --db-name mydb -o backup.sql.gz
ptctools docker db backup -c container_id -v db_data \
  --db-user postgres --db-name mydb -o s3://mybucket/backups/db.sql.gz

ptctools docker db restore -c container_id -v db_data \
  --db-user postgres --db-name mydb -i backup.sql.gz
ptctools docker db restore -c container_id -v db_data \
  --db-user postgres --db-name mydb -i s3://mybucket/backups/db.sql.gz

Monitoring

Netdata collects host, container, and PostgreSQL metrics.

Metrics retention

Netdata limits local metrics storage to 10 GiB in monitoring/configs/netdata.conf, which provides roughly 2-4 weeks of retention depending on metric volume.

Auto-discovery

Services can self-register for PostgreSQL monitoring by adding Docker labels:

deploy:
  labels:
    - netdata.postgres.name=my_database
    - netdata.postgres.dsn=postgresql://user:pass@host:5432/dbname

networks:
  - monitoring

The service must also join the shared monitoring network so the Netdata stack can reach it.