K8flex - AI-Powered Kubernetes Debug Agent

K8flex is an AI-powered incident response agent that receives Alertmanager webhooks and performs automated Kubernetes debugging. It learns from user feedback and maintains a knowledge base of past incidents for faster resolution.

Features

Automated Debugging: Gathers logs, events, pod status, services, and network policies from Kubernetes
Multi-LLM Support: Ollama (self-hosted), OpenAI, Anthropic Claude, Google Gemini, or AWS Bedrock
Real-Time Streaming: Analysis streams progressively to Slack as it develops
Learning System: Rate analyses with ✅/❌ in Slack; system learns from feedback
Knowledge Base (Optional): PostgreSQL + pgvector for semantic search of past incidents
Slack Integration: Threaded conversations with historical context links

Quick Start

1. Choose LLM Provider

Ollama (Self-hosted)

export LLM_PROVIDER=ollama
export OLLAMA_URL=http://ollama.ollama.svc.cluster.local:11434
export OLLAMA_MODEL=llama3

OpenAI / Claude / Gemini

export LLM_PROVIDER=openai  # or anthropic, gemini, bedrock
export OPENAI_API_KEY=sk-...
export OPENAI_MODEL=gpt-4-turbo-preview

See LLM_PROVIDERS.md for all options.
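
As a rough illustration of how the provider switch might be validated at startup, the Go sketch below maps each `LLM_PROVIDER` value to the API key variable it needs. The `requiredKey` map and `validateProvider` function are hypothetical names, not taken from the K8flex source; the variable names come from the configuration reference in this README. Treating Ollama and Bedrock as needing no API key variable is an assumption (Bedrock normally relies on AWS credentials instead).

```go
package main

import (
	"fmt"
	"os"
)

// requiredKey maps each supported provider to the API key variable it needs.
// An empty string means no API key variable is checked here (assumption:
// Ollama is self-hosted and Bedrock uses the usual AWS credential chain).
var requiredKey = map[string]string{
	"ollama":    "",
	"openai":    "OPENAI_API_KEY",
	"anthropic": "ANTHROPIC_API_KEY",
	"gemini":    "GEMINI_API_KEY",
	"bedrock":   "",
}

// validateProvider checks that the provider is known and that its API key,
// if one is required, is present. lookup is injected so it can be stubbed.
func validateProvider(provider string, lookup func(string) (string, bool)) error {
	if provider == "" {
		provider = "ollama" // documented default
	}
	key, ok := requiredKey[provider]
	if !ok {
		return fmt.Errorf("unknown LLM_PROVIDER %q", provider)
	}
	if key == "" {
		return nil
	}
	if v, found := lookup(key); !found || v == "" {
		return fmt.Errorf("%s must be set for provider %q", key, provider)
	}
	return nil
}

func main() {
	if err := validateProvider(os.Getenv("LLM_PROVIDER"), os.LookupEnv); err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("LLM provider configuration looks valid")
}
```

Failing fast on a missing key gives a clearer error than a rejected API call during the first incident.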

2. Deploy

docker build -t k8flex-agent:latest .
kubectl apply -f k8s/deployment.yaml

3. Configure Alertmanager

receivers:
  - name: 'k8flex-ai-debug'
    webhook_configs:
      - url: 'http://k8flex-agent.k8flex.svc.cluster.local:8080/webhook'

Full setup: INTEGRATION.md
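
To make the receiving side concrete, here is a minimal Go sketch of a handler for the `/webhook` path that decodes the standard Alertmanager webhook body (`status`, `alerts[].labels`, and so on). It is not the K8flex implementation: the type and function names are hypothetical, only a subset of payload fields is modeled, and replying with `202 Accepted` is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Alert mirrors the per-alert fields of the Alertmanager webhook payload
// that matter for debugging; other fields are ignored.
type Alert struct {
	Status      string            `json:"status"`
	Labels      map[string]string `json:"labels"`
	Annotations map[string]string `json:"annotations"`
}

// WebhookPayload is the subset of the Alertmanager webhook body used here.
type WebhookPayload struct {
	Status string  `json:"status"`
	Alerts []Alert `json:"alerts"`
}

// parseWebhook decodes a webhook body and rejects payloads with no alerts.
func parseWebhook(body []byte) (WebhookPayload, error) {
	var p WebhookPayload
	if err := json.Unmarshal(body, &p); err != nil {
		return p, fmt.Errorf("invalid webhook JSON: %w", err)
	}
	if len(p.Alerts) == 0 {
		return p, fmt.Errorf("webhook contains no alerts")
	}
	return p, nil
}

// webhookHandler accepts POST requests and logs each firing alert.
func webhookHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	body, err := io.ReadAll(io.LimitReader(r.Body, 1<<20)) // read at most 1 MiB
	if err != nil {
		http.Error(w, "cannot read body", http.StatusBadRequest)
		return
	}
	p, err := parseWebhook(body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	for _, a := range p.Alerts {
		if a.Status == "firing" {
			log.Printf("would debug alert %q in namespace %q", a.Labels["alertname"], a.Labels["namespace"])
		}
	}
	w.WriteHeader(http.StatusAccepted)
}

func main() {
	// Exercise the handler in-process; a real agent would instead register
	// it on /webhook and call http.ListenAndServe(":8080", nil).
	body := `{"status":"firing","alerts":[{"status":"firing","labels":{"alertname":"PodNotReady","namespace":"demo"}}]}`
	req := httptest.NewRequest(http.MethodPost, "/webhook", strings.NewReader(body))
	rec := httptest.NewRecorder()
	webhookHandler(rec, req)
	fmt.Println("response status:", rec.Code)
}
```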

4. Optional: Slack Integration

export SLACK_BOT_TOKEN=xoxb-...
export SLACK_CHANNEL_ID=C01234567

Required scopes: chat:write, chat:write.public, reactions:read
Details: SLACK_SETUP.md
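
For readers unfamiliar with how threaded bot messages are posted, the following Go sketch builds a request for Slack's `chat.postMessage` Web API method, using `thread_ts` to reply inside an existing thread. It illustrates the Slack API call only and is not K8flex's own client; `slackMessage`, `newSlackRequest`, and the example timestamp are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

const slackPostMessageURL = "https://slack.com/api/chat.postMessage"

// slackMessage is the JSON body for chat.postMessage. ThreadTS is omitted
// for a new top-level message and set to the parent timestamp for a reply.
type slackMessage struct {
	Channel  string `json:"channel"`
	Text     string `json:"text"`
	ThreadTS string `json:"thread_ts,omitempty"`
}

// newSlackRequest builds (but does not send) an authenticated request.
func newSlackRequest(token string, msg slackMessage) (*http.Request, error) {
	body, err := json.Marshal(msg)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, slackPostMessageURL, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json; charset=utf-8")
	return req, nil
}

func main() {
	msg := slackMessage{
		Channel:  os.Getenv("SLACK_CHANNEL_ID"),
		Text:     "Analysis update: pod is crash-looping",
		ThreadTS: "1700000000.000100", // hypothetical parent message timestamp
	}
	req, err := newSlackRequest(os.Getenv("SLACK_BOT_TOKEN"), msg)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Sending is left out on purpose; http.DefaultClient.Do(req) would post it.
	fmt.Println(req.Method, req.URL)
}
```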

5. Optional: Knowledge Base

export KB_ENABLED=true
export KB_DATABASE_URL="postgresql://user:pass@host:5432/k8flex"
export KB_EMBEDDING_PROVIDER=openai

Setup: KNOWLEDGE_BASE.md
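
The idea behind semantic search of past incidents can be sketched in a few lines of Go: compare the embedding of the new alert with stored embeddings, keep those above a similarity threshold, and return the best few. In K8flex this lookup is delegated to PostgreSQL with pgvector, so the in-memory version below is only an illustration; the use of cosine similarity, the `pastCase` type, and the toy three-dimensional vectors are all assumptions.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// pastCase is a hypothetical stored incident with its embedding vector.
type pastCase struct {
	Summary   string
	Embedding []float64
}

// match pairs a stored case with its similarity to the query.
type match struct {
	Summary    string
	Similarity float64
}

// cosine returns the cosine similarity of two equal-length vectors,
// or 0 when the lengths differ or either vector has zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// similarCases keeps cases at or above threshold, best first, capped at limit.
func similarCases(query []float64, cases []pastCase, threshold float64, limit int) []match {
	var out []match
	for _, c := range cases {
		if s := cosine(query, c.Embedding); s >= threshold {
			out = append(out, match{Summary: c.Summary, Similarity: s})
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Similarity > out[j].Similarity })
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	cases := []pastCase{
		{Summary: "OOMKilled after memory limit change", Embedding: []float64{0.9, 0.1, 0.0}},
		{Summary: "DNS resolution failure", Embedding: []float64{0.0, 0.2, 0.9}},
	}
	// 0.75 and 5 are the documented defaults for KB_SIMILARITY_THRESHOLD
	// and KB_MAX_RESULTS.
	for _, m := range similarCases([]float64{1, 0, 0}, cases, 0.75, 5) {
		fmt.Printf("%.2f  %s\n", m.Similarity, m.Summary)
	}
}
```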

How It Works

1. Alertmanager sends webhook → K8flex receives alert
2. AI categorizes alert type (pod/service/node/network/resource)
3. System searches knowledge base for similar past cases (if enabled)
4. Gathers targeted Kubernetes debug information
5. AI analyzes and streams results to Slack in real-time
6. Users rate analysis with ✅/❌ reactions
7. Validated solutions stored for future incidents
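
The flow above can be pictured as a small pipeline of stages. The Go sketch below wires the categorize, knowledge base search, gather, and analyze steps together as plain functions; Slack streaming and feedback handling are left out. Every name in it is hypothetical, and treating a knowledge base failure as non-fatal is an assumption rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Alert is a minimal stand-in for an incoming alert.
type Alert struct {
	Labels map[string]string
}

// Pipeline wires the stages together as plain functions so each one can be
// swapped or stubbed. SearchKB may be nil when the knowledge base is disabled.
type Pipeline struct {
	Categorize func(Alert) (string, error)
	SearchKB   func(Alert) ([]string, error)
	Gather     func(a Alert, category string) (string, error)
	Analyze    func(debug string, similar []string) (string, error)
}

// Run executes the stages in order and returns the analysis text.
func (p Pipeline) Run(a Alert) (string, error) {
	if p.Categorize == nil || p.Gather == nil || p.Analyze == nil {
		return "", errors.New("pipeline is missing a required stage")
	}
	category, err := p.Categorize(a)
	if err != nil {
		return "", fmt.Errorf("categorize: %w", err)
	}
	var similar []string
	if p.SearchKB != nil {
		// A knowledge base failure is treated as non-fatal (assumption).
		if found, kbErr := p.SearchKB(a); kbErr == nil {
			similar = found
		}
	}
	debug, err := p.Gather(a, category)
	if err != nil {
		return "", fmt.Errorf("gather: %w", err)
	}
	return p.Analyze(debug, similar)
}

func main() {
	p := Pipeline{
		Categorize: func(Alert) (string, error) { return "pod", nil },
		Gather: func(a Alert, category string) (string, error) {
			return fmt.Sprintf("%s data for namespace %s", category, a.Labels["namespace"]), nil
		},
		Analyze: func(debug string, similar []string) (string, error) {
			return fmt.Sprintf("analysis of [%s] with %d similar cases", debug, len(similar)), nil
		},
	}
	out, err := p.Run(Alert{Labels: map[string]string{"namespace": "demo"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```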

Configuration

Key Environment Variables

Variable	Default	Description
`LLM_PROVIDER`	`ollama`	`ollama`, `openai`, `anthropic`, `gemini`, `bedrock`
`OLLAMA_URL`	`http://ollama.ollama.svc.cluster.local:11434`	Ollama endpoint
`OLLAMA_MODEL`	`llama3`	Model name
`OPENAI_API_KEY`	-	OpenAI API key
`SLACK_BOT_TOKEN`	-	Slack bot token (for advanced features)
`SLACK_CHANNEL_ID`	-	Slack channel ID
`KB_ENABLED`	`false`	Enable knowledge base
`KB_DATABASE_URL`	-	PostgreSQL connection string
`WEBHOOK_AUTH_TOKEN`	-	Webhook authentication token

Full reference: See Complete Configuration Reference section below or Configuration Documentation.
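
As a hedged sketch of how these variables and their defaults could be read in Go, the example below loads a handful of them into a struct. The `Config` type, `loadConfig`, and `stringOr` are hypothetical names; the defaults are the ones listed in this README, and parsing `KB_ENABLED` with `strconv.ParseBool` is an assumption about the accepted values.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds a subset of the documented settings.
type Config struct {
	Port         string
	LLMProvider  string
	OllamaURL    string
	OllamaModel  string
	KBEnabled    bool
	WebhookToken string
}

// lookupFunc matches os.LookupEnv so a fake environment can be injected.
type lookupFunc func(string) (string, bool)

// stringOr returns the variable's value, or def when it is unset or empty.
func stringOr(lookup lookupFunc, key, def string) string {
	if v, ok := lookup(key); ok && v != "" {
		return v
	}
	return def
}

// loadConfig applies the documented defaults and validates KB_ENABLED.
func loadConfig(lookup lookupFunc) (Config, error) {
	cfg := Config{
		Port:         stringOr(lookup, "PORT", "8080"),
		LLMProvider:  stringOr(lookup, "LLM_PROVIDER", "ollama"),
		OllamaURL:    stringOr(lookup, "OLLAMA_URL", "http://ollama.ollama.svc.cluster.local:11434"),
		OllamaModel:  stringOr(lookup, "OLLAMA_MODEL", "llama3"),
		WebhookToken: stringOr(lookup, "WEBHOOK_AUTH_TOKEN", ""),
	}
	enabled, err := strconv.ParseBool(stringOr(lookup, "KB_ENABLED", "false"))
	if err != nil {
		return cfg, fmt.Errorf("KB_ENABLED: %w", err)
	}
	cfg.KBEnabled = enabled
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	// Print only non-secret fields.
	fmt.Printf("provider=%s port=%s kb=%t\n", cfg.LLMProvider, cfg.Port, cfg.KBEnabled)
}
```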

Slack Scopes Required

For feedback system and threading:

chat:write - Post messages
chat:write.public - Post to public channels
reactions:read - Detect emoji reactions
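
To show what reading reactions could look like on the agent side, this Go sketch turns Slack reaction names into a feedback verdict. Slack reports ✅ as `white_check_mark` and ❌ as `x`; everything else here (the `Feedback` type, the majority rule in `verdict`, and ignoring other emoji) is a hypothetical design, not the documented K8flex behavior.

```go
package main

import "fmt"

// Feedback is the outcome derived from emoji reactions on an analysis.
type Feedback int

const (
	FeedbackNone Feedback = iota
	FeedbackPositive
	FeedbackNegative
)

// feedbackFromReaction maps a Slack reaction name to a feedback value.
// "white_check_mark" and "x" are Slack's names for the two rating emoji.
func feedbackFromReaction(name string) Feedback {
	switch name {
	case "white_check_mark":
		return FeedbackPositive
	case "x":
		return FeedbackNegative
	default:
		return FeedbackNone
	}
}

// verdict applies a simple majority over reaction counts; ties and
// messages without rating reactions yield FeedbackNone (assumption).
func verdict(counts map[string]int) Feedback {
	var pos, neg int
	for name, n := range counts {
		switch feedbackFromReaction(name) {
		case FeedbackPositive:
			pos += n
		case FeedbackNegative:
			neg += n
		}
	}
	switch {
	case pos > neg:
		return FeedbackPositive
	case neg > pos:
		return FeedbackNegative
	default:
		return FeedbackNone
	}
}

func main() {
	counts := map[string]int{"white_check_mark": 2, "x": 1, "eyes": 3}
	if verdict(counts) == FeedbackPositive {
		fmt.Println("analysis validated; eligible for the knowledge base")
	}
}
```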

Alert Requirements

K8flex reads the following alert labels. Only namespace is explicitly required:

namespace (required): Kubernetes namespace
pod (optional): Pod name
service (optional): Service name
alertname: Alert identifier
severity: Alert severity

Example:

- alert: PodNotReady
  expr: kube_pod_status_phase{phase!="Running"} == 1
  labels:
    namespace: "{{ $labels.namespace }}"
    pod: "{{ $labels.pod }}"
    severity: warning

Development

# Local testing
go run main.go

# Test webhook
curl -XPOST 'http://localhost:8080/webhook' \
  -H 'Content-Type: application/json' \
  -d @test-alert.json

# Build
go build -o k8flex-agent .

Documentation

INTEGRATION.md - Alertmanager/Prometheus setup
ARCHITECTURE.md - Complete architecture and workflow
QUICKSTART.md - Quick reference and examples
USE_CASES.md - Use cases, benefits, and best practices
LLM_PROVIDERS.md - All LLM provider configs
SLACK_SETUP.md - Slack bot configuration
FEEDBACK.md - Feedback system details
KNOWLEDGE_BASE.md - Vector database setup
WEBHOOK_SECURITY.md - Webhook authentication

Complete Configuration Reference


Variable	Default	Description
`PORT`	`8080`	HTTP server port
`LLM_PROVIDER`	`ollama`	LLM provider
`OLLAMA_URL`	`http://ollama.ollama.svc.cluster.local:11434`	Ollama endpoint
`OLLAMA_MODEL`	`llama3`	Ollama model
`OPENAI_API_KEY`	-	OpenAI API key
`OPENAI_MODEL`	`gpt-4-turbo-preview`	OpenAI model
`ANTHROPIC_API_KEY`	-	Anthropic API key
`ANTHROPIC_MODEL`	`claude-3-5-sonnet-20241022`	Anthropic model
`GEMINI_API_KEY`	-	Gemini API key
`GEMINI_MODEL`	`gemini-1.5-pro`	Gemini model
`BEDROCK_REGION`	`us-east-1`	AWS region
`BEDROCK_MODEL`	`anthropic.claude-3-5-sonnet-20241022-v2:0`	Bedrock model ID
`SLACK_WEBHOOK_URL`	-	Slack webhook (basic)
`SLACK_BOT_TOKEN`	-	Slack bot token (advanced)
`SLACK_CHANNEL_ID`	-	Slack channel ID
`SLACK_WORKSPACE_ID`	-	Workspace ID for thread links
`WEBHOOK_AUTH_TOKEN`	-	Webhook auth token
`KB_ENABLED`	`false`	Enable knowledge base
`KB_DATABASE_URL`	-	PostgreSQL URL
`KB_EMBEDDING_PROVIDER`	`openai`	`openai` or `gemini`
`KB_EMBEDDING_API_KEY`	-	Embedding API key
`KB_EMBEDDING_MODEL`	`text-embedding-3-small`	Embedding model
`KB_SIMILARITY_THRESHOLD`	`0.75`	Similarity threshold (0-1)
`KB_MAX_RESULTS`	`5`	Max similar cases

License

MIT

Contributing

Contributions are welcome. Please make sure any debugging parameters (such as namespace, pod, or service) are extracted from alert labels.

