
feat: add Grafana dashboard, ServiceMonitor, and monitoring compose overlay #22

Open
bornakapusta wants to merge 3 commits into main from feat/grafana-dashboard

bornakapusta commented Feb 11, 2026

Closes #15

Summary

  • Add pre-built Grafana dashboard covering all 14 gatekeeperd Prometheus metrics
  • Add Helm templates for Grafana sidecar ConfigMap and Prometheus Operator ServiceMonitor
  • Add monitoring documentation and README section
  • Add docker-compose.monitoring.yml overlay for local Prometheus + Grafana stack (replaces profile-based approach)
  • Add monitoring/ directory with Prometheus scrape config and Grafana provisioning

Local monitoring

docker-compose -f docker-compose.yml -f docker-compose.monitoring.yml up -d
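
For orientation, an overlay of this kind might look roughly like the sketch below. It is not the file from this PR: the service names, image tags, and host ports are assumptions, and only the mounted `monitoring/` paths come from the description. Dashboard file mounts are left out because the description does not say where they live in the container.

```yaml
# Hypothetical docker-compose.monitoring.yml (illustrative sketch, not the PR's actual file)
services:
  prometheus:
    image: prom/prometheus:latest   # assumed image and tag
    volumes:
      # Scrape config from the monitoring/ directory added in this PR
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9091:9090"                 # assumed host port, picked to avoid a clash if gatekeeperd publishes 9090

  grafana:
    image: grafana/grafana:latest   # assumed image and tag
    volumes:
      # Datasource and dashboard provisioning from this PR
      - ./monitoring/grafana/provisioning:/etc/grafana/provisioning:ro
    ports:
      - "3000:3000"                 # Grafana's default HTTP port
    depends_on:
      - prometheus
```

Because the file is passed with a second `-f` flag, Compose merges these services into the base project, so they join the same default network and can reach `gatekeeperd` by service name.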

Dashboard screenshot

[Image: dashboard screenshot, captured 2026-02-11 at 22:40:08]

Changes

Dashboard (dashboards/grafana-gatekeeperd.json, charts/gatekeeperd/dashboards/gatekeeperd.json)

  • Overview row: Request rate, success rate, error rate, connected relay clients (stat panels)
  • Requests row: Rate by hostname, by status code (stacked bars with color overrides), latency percentiles (p50/p95/p99), latency heatmap
  • Security row: Verification failures by verifier/reason, IP filter denials by allowlist, validation failures by validator
  • Relay row: Webhooks queued vs delivered, delivery latency percentiles, delivery errors by reason, pending webhooks by token, clients per token
  • System row: IP ranges loaded per allowlist, IP range fetch errors, forward errors by hostname/destination
  • Template variables: datasource, namespace, instance, hostname (multi-select)

Docker Compose monitoring overlay

  • docker-compose.monitoring.yml — Prometheus + Grafana services, opt-in via -f flag
  • monitoring/prometheus.yml — scrape config targeting gatekeeperd:9090
  • monitoring/grafana/provisioning/ — auto-provisions Prometheus datasource and dashboard
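
A scrape config for this setup could be as small as the following sketch. Only the `gatekeeperd:9090` target is taken from the description; the job name, scrape interval, and metrics path are assumptions.

```yaml
# Hypothetical monitoring/prometheus.yml (illustrative sketch)
global:
  scrape_interval: 15s          # assumed interval

scrape_configs:
  - job_name: gatekeeperd       # assumed job name
    metrics_path: /metrics      # Prometheus default; assumed to match gatekeeperd
    static_configs:
      - targets:
          - gatekeeperd:9090    # target stated in the PR description
```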

Helm templates

  • grafana-dashboard-configmap.yaml — ConfigMap with grafana_dashboard sidecar label, gated by grafana.dashboard.enabled
  • servicemonitor.yaml — Prometheus Operator ServiceMonitor, gated by serviceMonitor.enabled
  • values.yaml — New grafana.dashboard.* and serviceMonitor.* value blocks
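
As a minimal sketch, a values override that turns on both optional templates would set the two gates named above. Any further keys under `grafana.dashboard` or `serviceMonitor` (labels, intervals, and so on) are not listed in this description, so none are shown here.

```yaml
# Values override enabling both optional monitoring templates
grafana:
  dashboard:
    enabled: true   # gate for grafana-dashboard-configmap.yaml

serviceMonitor:
  enabled: true     # gate for servicemonitor.yaml
```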

Documentation

  • docs/MONITORING.md — Manual Grafana import, Helm sidecar provisioning, ServiceMonitor setup, template variables, full metrics reference table
  • README.md — Added Monitoring section linking to docs and dashboard file
  • CHANGELOG.md — Unreleased entries for all additions

Metrics coverage

All 14 metrics from internal/metrics/metrics.go are covered:

| Metric | Panel(s) |
| --- | --- |
| gatekeeper_requests_total | Request Rate, Success Rate, Error Rate, by Hostname, by Status |
| gatekeeper_request_duration_seconds | Latency (p50/p95/p99), Heatmap |
| gatekeeper_verification_failures_total | Verification Failures |
| gatekeeper_validation_failures_total | Validation Failures |
| gatekeeper_ip_filter_denied_total | IP Filter Denials |
| gatekeeper_ip_ranges_loaded | IP Ranges Loaded |
| gatekeeper_ip_range_fetch_errors_total | IP Range Fetch Errors |
| gatekeeper_forward_errors_total | Forward Errors |
| gatekeeper_relay_webhooks_queued_total | Queued vs Delivered |
| gatekeeper_relay_webhooks_delivered_total | Queued vs Delivered |
| gatekeeper_relay_delivery_errors_total | Relay Delivery Errors |
| gatekeeper_relay_webhooks_pending | Pending Webhooks |
| gatekeeper_relay_clients_connected | Relay Clients Connected, Clients per Token |
| gatekeeper_relay_delivery_duration_seconds | Relay Delivery Latency |
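
The panel queries themselves are not reproduced in this description. As a rough illustration of the kind of PromQL such panels typically rely on, here is a sketch written as a Prometheus recording-rule file. It assumes that `gatekeeper_requests_total` carries a `status` label holding the HTTP status code and that `gatekeeper_request_duration_seconds` is a histogram; the group name, rule names, and 5-minute window are also hypothetical.

```yaml
# Hypothetical recording rules (illustrative; label names and metric types are assumptions)
groups:
  - name: gatekeeperd-sketch
    rules:
      # Overall request rate per job
      - record: job:gatekeeper_requests:rate5m
        expr: sum by (job) (rate(gatekeeper_requests_total[5m]))

      # Fraction of requests answered with a 2xx status (assumes a `status` label)
      - record: job:gatekeeper_requests_success:ratio_rate5m
        expr: |
          sum by (job) (rate(gatekeeper_requests_total{status=~"2.."}[5m]))
          /
          sum by (job) (rate(gatekeeper_requests_total[5m]))

      # p95 request latency (assumes a histogram exposing _bucket series)
      - record: job:gatekeeper_request_duration_seconds:p95_5m
        expr: |
          histogram_quantile(
            0.95,
            sum by (job, le) (rate(gatekeeper_request_duration_seconds_bucket[5m]))
          )
```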

Test plan

  • docker-compose -f docker-compose.yml -f docker-compose.monitoring.yml up -d starts all 4 containers
  • Webhook endpoint responds at localhost:8080/webhook
  • Prometheus healthy and scraping gatekeeperd target
  • Grafana healthy with provisioned Gatekeeperd dashboard
  • Dashboard panels populate with data after sending traffic
  • Import dashboards/grafana-gatekeeperd.json into Grafana manually — verify all panels render without errors
  • Verify template variables (datasource, namespace, instance, hostname) populate correctly
  • helm template with grafana.dashboard.enabled=true — verify ConfigMap renders
  • helm template with serviceMonitor.enabled=true — verify ServiceMonitor renders
  • Verify dashboards/grafana-gatekeeperd.json matches charts/gatekeeperd/dashboards/gatekeeperd.json
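
When checking the `helm template` output for the ServiceMonitor, the rendered manifest should have roughly the following shape. This is a generic Prometheus Operator ServiceMonitor, not the chart's actual output: the resource name, selector labels, port name, path, and interval are all assumptions.

```yaml
# Hypothetical rendered ServiceMonitor (generic shape, not the chart's actual output)
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: gatekeeperd                          # assumed release/resource name
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: gatekeeperd    # assumed Service label
  endpoints:
    - port: metrics                          # assumed named port on the Service
      path: /metrics                         # assumed metrics path
      interval: 30s                          # assumed scrape interval
```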

- Grafana dashboard JSON covering all 14 gatekeeperd Prometheus metrics
- Helm ConfigMap template for Grafana sidecar auto-provisioning
- Helm ServiceMonitor template for Prometheus Operator
- Root-level dashboards/grafana-gatekeeperd.json for non-K8s users
- Monitoring documentation (docs/MONITORING.md)
- README monitoring section

codecov bot commented Feb 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

github-actions bot commented Feb 11, 2026

Docker Images Built

Images are available for testing:

# gatekeeperd
docker pull ghcr.io/tight-line/gatekeeperd:pr-22-9e6b39f

# gatekeeper-relay
docker pull ghcr.io/tight-line/gatekeeper-relay:pr-22-9e6b39f

docker-compose.yml

GATEKEEPERD_IMAGE=ghcr.io/tight-line/gatekeeperd:pr-22-9e6b39f \
RELAY_IMAGE=ghcr.io/tight-line/gatekeeper-relay:pr-22-9e6b39f \
docker-compose --profile relay up

Helm (values override)

image:
  repository: ghcr.io/tight-line/gatekeeperd  # or gatekeeper-relay
  tag: "pr-22-9e6b39f"

Images expire ~15 days after PR closes.

Move monitoring services to a separate docker-compose.monitoring.yml
overlay file instead of using profiles in the main compose file.
This keeps the base compose focused on the app and makes monitoring
opt-in via the -f flag.
bornakapusta changed the title from "feat: add Grafana dashboard, ServiceMonitor, and monitoring docs" to "feat: add Grafana dashboard, ServiceMonitor, and monitoring compose overlay" Feb 11, 2026
bornakapusta marked this pull request as ready for review February 11, 2026 22:03
Development

Successfully merging this pull request may close these issues.

feat: Add Grafana dashboard for monitoring
