Conversation
appleboy
commented
Feb 11, 2026
- Integrate metrics tracking into AuthHandler, OAuthHandler, DeviceService, and TokenService, recording events such as login, logout, device code operations, token refresh, and token issuance.
- Add new database methods to count active tokens and device codes for gauge metrics.
- Add a periodic background job to main.go that updates gauge metrics every 30 seconds.
- Refactor service and handler constructors to require the metrics object.
- Update integration documentation to reflect complete metrics integration across services and handlers, including config recommendations and production usage.
- Improve Gin setup by initializing with gin.New and adding explicit logger and recovery middleware.
There was a problem hiding this comment.
Pull request overview
Adds end-to-end Prometheus metrics instrumentation across the authentication and OAuth/device/token flows, including periodic gauge updates driven by DB state, and updates runtime wiring/docs accordingly.
Changes:
- Wire a
*metrics.Metricsinstance through key services/handlers and record auth/OAuth/device/token events. - Add DB counting methods and a 30s background job to refresh gauge metrics from current DB state.
- Refine Gin initialization (
gin.New) and explicitly register logger/recovery middleware.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| main.go | Pass metrics into services/handlers, switch to gin.New, and add a periodic gauge refresh job + helper. |
| internal/store/sqlite.go | Add DB aggregation methods for active tokens and device code counts used by gauge metrics. |
| internal/services/token.go | Record metrics for device-code validation, issuance, revocation, and refresh success/failure. |
| internal/services/device.go | Record metrics for device-code generation and authorization duration. |
| internal/handlers/auth.go | Record login/logout metrics and track duration on logout. |
| internal/handlers/oauth_handler.go | Record OAuth callback success/failure metrics. |
| internal/services/token_test.go | Update constructors to include the new metrics parameter (nil in tests). |
| internal/services/device_test.go | Update constructors to include the new metrics parameter (nil in tests). |
| internal/services/device_security_test.go | Update constructors to include the new metrics parameter (nil in tests). |
| internal/metrics/INTEGRATION.md | Update documentation to reflect full metrics integration and config guidance. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Count all non-expired device codes | ||
| err = s.db.Model(&models.DeviceCode{}). | ||
| Where("expires_at > ?", time.Now()). | ||
| Count(&total). | ||
| Error | ||
| if err != nil { | ||
| return 0, 0, err | ||
| } | ||
|
|
||
| // Count pending (not yet authorized) | ||
| err = s.db.Model(&models.DeviceCode{}). | ||
| Where("expires_at > ? AND authorized = ?", time.Now(), false). | ||
| Count(&pending). | ||
| Error |
There was a problem hiding this comment.
CountDeviceCodes calls time.Now() separately for the total and pending queries; using a single now := time.Now() for both would keep the time boundary consistent and avoids tiny edge-case discrepancies right at expiration.
| // Calculate session duration if available | ||
| var sessionDuration time.Duration | ||
| if createdAtUnix := session.Get(SessionLastActivity); createdAtUnix != nil { | ||
| if createdAtInt64, ok := createdAtUnix.(int64); ok { | ||
| createdAt := time.Unix(createdAtInt64, 0) | ||
| sessionDuration = time.Since(createdAt) | ||
| } | ||
| } |
There was a problem hiding this comment.
sessionDuration is computed from SessionLastActivity, but that value is updated on every request by SessionIdleTimeout, so this measures time since last request rather than total session length. If the goal is the session_duration_seconds histogram, consider storing a separate session-created timestamp (set once at login/OAuth callback) and compute duration from that; or rename the metric to reflect idle time.
| r := gin.New() | ||
| // Setup Prometheus metrics middleware (must be before other routes) | ||
| r.Use(metrics.HTTPMetricsMiddleware(prometheusMetrics)) | ||
| r.Use(gin.Logger(), gin.Recovery()) |
There was a problem hiding this comment.
Metrics middleware is always registered even when cfg.MetricsEnabled is false, which means the app still incurs Prometheus instrumentation overhead and registers collectors, despite logs/docs implying metrics can be disabled. Consider conditionally initializing/using the metrics middleware only when enabled (or use a no-op metrics implementation) so METRICS_ENABLED=false truly disables metrics.
| // Add metrics gauge update job (runs every 30 seconds) | ||
| if cfg.MetricsEnabled { | ||
| m.AddRunningJob(func(ctx context.Context) error { | ||
| ticker := time.NewTicker(30 * time.Second) | ||
| defer ticker.Stop() | ||
|
|
||
| // Update immediately on startup | ||
| updateGaugeMetrics(db, prometheusMetrics) | ||
|
|
||
| for { | ||
| select { | ||
| case <-ticker.C: | ||
| updateGaugeMetrics(db, prometheusMetrics) | ||
| case <-ctx.Done(): | ||
| return nil | ||
| } | ||
| } | ||
| }) |
There was a problem hiding this comment.
The gauge updater reads global counts from the shared DB and sets gauges on every instance. In a multi-replica deployment, Prometheus will scrape identical gauge values per instance, which can be misleading unless dashboards use max/avg instead of sum. Consider running this job only on a single leader instance (or gating via config), or document the intended PromQL aggregation pattern.
- Add metrics integration and no-op metrics implementation, allowing metrics to be conditionally enabled or disabled - Provide the option to initialize Prometheus metrics or a no-operation recorder based on configuration - Record authentication, OAuth, and token events for metrics in relevant handlers and services - Introduce periodic updates of gauge metrics at runtime to track token and device code counts - Refactor constructors and methods to accept metrics recorder parameters - Remove the metrics integration guide documentation - Add new methods to count active tokens and device codes in the database - Improve error formatting in several functions - Refactor multi-argument function calls for readability - Update tests to use no-op metrics implementations - Change Gin setup to allow custom middleware ordering Signed-off-by: appleboy <appleboy.tw@gmail.com>