Open
Labels
api (Items related to the API), enhancement (New feature or request), p1 (Medium), telemetry, vmcp (Virtual MCP Server related issues)
Description
Depends on #3865
For stateful backends (e.g., Playwright, database connections), implement an optional keepalive mechanism to prevent backend session expiration while the corresponding vMCP session is active.
Implementation:
- Prefer the MCP spec-defined `ping` protocol request (side-effect-free, supported by all compliant servers); fall back to an explicitly configured low-cost tool only if `ping` consistently fails
- Per-backend configuration: `keepalive_method: ping | tool:<name> | none`; default: attempt `ping`
- Configurable interval at server level (default ≥ 5 min); jitter keepalive calls across sessions to avoid spikes
- The keepalive goroutine must acquire the session/backend lock before issuing calls to prevent races with `reinitializeBackend`
- Circuit-breaker: after N consecutive failures, disable keepalive for that backend and log a warning; probe again after ~30 min to re-enable without requiring full session recreation
- Disable by default for stateless backends and backends where TTL alignment already covers the session lifetime
Acceptance Criteria
- Keepalive uses `ping` by default; falls back to configured tool only when `ping` fails
- Keepalive is disabled when `keepalive_method: none` is set
- The keepalive interval is configurable and defaults to ≥ 5 minutes
- Keepalive calls across sessions are jittered to avoid synchronized spikes
- The keepalive goroutine holds the appropriate lock before calling the backend
- Keepalive failures do not surface as errors to the end user or fail the vMCP session
- After N consecutive failures, keepalive is disabled for that backend with a logged warning
- A probe after ~30 min re-enables keepalive if the backend recovers
- Keepalive is disabled by default for stateless backends
- Metrics are emitted: `keepalive_attempt_count`, `keepalive_success_count`, `keepalive_failure_count` (by reason), `keepalive_latency_ms`, `keepalive_auto_disabled_total` (by reason)
- Unit tests cover: ping used by default, fallback to tool, circuit breaker, re-enable after probe, metrics emitted