Fix staging deploy: startup probe and disable network policy by RafaelPo · Pull Request #230 · futuresearch/futuresearch-python

RafaelPo · 2026-02-25T15:31:12Z

Summary

Add startup probe (60s budget) to handle slow Redis Sentinel cold-start connections
Increase liveness/readiness probe timeouts from 5s to 10s
Disable the network policy: GKE assigns ClusterIPs from a non-RFC1918 range (34.118.x.x), which breaks both the DNS and Redis egress rules

Root cause

The network policy's DNS egress rule uses a podSelector for kube-dns, but GKE evaluates egress against the ClusterIP (34.118.224.10), not the pod IP. This blocked all DNS resolution. Without DNS, Redis Sentinel discovery could not complete, so /health hung and the liveness probe killed the pod.
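For illustration, a DNS egress rule of the kind described above might look like the following. This is a minimal sketch, not the chart's actual template: the policy name `mcp-egress` and the `app: mcp` pod label are hypothetical, and only the kube-dns selector and the ClusterIP come from the description.

```yaml
# Hypothetical sketch: metadata.name and the app label are assumptions,
# not copied from the chart in this repository.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mcp-egress              # assumed name
spec:
  podSelector:
    matchLabels:
      app: mcp                  # assumed pod label
  policyTypes:
    - Egress
  egress:
    # DNS rule keyed on the kube-dns pods. Per the root cause above, egress
    # on this cluster is matched against the Service ClusterIP
    # (34.118.224.10), so this pod-based rule does not allow the lookup.
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Because `policyTypes` includes `Egress`, any destination not matched by a rule is dropped, which is why a single non-matching DNS rule is enough to stall everything that depends on name resolution.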

Changes

deployment.yaml: Add startupProbe (12 attempts × 5s period = 60s budget); increase liveness initialDelaySeconds to 30s and failureThreshold to 5; increase liveness and readiness timeouts to 10s
values.yaml: networkPolicy.enabled: false (was true)
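The probe settings listed above could be expressed roughly as follows. This is a sketch under assumptions: it assumes all three probes are `httpGet` checks against `/health`, and the port name `http` is hypothetical. Fields the description does not mention (for example the startup probe's timeout) are left at their Kubernetes defaults here, which may differ from the actual change.

```yaml
# Sketch of the container probe block; the port name is an assumption.
startupProbe:
  httpGet:
    path: /health
    port: http                  # assumed port name
  periodSeconds: 5
  failureThreshold: 12          # 12 x 5s = 60s startup budget
livenessProbe:
  httpGet:
    path: /health
    port: http
  initialDelaySeconds: 30
  timeoutSeconds: 10            # was 5
  failureThreshold: 5
readinessProbe:
  httpGet:
    path: /health
    port: http
  timeoutSeconds: 10            # was 5
```

Kubernetes holds off liveness and readiness checks until the startup probe has succeeded, so the 60s budget covers the cold-start window without loosening the steady-state liveness check any further.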

Deployed

Already deployed to staging from this branch. The pod is 1/1 Running with health checks passing.

Test plan

Staging pod starts and passes health checks
/health returns 200
Network policy to be re-enabled in a follow-up with proper GKE service CIDR handling
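One possible shape for the follow-up is a CIDR-based DNS rule that targets the ClusterIP directly. The fragment below is a hypothetical, untested sketch of an `egress` entry inside a NetworkPolicy `spec`. Only the address 34.118.224.10 comes from the root cause; whether an `ipBlock` rule is matched as intended on this cluster's dataplane is an assumption that would need verifying in staging.

```yaml
# Hypothetical follow-up rule; untested fragment of a NetworkPolicy spec.
egress:
  - to:
      - ipBlock:
          cidr: 34.118.224.10/32   # kube-dns ClusterIP quoted in the root cause
    ports:
      - protocol: UDP
        port: 53
      - protocol: TCP
        port: 53
```

Pinning a single /32 is brittle if the DNS Service IP ever changes. Allowing the cluster's whole service CIDR for port 53 is an alternative, at the cost of a broader rule.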

🤖 Generated with Claude Code

Redis Sentinel discovery can take >5s on cold start, causing the liveness probe (5s timeout) to kill the pod before it establishes a connection. Add a startup probe (60s budget) to protect the boot phase, and increase liveness timeout to 10s with higher failure threshold.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

GKE assigns ClusterIPs from a non-RFC1918 range (34.118.x.x). The network policy's DNS egress rule uses podSelector for kube-dns pods, but GKE evaluates egress against the ClusterIP (34.118.224.10), not the pod IP. This blocks all DNS resolution from the MCP pods. Disable until we can properly handle GKE's service CIDR in egress rules (may require CIDR-based DNS rules instead of podSelector).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

RafaelPo and others added 2 commits February 25, 2026 15:18

RafaelPo merged commit a1e10eb into main Feb 25, 2026
5 checks passed

RafaelPo deleted the fix/mcp-startup-probe branch February 25, 2026 15:59
