Skip to content

[PRODENG-3342] Validate --pod-cidr does not overlap Swarm overlay address pool#636

Open
james-nesbitt wants to merge 1 commit into
mainfrom
PRODENG-3342
Open

[PRODENG-3342] Validate --pod-cidr does not overlap Swarm overlay address pool#636
james-nesbitt wants to merge 1 commit into
mainfrom
PRODENG-3342

Conversation

@james-nesbitt
Copy link
Copy Markdown
Collaborator

@james-nesbitt james-nesbitt commented May 27, 2026

What

Add early validation that rejects configs where --pod-cidr in mke.installFlags overlaps with the Swarm overlay address pool.

Why

A --pod-cidr that overlaps 10.0.0.0/8 (Swarm's default overlay pool) causes the Docker daemon to restart into a broken network state during MKE bootstrap. The SSH session rides that network, so the connection drops silently and launchpad reports a 20-minute timeout with no useful diagnosis.

How

  • Added validatePodCIDR() to ValidateFacts phase — parses --pod-cidr from mke.installFlags and checks for CIDR overlap with the Swarm pool
  • If mcr.swarmInstallFlags contains --default-addr-pool, that value is used as the Swarm pool instead of the compiled-in default (10.0.0.0/8)
  • Fails immediately at the Validate Facts phase with a clear, actionable error message

Testing

  • 5 new unit tests: overlap with default pool, no overlap, absent flag, custom pool no-overlap, custom pool overlap
  • make unit-test passes (24/24 packages)

Links

Checklist

  • Tests added or updated
  • Docs updated if user-visible behaviour changed
  • No debug output or dead code left in

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Configs with --pod-cidr that overlaps the Swarm overlay address pool
(default 10.0.0.0/8) cause the Docker daemon to restart into a broken
network state during MKE bootstrap. The SSH session rides that network,
so the connection drops silently and launchpad reports a timeout after
20+ minutes of install.

Adds ValidateFacts.validatePodCIDR() which:
- Parses --pod-cidr from mke.installFlags
- Uses the Swarm default pool (10.0.0.0/8) unless mcr.swarmInstallFlags
  contains --default-addr-pool, in which case that value is used instead
- Fails immediately with a clear, actionable error if the CIDRs overlap

Tests cover: overlap with default pool, no overlap, absent flag,
custom pool no overlap, custom pool overlap.

PRODENG-3342
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment on lines +222 to +246
swarmPoolStr := p.Config.Spec.MCR.SwarmInstallFlags.GetValue("--default-addr-pool")
if swarmPoolStr == "" {
swarmPoolStr = swarmDefaultAddrPool
}

_, podNet, err := net.ParseCIDR(podCIDRStr)
if err != nil {
return fmt.Errorf("%w: cannot parse --pod-cidr %q: %w", errInvalidPodCIDR, podCIDRStr, err)
}

_, swarmNet, err := net.ParseCIDR(swarmPoolStr)
if err != nil {
return fmt.Errorf("%w: cannot parse Swarm address pool %q: %w", errInvalidPodCIDR, swarmPoolStr, err)
}

if swarmNet.Contains(podNet.IP) || podNet.Contains(swarmNet.IP) {
return fmt.Errorf(
"%w: --pod-cidr %s overlaps with the Swarm overlay address pool %s; "+
"choose a non-overlapping range or set mcr.swarmInstallFlags --default-addr-pool to a non-conflicting pool",
errInvalidPodCIDR, podCIDRStr, swarmPoolStr,
)
}

log.Debugf("pod CIDR %s does not overlap with Swarm pool %s", podCIDRStr, swarmPoolStr)
return nil
Comment on lines +299 to +303
func TestValidatePodCIDRCustomSwarmPoolOverlap(t *testing.T) {
// Custom pool 192.168.0.0/16 overlaps with pod-cidr 192.168.1.0/24.
p := makePhaseWithPodCIDR("192.168.1.0/24", "192.168.0.0/16")
err := p.validatePodCIDR()
require.Error(t, err)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

smoke-modern Run modern smoke test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants