Evict pod based on GRPC response instead of force check on every call #1055

Merged
efiacor merged 16 commits into kptdev:main from Nordix:evict-pod-on-demand
Jun 22, 2026

Conversation

@kushnaidu kushnaidu commented Jun 18, 2026

Contributor

Title

Evict pod based on GRPC response instead of force check on every call


Description

  • What changed: Removed removeUnhealthyPods from the request path of the pod cache manager event loop. Added an eviction channel so that dead pods are removed from the cache reactively, when a gRPC call fails with status.Code(err) == codes.Unavailable.
    EvaluateFunction now loops: it evicts the dead pod and retries, governed by a configurable retry mechanism.

  • Why it’s needed: removeUnhealthyPods made 3 Kubernetes API calls per cached pod on every incoming request, blocking the single-goroutine event loop for ~700ms with 11 pods in the cache. This serialized all function evaluations and increased the latency of every render.

  • How it works: The event loop dispatches immediately, with no health checks. When a gRPC call fails with status.Code(err) == codes.Unavailable, the caller sends a podEvictionRequest to the event loop, which removes that pod from the cache and from the cluster. The caller then requests a new pod and retries. Real function errors are returned immediately. The periodic GC still catches any remaining stale pods.
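
For readers skimming the description, the caller-side flow can be sketched roughly as below. This is not the code from this PR: evaluateWithRetry, podSource, callFunc and fakePods are hypothetical names, and errUnavailable is a stdlib-only stand-in for a gRPC error where status.Code(err) == codes.Unavailable. It only illustrates the rule "evict and retry on Unavailable, return real function errors immediately".

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errUnavailable stands in for a gRPC error whose status.Code(err) is
// codes.Unavailable (assumption: stdlib-only substitute for this sketch).
var errUnavailable = errors.New("pod unavailable")

// podSource is a hypothetical view of the pod cache manager.
type podSource interface {
	GetPod(ctx context.Context) (string, error)
	Evict(pod string)
}

// callFunc is a hypothetical stand-in for the gRPC call into a function pod.
type callFunc func(ctx context.Context, pod string) (string, error)

// evaluateWithRetry evicts and retries only when the pod is unreachable.
func evaluateWithRetry(ctx context.Context, src podSource, call callFunc, maxRetries int) (string, error) {
	var lastErr error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err := ctx.Err(); err != nil {
			return "", err // context cancelled or deadline exceeded
		}
		pod, err := src.GetPod(ctx)
		if err != nil {
			return "", err
		}
		out, err := call(ctx, pod)
		if err == nil {
			return out, nil
		}
		if !errors.Is(err, errUnavailable) {
			return "", err // real function error: no eviction, no retry
		}
		src.Evict(pod) // dead pod: drop it so the next GetPod returns another
		lastErr = err
	}
	return "", fmt.Errorf("no healthy pod after %d retries: %w", maxRetries, lastErr)
}

// fakePods is an in-memory podSource used only for the demo.
type fakePods struct {
	pods    []string
	evicted []string
}

func (f *fakePods) GetPod(ctx context.Context) (string, error) {
	if len(f.pods) == 0 {
		return "", errors.New("no pods left")
	}
	return f.pods[0], nil
}

func (f *fakePods) Evict(pod string) {
	f.evicted = append(f.evicted, pod)
	if len(f.pods) > 0 && f.pods[0] == pod {
		f.pods = f.pods[1:]
	}
}

// demoRetry: the first pod is dead, the second one answers.
func demoRetry() (string, []string, error) {
	src := &fakePods{pods: []string{"pod-dead", "pod-live"}}
	call := func(ctx context.Context, pod string) (string, error) {
		if pod == "pod-dead" {
			return "", errUnavailable
		}
		return "rendered by " + pod, nil
	}
	out, err := evaluateWithRetry(context.Background(), src, call, 3)
	return out, src.evicted, err
}

// demoRealError: a function error must not trigger any eviction.
func demoRealError() (int, error) {
	src := &fakePods{pods: []string{"pod-live"}}
	call := func(ctx context.Context, pod string) (string, error) {
		return "", errors.New("function failed")
	}
	_, err := evaluateWithRetry(context.Background(), src, call, 3)
	return len(src.evicted), err
}

func main() {
	out, evicted, err := demoRetry()
	fmt.Println(out, evicted, err)
	fmt.Println(demoRealError())
}
```

In the sketch, only the unreachable pod is evicted; a pod that answers with a genuine function error stays in the cache.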


Related Issue(s)

  • Closes/Fixes #

Type of Change

  • Bug fix
  • New feature
  • Enhancement
  • Refactor
  • Documentation
  • Tests
  • Other: ________

Checklist

  • Code follows project style guidelines
  • Self-reviewed changes
  • Tests added/updated
  • Documentation added/updated
  • All tests and gating checks pass

Testing Instructions (Optional)


Additional Notes (Optional)

  • Known issues:
  • Further improvements:
  • Review notes:

AI Disclosure

  • I have used AI in the creation of this PR.

If so, please describe how:
Microsoft Copilot to analyse the code.
Kiro to generate eviction channel code.

@kushnaidu kushnaidu requested review from a team and Copilot June 18, 2026 07:58
netlify Bot commented Jun 18, 2026

Deploy Preview for kpt-porch ready!

Name Link
🔨 Latest commit 69b983b
🔍 Latest deploy log https://app.netlify.com/projects/kpt-porch/deploys/6a390b03f07e3200084d02a3
😎 Deploy Preview https://deploy-preview-1055--kpt-porch.netlify.app

@dosubot dosubot Bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) Jun 18, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR refactors the pod evaluator to stop proactively health-checking cached pods on every request, and instead evict pods reactively when gRPC calls fail with transient errors—improving request-path latency by avoiding repeated Kubernetes API calls.

Changes:

  • Removed removeUnhealthyPods from the pod cache manager’s per-request path.
  • Added an eviction channel so callers can request cache eviction when gRPC calls fail transiently, and updated EvaluateFunction to retry until success or context deadline.
  • Updated a pod cache manager test expectation to match the new request-path logging/behavior.
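
The event-loop side of the eviction channel listed above can be illustrated with a small sketch. It is an assumption-laden illustration, not the PR's implementation: podEvictionRequest is the name used in the PR description, but its fields, along with cacheManager, podRequest, getPod and the pod naming, are hypothetical. The point it shows is that a single goroutine owns the cache, serves requests without any health check, and removes a pod only when a caller reports it.

```go
package main

import "fmt"

// podEvictionRequest uses the name from the PR description; fields are assumed.
type podEvictionRequest struct {
	image string
	pod   string
}

// podRequest asks the loop for a pod serving an image (hypothetical shape).
type podRequest struct {
	image string
	reply chan string
}

// cacheManager is a hypothetical, much reduced pod cache manager.
type cacheManager struct {
	requestCh chan podRequest
	evictCh   chan podEvictionRequest
	stopCh    chan struct{}
	cache     map[string]string // image -> pod; touched only by run()
	created   int
}

func newCacheManager() *cacheManager {
	return &cacheManager{
		requestCh: make(chan podRequest),
		evictCh:   make(chan podEvictionRequest),
		stopCh:    make(chan struct{}),
		cache:     make(map[string]string),
	}
}

// run is the single-goroutine event loop. The request path does no health
// checks; pods leave the cache only through an eviction request.
func (m *cacheManager) run() {
	for {
		select {
		case req := <-m.requestCh:
			pod, ok := m.cache[req.image]
			if !ok {
				m.created++
				pod = fmt.Sprintf("pod-%d", m.created) // stands in for pod creation
				m.cache[req.image] = pod
			}
			req.reply <- pod
		case ev := <-m.evictCh:
			// Evict only if the cache still points at the reported pod, so a
			// late report cannot remove a replacement that is already healthy.
			if m.cache[ev.image] == ev.pod {
				delete(m.cache, ev.image)
			}
		case <-m.stopCh:
			return
		}
	}
}

// getPod sends a request to the loop and waits for the answer.
func (m *cacheManager) getPod(image string) string {
	reply := make(chan string, 1)
	m.requestCh <- podRequest{image: image, reply: reply}
	return <-reply
}

// demoEviction returns the first pod, the cached pod, and the pod handed out
// after the first one was reported dead.
func demoEviction() (first, cached, replacement string) {
	m := newCacheManager()
	go m.run()
	defer close(m.stopCh)

	first = m.getPod("example-image")
	cached = m.getPod("example-image")
	m.evictCh <- podEvictionRequest{image: "example-image", pod: first}
	replacement = m.getPod("example-image")
	return first, cached, replacement
}

func main() {
	fmt.Println(demoEviction())
}
```

Because the channels in this sketch are unbuffered, the eviction is received by the loop before the caller can send its next pod request, so the retry never sees the evicted pod again.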

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File | Description
func/internal/podevaluator.go | Adds retry+eviction logic based on gRPC status codes; adjusts default waitlist length fallback.
func/internal/podcachemanager.go | Introduces eviction channel handling in the cache manager event loop to remove dead pods from cache.
func/internal/podevaluator_podcachemanager_test.go | Updates expected log output to align with new request-path behavior.

Comment thread func/internal/podcachemanager.go
Comment thread func/internal/podevaluator.go Outdated
Comment thread func/internal/podevaluator.go Outdated
Comment thread func/internal/podevaluator.go
@kushnaidu kushnaidu requested a review from a team June 18, 2026 10:07
Copilot AI review requested due to automatic review settings June 18, 2026 18:14
@kushnaidu kushnaidu force-pushed the evict-pod-on-demand branch from 3bf6677 to 39526e1 Compare June 18, 2026 18:14

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread func/internal/podevaluator.go Outdated
Comment thread func/internal/podevaluator.go Outdated
Comment thread func/internal/podevaluator.go
@kushnaidu kushnaidu force-pushed the evict-pod-on-demand branch from 8b47a67 to 2d3347e Compare June 18, 2026 18:19
Copilot AI review requested due to automatic review settings June 18, 2026 18:21

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread func/internal/podevaluator.go Outdated
Comment thread func/internal/podevaluator.go
Comment thread func/internal/podevaluator.go Outdated
Copilot AI review requested due to automatic review settings June 18, 2026 18:28

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread func/internal/podevaluator.go Outdated
Comment thread func/internal/podevaluator.go
Comment thread func/internal/podcachemanager.go
Copilot AI review requested due to automatic review settings June 18, 2026 18:45
@kushnaidu kushnaidu force-pushed the evict-pod-on-demand branch from 95ed6c3 to 3a83275 Compare June 18, 2026 18:46
@dosubot dosubot Bot added the lgtm label Jun 19, 2026
Comment thread func/internal/podevaluator.go Outdated
kushnaidu added 15 commits June 22, 2026 11:05
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
…from function runner deployment

Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
Copilot AI review requested due to automatic review settings June 22, 2026 10:05
@kushnaidu kushnaidu dismissed stale reviews from rendre-greyling and JamesMcDermott via 5ccfb47 June 22, 2026 10:05
@kushnaidu kushnaidu force-pushed the evict-pod-on-demand branch from 6e82268 to 5ccfb47 Compare June 22, 2026 10:05

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Comment thread func/internal/podevaluator.go
Comment thread func/internal/podcachemanager_eventloop_test.go Outdated
Signed-off-by: Kushal Harish Naidu <kushal.harish.naidu@ericsson.com>
@sonarqubecloud

@efiacor efiacor merged commit b8d69c0 into kptdev:main Jun 22, 2026
27 of 31 checks passed
@efiacor efiacor deleted the evict-pod-on-demand branch June 22, 2026 12:51
Labels

lgtm
size:L (This PR changes 100-499 lines, ignoring generated files.)

5 participants