New connections to draining Runners #659

@mpass99

Our current drain_on_shutdown strategy for stopping Nomad agents is:

  • On shutdown, the Nomad agent becomes ineligible, so no new runners are scheduled on it.
  • Until the drain_on_shutdown deadline expires, all running executions have time to finish.
    • ⚡ We still start new executions in runners on the draining agent, and these may not have enough time to finish.
  • After the deadline, the Nomad agent shuts down.

Executions that do not have enough time to finish result in a user-visible error.

We might need to exclude the affected runners from new executions as soon as their Nomad agent is about to shut down.
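
A minimal sketch of what such an exclusion could look like, assuming we keep an in-memory pool of idle runners and learn from Nomad when a node starts draining. All names here (`Runner`, `Pool`, `MarkNodeDraining`, `Claim`, `ErrNoRunnerAvailable`) are hypothetical and only illustrate the idea; they are not taken from the existing code base.

```go
package main

import (
	"errors"
	"sync"
)

// ErrNoRunnerAvailable is returned when every idle runner sits on a draining node.
// Hypothetical error, used for illustration only.
var ErrNoRunnerAvailable = errors.New("no idle runner on an eligible node")

// Runner is a simplified, hypothetical runner record.
type Runner struct {
	ID     string
	NodeID string // Nomad node (agent) that hosts the runner's allocation
}

// Pool is a hypothetical idle-runner pool that remembers which nodes are draining.
type Pool struct {
	mu       sync.Mutex
	idle     []Runner
	draining map[string]bool
}

// NewPool creates a pool that initially contains the given idle runners.
func NewPool(runners ...Runner) *Pool {
	return &Pool{idle: runners, draining: make(map[string]bool)}
}

// MarkNodeDraining records that a node is about to shut down.
// Assumption: it is called as soon as the node is observed to be ineligible.
func (p *Pool) MarkNodeDraining(nodeID string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.draining[nodeID] = true
}

// Claim removes and returns the first idle runner whose node is not draining.
// Runners on draining nodes stay in the pool but are never handed out.
func (p *Pool) Claim() (Runner, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for i, r := range p.idle {
		if p.draining[r.NodeID] {
			continue
		}
		p.idle = append(p.idle[:i], p.idle[i+1:]...)
		return r, nil
	}
	return Runner{}, ErrNoRunnerAvailable
}
```

With this approach, executions that are already running on the draining node are untouched and can still finish within the deadline; only the start of new executions is redirected to runners on other nodes.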

See #651


Unfortunately, we currently don't have any metric to count how often this issue occurs.

Labels: bug
