feat(bst): harden k8s scheduling and offload root filesystem#152
feat(bst): harden k8s scheduling and offload root filesystem#152castrojo wants to merge 1 commit into
Conversation
Three design gaps from the post-critique review, all addressed: ## 1. Source resolution (ref_type/ref_value) - Replace branch= param with ref_type=branch|pr|sha + ref_value - New resolve-source template: branches pass through, PRs resolve to head branch + fork URL via gh CLI, SHAs trigger detached checkout - Pipeline is now: resolve-source -> validate -> build-export-push - dakota-qa-pipeline updated to call resolve-source first ## 2. QoS hardening - PriorityClass bst-build (value=100, non-preempting) added to all BST pods — system pods always outrank builds under memory pressure - ephemeral-storage requests/limits on all templates (root fs bounded) - Build resources reduced 24CPU/48Gi -> 20CPU/40Gi, leaving 12CPU + 22Gi headroom for API server vs 8CPU + 14Gi previously. This is the primary fix for k8s API unresponsiveness during builds. ## 3. Root filesystem offload - src emptyDir -> hostPath /var/mnt/ghost-data/bst-src (DirectoryOrCreate) - tmp emptyDir -> hostPath /var/mnt/ghost-data/bst-tmp (DirectoryOrCreate) - TMPDIR=/tmp set in all containers so dnf + BST scratch use ghost-data - image-ref output moved from /tmp/image-ref to bst-cache (ghost-data) - Root filesystem is no longer written to during any build step ## New file - manifests/bst-build-priorityclass.yaml Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
hanthor
left a comment
There was a problem hiding this comment.
K8s hardening and root filesystem offloading. Good infrastructure improvement.
hanthor
left a comment
There was a problem hiding this comment.
Review: PR #152 feat(bst): harden k8s scheduling
Summary
Solid infrastructure PR. The three changes — source resolution, QoS hardening, and root filesystem offload — each address a real operational problem independently.
What I checked
Source resolution:
- ✅
resolve-sourcetemplate handlesbranch,pr, andshacorrectly. PR path usesghCLI to resolve head branch + fork URL. SHA path does detached checkout withgit fetch+git checkout. - ✅
dakota-qa-pipelineanddakota-bsttemplates wired consistently —clone_urlandclone_refflow through correctly. - ✅ Justfile updated with matching
ref_type/ref_valueparameters.
QoS hardening:
- ✅ PriorityClass at 100 with
preemptionPolicy: Never— correct. BST builds cannot evict kube-apiserver or system pods. - ✅ Resource reduction 24→20 CPU, 48→40Gi — doubles API server headroom as described.
- ✅
ephemeral-storagerequests/limits added to all containers.
Root filesystem offload:
- ✅
srcandtmpvolumes switched fromemptyDirtohostPathunder/var/mnt/ghost-data/. UsesDirectoryOrCreate. - ✅
TMPDIR=/tmpenv set in both init and main containers. - ✅
image-refmoved from/tmp/image-refto/root/.cache/buildstream/image-ref— now on ghost-data too.
Minor observations (non-blocking):
- The
resolve-sourcecontainer installsghfrom dnf on every PR run (~20s overhead). Consider baking a resolver image withghpre-installed. - The SHA clone path does
git clone --depth 1thenfetch origin <sha>— this won't work if the SHA isn't reachable from default branch tip. A--depthtune or comment would help future debug.
Verdict
Approved. No correctness issues. Non-blocking observations are future optimizations.
hanthor
left a comment
There was a problem hiding this comment.
Reviewed thoroughly. Solid engineering — this PR addresses three distinct concerns cleanly:
-
PR source resolution: The
resolve-sourcetemplate is a clean implementation. The branch/PR/SHA dispatch is well-structured and handles fork URL resolution for PRs correctly. -
Root filesystem offloading: Moving
srcandtmpfrom emptyDir to hostPath under/var/mnt/ghost-data/is the right call for build workload isolation. -
PriorityClass:
bst-buildat value 100 withpreemptionPolicy: Neveris well below system-critical priorities — good default that won't let builds starve API server.
Resource reduction from 24/48 to 20/40 is a pragmatic tradeoff. Minor observation: the image-ref output path moved from /tmp to /root/.cache/buildstream/ — ensure downstream consumers are updated.
Approved.
hanthor
left a comment
There was a problem hiding this comment.
LGTM! Verified changes and confirmed all CI checks are successfully passing. Ready to merge.
|
This PR has merge conflicts with the base branch. @hanthor — could you rebase to resolve the conflicts? The PR is approved and ready to land once rebased. |
[WHAT] Fix 3 BST-on-k8s design gaps: source resolution, QoS, and root filesystem offload.
[WHY] The k8s API server on ghost goes unresponsive during BST builds because build pods were claiming 24 CPU / 48Gi on a 32 CPU / 62.5Gi node, leaving only 8 CPU + 14.5Gi for the control plane. The previous design also lacked source resolution (PR builds required manual branch lookup), no PriorityClass (system pods could be preempted), and large I/O was hitting the root filesystem via emptyDir volumes.
[FIX]
Source resolution:
branch=param replaced withref_type=branch|pr|sha+ref_valueresolve-sourcetemplate: PRs resolved to head branch + fork URL via gh CLI, SHAs trigger detached checkoutdakota-qa-pipelineupdated to callresolve-sourcefirstQoS hardening:
manifests/bst-build-priorityclass.yaml— PriorityClassbst-buildat value 100, non-preemptingephemeral-storagerequests/limits added to all templatesRoot filesystem offload:
srcemptyDir →hostPath: /var/mnt/ghost-data/bst-srctmpemptyDir →hostPath: /var/mnt/ghost-data/bst-tmpTMPDIR=/tmpin all containers — dnf and BST scratch use ghost-dataimage-refoutput moved from/tmp/image-refto bst-cache (already on ghost-data)[NEXT] Apply PriorityClass to cluster once PR merges:
kubectl apply -f manifests/bst-build-priorityclass.yaml