ci: use pod-level Kubernetes resource variables #11836
🟢 Java Benchmark SLOs — All performance SLOs passed
PR vs. master results
Commit: Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.
@codex review
Codex Review: Didn't find any major issues. Already looking forward to the next diff. Reviewed commit:
Replace per-container KUBERNETES_CPU_REQUEST / KUBERNETES_MEMORY_* with
pod-level KUBERNETES_POD_* vars in both tier_m and tier_l anchors. Two
native-image Gradle builds are updated to read the renamed env var for
CPU parallelism sizing.

Behavior changes:
- Resources now budget the full pod (build + helper + init containers
  share a single quota) instead of only the build container, reducing
  the effective per-job cluster footprint.
- KUBERNETES_POD_CPU_LIMIT added (no CPU limit existed before); jobs
  lose burst headroom in exchange for tighter scheduling isolation.

Tier_m: 6 CPU / 16Gi. Tier_l: 10 CPU / 20Gi.

Feature flag ci.gitlab-runner.enable-pod-level-resources (rule v3) must
be enabled for this repo before merging to master. Draft PR opened for
pod-spec validation on the flag-enabled branch first.

Refs: CIEXE-2021, CIEXE-2150
Runner requires string type for KUBERNETES_POD_* vars. Match exact format used in reference PR924.
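A minimal sketch of what string-typed values could look like in the two anchors. The variable names and tier sizes (6 CPU / 16Gi, 10 CPU / 20Gi) come from this PR. The anchor layout and the choice to set each limit equal to its request are assumptions for illustration, not the actual diff.

```yaml
# Hypothetical layout: only the variable names and tier sizes are taken from this PR.
.tier_m: &tier_m
  variables:
    # Quoted so YAML parses the value as a string rather than an integer.
    KUBERNETES_POD_CPU_REQUEST: "6"
    KUBERNETES_POD_CPU_LIMIT: "6"          # assumption: limit equals request
    KUBERNETES_POD_MEMORY_REQUEST: "16Gi"
    KUBERNETES_POD_MEMORY_LIMIT: "16Gi"    # assumption: limit equals request

.tier_l: &tier_l
  variables:
    KUBERNETES_POD_CPU_REQUEST: "10"
    KUBERNETES_POD_CPU_LIMIT: "10"         # assumption: limit equals request
    KUBERNETES_POD_MEMORY_REQUEST: "20Gi"
    KUBERNETES_POD_MEMORY_LIMIT: "20Gi"    # assumption: limit equals request
```

An unquoted 6 would be parsed as an integer, which is the case the commit message above guards against. A value such as 16Gi is already a string in YAML, so quoting it only keeps the block consistent.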
7e69ce2 to dfb8bc9
Summary
Migrates dd-trace-java GitLab CI from per-container Kubernetes resource
variables to pod-level variables, as part of CIEXE-2021
(epic: CIEXE-2150).
Reference: DataDog/datadog-static-analyzer#924
What changed
- Replaced KUBERNETES_CPU_REQUEST / KUBERNETES_MEMORY_REQUEST / KUBERNETES_MEMORY_LIMIT in .tier_m and .tier_l anchors with KUBERNETES_POD_CPU_REQUEST, KUBERNETES_POD_CPU_LIMIT, KUBERNETES_POD_MEMORY_REQUEST, KUBERNETES_POD_MEMORY_LIMIT.
- Updated two native-image Gradle builds (quarkus-native, spring-boot-3.0-native) to read the renamed env var.
Behavior changes
- Resources now budget the full pod (build + helper + init containers
  share one quota) instead of stacking per-container reservations.
- KUBERNETES_POD_CPU_LIMIT is new; no CPU limit existed before. Jobs lose burst headroom in exchange for tighter scheduling isolation.
Rollout order (flag must precede merge)
1. Enable ci.gitlab-runner.enable-pod-level-resources rule v3 for DataDog/dd-trace-java at https://mosaic.us1.ddbuild.io/feature-flags/ci.gitlab-runner.enable-pod-level-resources?targeting-rule=v3
2. Validate the pod spec on the draft PR: confirm spec.resources shows the pod-level budget and the containers show empty resources.
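For the pod-spec check, the shape to look for might resemble the excerpt below. This is an illustrative sketch for a tier_m job, not output captured from a real run. The container names are assumptions, and the values assume request and limit are both set to the tier size.

```yaml
# Illustrative excerpt only: metadata, images, and other fields are omitted.
apiVersion: v1
kind: Pod
spec:
  resources:                     # pod-level budget shared by every container in the pod
    requests:
      cpu: "6"
      memory: 16Gi
    limits:
      cpu: "6"
      memory: 16Gi
  initContainers:
    - name: init-permissions     # hypothetical init container name
      resources: {}              # empty: no per-container reservation
  containers:
    - name: build                # hypothetical container name
      resources: {}
    - name: helper               # hypothetical container name
      resources: {}
```

If any container still carries its own requests or limits, the per-container variables are probably still being applied and the flag or the YAML change has not taken effect for that job.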
Rollback
Revert the merge commit first, then disable the flag, not the other
way around. With the flag off while the YAML is on master, jobs run
with scheduler defaults, which risks OOM on tier_l native-image jobs.