feat(chapel): Wave 2 — chapel-multilocale gate (-nl 2 via gasnet+smp, #87 option A)#99
Merged
Adds a 7th strict gate to chapel-ci.yml that exercises real multilocale execution by building Chapel 2.8.0 from source with `CHPL_COMM=gasnet` and `CHPL_LAUNCHER=smp`, then running `mass-panic --numLocales=2` against the same synthetic 2-repo corpus as `chapel-e2e`.

Closes the Wave 1 gap: the stock `.deb` ships `CHPL_COMM=none` and rejects `-nl >1`, so until now the multi-locale code path had no CI coverage. The `smp` launcher and `smp` GASNet substrate let two locales run as oversubscribed local processes on a single ubuntu-22.04 runner (verification, not performance).

Implementation choice: owner picked option A from issue #87:
- Build from source with `CHPL_COMM=gasnet`, aggressive caching.
- Not option B (chapel-multilocale .deb: none published upstream).
- Not option C (self-hosted runner: no infrastructure to maintain).

Cache strategy:
- `$CHPL_HOME = /opt/chapel-multilocale` cached on `actions/cache@v4`.
- Key stable on `${runner.os}-chapel-multilocale-2.8.0-gasnet-smp-v1`.
- Bump `CHAPEL_MULTILOCALE_CACHE_GEN` env var to invalidate.
- Cold build: ~30-40 min on 2-core runner. Warm restore: ~30s.
- Cache eviction after 7 days idle (GitHub policy); chapel/** touches in normal repo activity keep it warm.

Aggregator gate updated:
- `chapel-ci-gate` now waits on 7 jobs (added `chapel-multilocale`).
- `R_MULTILOCALE` env var added to the success-aggregation loop.
- Doc comments updated from "six gates" → "seven gates".

Acceptance criteria from #87 (partial closure):
- [x] `chapel-multilocale` job defined and wired into aggregator
- [ ] Job green on PR + main ≥1 merge cycle (this PR)
- [ ] Added to Base ruleset `required_status_checks` (separate ruleset edit)
- [ ] ~50-repo benchmark for README perf claim (separate PR; needs either a beefier runner or self-hosted CI to be meaningful)

The aggregator job is the only one in the ruleset, so 7 → 7 is a no-op there: when this PR is green and merged, the aggregator just covers one more underlying job. Ruleset bump is therefore optional unless the owner wants per-gate visibility.

Refs: #87
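The success-aggregation loop mentioned above might look roughly like the following sketch. Only `R_MULTILOCALE` is a name taken from this PR; `R_BUILD`, `R_E2E`, and the `all_gates_green` helper are hypothetical stand-ins, and the actual loop in chapel-ci.yml may be structured differently.

```shell
set -eu

# Hypothetical per-job results. In a workflow these would be filled from the
# job results; R_MULTILOCALE is the only name taken from the PR text.
R_BUILD="${R_BUILD:-success}"
R_E2E="${R_E2E:-success}"
R_MULTILOCALE="${R_MULTILOCALE:-success}"

# Succeeds only if every "name=result" argument has the result "success".
all_gates_green() {
  failed=0
  for pair in "$@"; do
    name="${pair%%=*}"     # text before the first '='
    result="${pair#*=}"    # text after the first '='
    if [ "$result" != "success" ]; then
      echo "gate ${name} reported: ${result}" >&2
      failed=1
    fi
  done
  return "$failed"
}

if all_gates_green "build=$R_BUILD" "e2e=$R_E2E" "multilocale=$R_MULTILOCALE"; then
  echo "all gates green"
else
  echo "at least one gate is not green" >&2
fi
```

Adding a gate in this shape means adding one more `name=result` argument, which matches the "one more underlying job" framing above.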
util/setchplenv.bash references ${MANPATH} unconditionally. Under
set -euo pipefail on a clean GH runner (MANPATH not exported), this
trips 'MANPATH: unbound variable' and aborts before chpl --make
even starts.
Fix: export MANPATH=${MANPATH:-} in both 'Build Chapel from source'
and 'Activate multilocale Chapel' steps.
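
A minimal sketch of the failure mode and the fix, runnable outside CI. It uses only `set -u` behaviour and the `${MANPATH:-}` default expansion from the commit; the `unguarded` and `guarded` variable names are illustrative.

```shell
set -eu

# Simulate a clean runner where MANPATH was never exported.
unset MANPATH

# An unguarded ${MANPATH} expansion aborts a `set -u` shell. Run it in a
# subshell so this demo can observe the failure and keep going.
if (: "${MANPATH}") 2>/dev/null; then
  unguarded="ok"
else
  unguarded="unbound"
fi

# The fix from the commit: default to empty if unset, then export.
export MANPATH="${MANPATH:-}"

# The same expansion is now safe under `set -u`.
if (: "${MANPATH}") 2>/dev/null; then
  guarded="ok"
else
  guarded="unbound"
fi

echo "before fix: ${unguarded}; after fix: ${guarded}"
```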
CHANGELOG.md gets an Added-2026-06-01 entry for #99 (option A closure of #87): 7th strict chapel-ci gate, gasnet+smp single-host oversubscribed, source-built + $CHPL_HOME-cached, cross-locale verification via system-image-*.json grep. ROADMAP.adoc v3.0.0 'Multi-machine orchestration' bullet split into two: [x] single-host oversubscribed (Wave 2 landed) and [ ] cross-node gasnet/ofi over a real NIC (Wave 3, needs cluster runner).
Second cold-build attempt failed at:
Error: Please set the environment variable CHPL_LLVM to a supported value.
1) 'none' to build with minimal LLVM support
2) 'bundled' ...
3) 'system' ...
Chapel's compiler-builds tries to verify LLVM headers via
clang/Basic/Version.h before building; on Ubuntu 22.04 we don't have
LLVM dev headers installed, and we don't need them — CHPL_TARGET_COMPILER=gnu
already targets the C backend.
Setting CHPL_LLVM=none disables the LLVM backend entirely. Multilocale
GASNet+smp comms don't depend on LLVM, only on the runtime layer.
🔍 Hypatia Security Scan
Findings: 96 issues detected
[
{
"reason": "Action uses: dtolnay/rust-toolchain@4be9e76fd7c4901c61fb841f5599 needs attention",
"type": "unpinned_action",
"file": "e2e.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action es: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb needs attention",
"type": "unpinned_action",
"file": "e2e.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in boj-build.yml",
"type": "missing_timeout_minutes",
"file": "boj-build.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in cargo-audit.yml",
"type": "missing_timeout_minutes",
"file": "cargo-audit.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in casket-pages.yml",
"type": "missing_timeout_minutes",
"file": "casket-pages.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in casket-pages.yml",
"type": "missing_timeout_minutes",
"file": "casket-pages.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in chapel-ci.yml",
"type": "missing_timeout_minutes",
"file": "chapel-ci.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in chapel-ci.yml",
"type": "missing_timeout_minutes",
"file": "chapel-ci.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Issue in chapel-ci.yml",
"type": "missing_timeout_minutes",
"file": "chapel-ci.yml",
"action": "flag",
"rule_module": "workflow_audit",
"severity": "medium"
}
]
Powered by Hypatia Neurosymbolic CI/CD Intelligence
Run #3 cold build completed successfully (~12 min; libchpllaunch.a + modules + cmake module files all generated), then the post-build sanity check failed with:

Unrecognized flag: '--about' (use '-h' for help)

Chapel 2.8.0 dropped 'chpl --about'. Use the canonical $CHPL_HOME/util/printchplenv --simple invocation instead. It prints KEY=value lines, so the regex anchor changes from ':\s+' to '='.

Applied in both 'Build Chapel from source' and 'Activate multilocale Chapel' steps.
Run #4 progressed past Build (Chapel 2.8.0 + bundled LLVM support compiled in ~12 min; libchpllaunch.a + modules + cmake module files all generated). Failed in Activate when:

"$CHPL_HOME/util/printchplenv" --simple | grep -E '^CHPL_COMM=gasnet$'

emitted zero output, then exit 1 (presumably printchplenv --simple uses a different output format than KEY=value, or it ran into a chplenv caching issue).

Switch to a format-independent check. Chapel's runtime layout names $CHPL_HOME/lib/<plat>/gnu/<arch>/loc-flat/comm-gasnet/smp/<tasks>/launch-smp/ per (comm, launcher, ...) variant. Existence of comm-gasnet/smp/.../launch-smp proves the runtime was built with our CHPL_COMM+CHPL_LAUNCHER settings: no chpl-flag-version dependence, no parsing format risk.
Run #5: build COMPLETED (libchpllaunch.a + libchplmalloc.a + modules all generated successfully), then the sanity glob missed the actual runtime path:

No such file or directory: /opt/chapel-multilocale/lib/*/gnu/*/loc-flat/comm-gasnet/smp/*/launch-smp

The real path includes an extra tasks-* component:

lib/linux64/gnu/x86_64/loc-flat/comm-gasnet/smp/fast/tasks-qthreads/launch-smp/...

The glob was missing the tasks-* slot between perf-flavor and launch-smp. Switch to find -name comm-gasnet -print -quit: hierarchy-independent, survives minor-version path renames, succeeds iff the gasnet runtime variant was built.
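
The difference between the two checks can be reproduced on a throwaway directory tree. This is a sketch, not the workflow step itself: `fake_home` stands in for `$CHPL_HOME`, the directory layout is copied from the path quoted above, and `-type d` is an addition to the `find` invocation named in the commit.

```shell
set -eu

# Stand-in for $CHPL_HOME: a temp tree mirroring the path quoted in the commit.
fake_home="$(mktemp -d)"
mkdir -p "$fake_home/lib/linux64/gnu/x86_64/loc-flat/comm-gasnet/smp/fast/tasks-qthreads/launch-smp"

# The fixed-depth glob from the earlier attempt: a single `*` between smp and
# launch-smp, so it cannot span both fast/ and tasks-qthreads/.
glob_hit="no"
for d in "$fake_home"/lib/*/gnu/*/loc-flat/comm-gasnet/smp/*/launch-smp; do
  if [ -d "$d" ]; then
    glob_hit="yes"
  fi
done

# Depth-independent check: print the first directory named comm-gasnet and stop.
find_hit="$(find "$fake_home/lib" -type d -name comm-gasnet -print -quit)"

echo "glob matched: ${glob_hit}"
echo "find matched: ${find_hit:-nothing}"

rm -rf "$fake_home"
```

One trade-off of the `find` form is that it only proves a `comm-gasnet` directory exists somewhere under `lib`; it no longer checks the launcher component that the glob was trying to pin down.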
hyperpolymath added a commit that referenced this pull request Jun 2, 2026
…ump cache gen v1→v2

PR #100 surfaced a 5th Chapel-2.8.0 sharp edge from the #99 Wave 2 series:

error: The runtime has not been built for this configuration.
There is no runtime for 'CHPL_UNWIND=bundled'
Valid options: system

Root cause: cached $CHPL_HOME from #99 was built when libunwind-dev was installed (gated on cache-miss), so chpl auto-inferred CHPL_UNWIND=system. On PR #100 the cache hit, libunwind-dev install was SKIPPED, so the consumer chpl invocation auto-inferred CHPL_UNWIND=bundled. That mismatches the cached runtime, and the mass-panic build aborts.

Three-part fix at source:
1. Pin CHPL_UNWIND=system explicitly in both Build and Activate steps (no more auto-inference, no more cache-hit/miss drift).
2. Promote libunwind-dev to always-run (split out of the cache-miss-gated Install step). Cheap to install (~1-2s apt) on every run; matches the cached configuration.
3. Bump CHAPEL_MULTILOCALE_CACHE_GEN v1→v2 to discard the inconsistent cache and force a fresh build with the explicit CHPL_UNWIND.

Also includes the docs(truthfulness) audit from this branch:
- README badge corrected: 402 → 782 runnable tests (cargo authoritative)
- ROADMAP fly.toml [x] → [~] (file doesn't exist in this repo)
- ROADMAP/Wiki '500+ repositories' → '303-repo estate (2026-04-12)'
- chapel/README ~5-15%× soften to UNMEASURED ESTIMATE
- README + Wiki note PA001/PA001b SARIF collapse

Resolves blocking chapel-multilocale failure on PR #100.
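
One way to keep the Build and Activate steps from drifting is to pin the configuration in a single place and derive the cache key from an explicit generation value. The sketch below assumes that approach: `pin_chapel_env` and `cache_key` are hypothetical helpers, not functions from the workflow, and only the variable names, values, and key format come from the text above.

```shell
set -eu

# Hypothetical helper: one place that pins the Chapel configuration, meant to
# be shared by the build step and the activate step so they cannot diverge.
pin_chapel_env() {
  export CHPL_HOME="/opt/chapel-multilocale"
  export CHPL_COMM="gasnet"
  export CHPL_LAUNCHER="smp"
  export CHPL_UNWIND="system"   # explicit, so nothing is inferred from installed packages
}

# Hypothetical key builder: the trailing generation is the part a bump changes.
cache_key() {
  os="$1"
  gen="$2"
  echo "${os}-chapel-multilocale-2.8.0-gasnet-smp-${gen}"
}

pin_chapel_env
echo "CHPL_UNWIND=${CHPL_UNWIND}"
echo "old key: $(cache_key Linux v1)"
echo "new key: $(cache_key Linux v2)"
```

Because the generation is part of the key, a v1→v2 bump produces a key that no existing cache entry matches, which forces the fresh build described in step 3.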
Closes Wave 1 → Wave 2 transition for the optional Chapel mass-panic harness. Adds the seventh strict gate to `chapel-ci.yml`, exercising real multilocale execution by building Chapel 2.8.0 from source with `CHPL_COMM=gasnet` + `CHPL_LAUNCHER=smp` and running `mass-panic --numLocales=2` against the synthetic 2-repo corpus.

What this closes

- `chapel-multilocale` job defined + wired into `chapel-ci-gate` aggregator (now 7-of-7 instead of 6-of-6)
- `smp` launcher + `smp` GASNet substrate run two locales as oversubscribed local processes on a single GH runner

Toolchain choice: option A from #87

Owner picked the build-from-source path. Not option B (no upstream `chapel-multilocale-2.8.0.deb`); not option C (no self-hosted runner infra).

Cache strategy

- `$CHPL_HOME = /opt/chapel-multilocale` cached via `actions/cache@v4`
- Key: `${runner.os}-chapel-multilocale-2.8.0-gasnet-smp-v1`
- Bump the `CHAPEL_MULTILOCALE_CACHE_GEN` env var to invalidate
- `chapel/**` activity keeps it warm

What this does NOT close (filed elsewhere or deferred)

- `required_status_checks` bump: optional. The aggregator is the only gate in the ruleset, and 7 ≤ 7 inside it is a no-op. Filed as a doc note instead of a separate PR.
- ~50-repo benchmark for the README perf claim (`-nl 2` on a 2-core runner doesn't produce credible numbers). Tracked at #87 ("chapel: Wave 2, real multi-locale cluster validation (-nl 16+) on a non-trivial corpus"), acceptance bullet 3.

Test plan