ci(codeql): advanced SHA-pinned CodeQL workflow by jackmoxley · Pull Request #13 · mootable/decimal-scaled

jackmoxley · 2026-05-23T23:13:33Z

Adds the advanced CodeQL workflow to main so code scanning can be switched from the default setup to the committed, SHA-pinned advanced workflow.

- languages: rust, build-mode: none (the Rust extractor works from source, so no cargo build and no feature-flag juggling across the wide tiers).
- Actions are SHA-pinned (checkout v4, codeql-action v4).
- Triggers: push and PR on main, plus a weekly Monday 07:00 UTC drift cron.

Once merged, flip Settings -> Code security -> CodeQL analysis -> Switch to advanced; the codeql.yml/badge.svg badge then becomes available for the README.

Replaces the repo's default code-scanning setup with a committed, SHA-pinned advanced workflow (languages: rust, build-mode: none). Lands on main so the 'Switch to advanced' option in Settings -> Code security takes effect and the codeql.yml badge becomes available.

codspeed-hq · 2026-05-23T23:17:11Z

Merging this PR will degrade performance by 11.23%

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 2 improved benchmarks
❌ 11 regressed benchmarks
✅ 197 untouched benchmarks
⏩ 9 skipped benchmarks¹

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|
| ⚡ | `floor` | 346.9 ns | 288.6 ns | +20.21% |
| ❌ | `from_f64` | 460.8 ns | 548.3 ns | -15.96% |
| ⚡ | `pow_8` | 924.2 ns | 836.7 ns | +10.46% |
| ❌ | `B_handrolled[bound_1e18]` | 385.8 ns | 473.3 ns | -18.49% |
| ❌ | `B_handrolled[mid_1e15]` | 384.2 ns | 471.7 ns | -18.55% |
| ❌ | `B_handrolled[small_1e9]` | 384.2 ns | 471.7 ns | -18.55% |
| ❌ | `D_mg_magic[bound_1e18]` | 361.4 ns | 419.7 ns | -13.9% |
| ❌ | `D_mg_magic[mid_1e15]` | 361.4 ns | 419.7 ns | -13.9% |
| ❌ | `D_mg_magic[small_1e9]` | 361.4 ns | 419.7 ns | -13.9% |
| ❌ | `D_mg_magic[wide_1e22]` | 428.1 ns | 486.4 ns | -11.99% |
| ❌ | `E_production[bound_1e18]` | 521.1 ns | 608.6 ns | -14.38% |
| ❌ | `E_production[mid_1e15]` | 519.4 ns | 606.9 ns | -14.42% |
| ❌ | `E_production[small_1e9]` | 519.4 ns | 606.9 ns | -14.42% |
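
For reading the Efficiency column: the figures above appear consistent with `BASE / HEAD - 1`, expressed as a percentage. That relationship is inferred from the numbers in this table, not taken from CodSpeed's documentation, and the function name below is hypothetical. A minimal sketch:

```rust
/// Relative change in percent, assuming (from the table values only)
/// that the column is computed as base / head - 1.
/// Positive means HEAD is faster; negative means HEAD is slower.
fn efficiency_pct(base_ns: f64, head_ns: f64) -> f64 {
    (base_ns / head_ns - 1.0) * 100.0
}
```

Under that assumption, `from_f64` (460.8 ns to 548.3 ns) comes out at roughly -15.96%, matching the row above. Small differences in the last digit are expected because the displayed timings are rounded.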

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

Comparing ci/codeql-advanced (3fe409b) with main (0710fcf)

¹ 9 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

The previous `neg_twos_complement` did a two-pass shape:

1. NOT loop into out[N] (N writes).
2. `add_assign_fixed(out, [1, 0, …, 0])` (a full N-limb dependent carry chain over a second stack array, even though limbs 1..N add `0` after limb 0).

At wide N the dependent add chain across every limb dominates: each overflowing_add reads the previous carry, blocking vectorisation, and the second stack array is pure overhead.

Replace with a limb-0 split:

- `out[0] = !a[0] + 1`, capture the carry `c0`.
- If `c0 == false` (the overwhelmingly common path), limbs 1..N reduce to plain independent `!a[i]` writes: no cross-limb dependency chain, so the compiler can keep them register-resident and vectorise the NOT loop.
- If `c0 == true` (`a[0] == 0`, so `!a[0] == MAX`), fall back to a dependent carry-prop chain through limbs 1..N (the correct, slow path).

Generic over `N`, single kernel: no per-tier copies, no LimbSize axis, no Scratch-on-Int needed. Constitution rules 1-6 hold: one generic algorithm, one named file, matcher unchanged, sizing local to width.

A/B verdict (benches/micro/neg_kernel_ab.rs, 6 inputs covering tiny / half_wide / mid / high / low / carry_chain):

  D462  (N=24): fused_split ≈ two_pass (within ±10%, noisy)
  D616  (N=32): fused_split beats two_pass by 1.25-1.83x
  D924  (N=48): fused_split beats two_pass by 1.42-2.42x
  D1232 (N=64): fused_split beats two_pass by 1.54-1.63x

Recovers ranks #23/#27/#28 (D616), #31 (D1232) of the bbc §8.4 wide-neg cluster; D462 (#13/#17/#19/#20) is a wash at the kernel level (any remaining gap lives in the call shape, not the kernel).

Bench seam: `__bench_internals::neg_fused_split` (routed kernel), `neg_two_pass` (previous shape, reference baseline), `neg_fused_open` (single-pass dependent-chain candidate). All bit-identical, asserted before timing.

Validation: 6 kernel unit tests + 785 lib tests pass. `cargo check` (default) + `cargo check --features wide,x-wide,xx-wide,macros --all-targets` both clean.
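
The limb-0 split described in this commit message can be sketched roughly as below. This is an illustration, not the crate's actual kernel: the `u64` limb type, the little-endian limb order, and both function bodies are assumptions, and only the names `neg_fused_split` and `neg_two_pass` are borrowed from the message. Note that `!a[0] + 1` carries out exactly when `a[0] == 0`.

```rust
/// Two's-complement negation of a little-endian N-limb integer,
/// using the limb-0 split. Limb type u64 is an assumption.
fn neg_fused_split<const N: usize>(a: &[u64; N]) -> [u64; N] {
    let mut out = [0u64; N];
    if N == 0 {
        return out;
    }
    // Limb 0: !a[0] + 1. The add overflows only when a[0] == 0.
    let (lo, c0) = (!a[0]).overflowing_add(1);
    out[0] = lo;
    if !c0 {
        // Common path: no carry, so the upper limbs are independent NOTs.
        for i in 1..N {
            out[i] = !a[i];
        }
    } else {
        // Slow path: propagate the carry through limbs 1..N.
        let mut carry = true;
        for i in 1..N {
            let (v, c) = (!a[i]).overflowing_add(carry as u64);
            out[i] = v;
            carry = c;
        }
    }
    out
}

/// Reference two-pass shape: NOT every limb, then add 1 with a full
/// dependent carry chain across all limbs.
fn neg_two_pass<const N: usize>(a: &[u64; N]) -> [u64; N] {
    let mut out = [0u64; N];
    for i in 0..N {
        out[i] = !a[i];
    }
    let mut carry = true;
    for limb in out.iter_mut() {
        let (v, c) = limb.overflowing_add(carry as u64);
        *limb = v;
        carry = c;
    }
    out
}
```

Both functions should return bit-identical results for any input; the difference is only that the split version avoids the cross-limb dependency whenever limb 0 is non-zero.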

jackmoxley merged commit a9c0e9b into main May 23, 2026
7 checks passed

jackmoxley deleted the ci/codeql-advanced branch May 23, 2026 23:21
