On arm64, ins_Move_Extend in src/coreclr/jit/instr.cpp:2008-2011 returns INS_sxtw when srcType == TYP_INT. The sxtw Xd, Wn instruction sign-extends 32 → 64 bits, but for a TYP_INT → TYP_INT reg-reg move the upper 32 bits are irrelevant — any subsequent 32-bit operation ignores them, and any genuine int → long widening has an explicit CAST in the IR that takes a different codegen path.
// src/coreclr/jit/instr.cpp, arm64 signed branch:
else if (srcType == TYP_INT)
{
    ins = INS_sxtw; // ← INS_mov would suffice
}
else
{
    ins = INS_mov;
}
INS_mov (encoded as orr Wd, WZR, Wn on arm64) zero-extends the upper 32 bits — exactly what we want for an int-typed local — and is strictly cheaper than sxtw on common cores.
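The register-level claim above can be checked with a small model. This is an illustrative sketch only, not JIT code: `model_sxtw`, `model_mov_w`, and `read_w` are hypothetical names, and a 64-bit X register is modeled as a plain `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Model of `sxtw Xd, Wn`: sign-extend the low 32 bits of the source to 64 bits.
inline uint64_t model_sxtw(uint64_t src)
{
    const int32_t low = static_cast<int32_t>(static_cast<uint32_t>(src));
    return static_cast<uint64_t>(static_cast<int64_t>(low));
}

// Model of `mov Wd, Wn`: copy the low 32 bits; writing a W register zeroes bits 63:32.
inline uint64_t model_mov_w(uint64_t src)
{
    return static_cast<uint64_t>(static_cast<uint32_t>(src));
}

// What a later 32-bit instruction observes: only the low 32 bits of the register.
inline uint32_t read_w(uint64_t reg)
{
    return static_cast<uint32_t>(reg);
}

// Hypothetical check: both moves look identical to 32-bit consumers,
// even though the full 64-bit register contents differ for negative ints.
inline void check_int_moves()
{
    const uint64_t minus_one = 0x00000000FFFFFFFFull; // int -1 held in a W register
    assert(read_w(model_sxtw(minus_one)) == read_w(model_mov_w(minus_one)));
    assert(model_sxtw(minus_one) != model_mov_w(minus_one));
}
```

In this model the two moves differ only in bits 63:32, which is why the difference matters only to a consumer that reads the full X register.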
Note
AI-assisted (Copilot CLI).
Repro
foreach over a custom Range enumerator whose Current is exposed as an auto-property (int Current { get; private set; }) emits the sxtw in its hot loop. See #40770 for full details.
Inner loop on current main (arm64, FullOpts):
(disassembly listing not preserved in this copy)
Replacing the auto-property with a plain int Current; field eliminates the sxtw and matches the for-loop codegen exactly.
Local patch + measurement
Single-line change: ins = INS_sxtw; → ins = INS_mov; on arm64 for the TYP_INT case.
Rebuilt checked arm64 JIT and re-measured the repro from #40770 (Apple M4 Max, N=100, 5M outer iterations). Variants measured (timings not preserved in this copy):
- foreach over Range (auto-prop)
- foreach over enumerator (plain field)
- for (int i = 1; i < n; i++)
Sanity checks for (long)int, (long)int.MinValue, -int.MinValue, etc. continue to produce correct results; the explicit CAST path is unaffected.
Caveats
ins_Move_Extend is called from multiple sites, not just STORE_LCL_VAR in codegenarm64.cpp:
- codegenarm64.cpp:3020 (STORE_LCL_VAR, the case verified above)
- codegencommon.cpp:7227 (return value codegen)
- hwintrinsiccodegenarm64.cpp:2861, :2900 (HW intrinsic helpers)
The other callers should be reviewed before the change is merged, and an SPMI diff would confirm the broader impact (and catch any callers that relied on the sign-extension to observe the value correctly as 64 bits without an explicit CAST).
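The kind of caller this caveat is about can be illustrated with a minimal sketch. It assumes a hypothetical consumer, `read_x_as_long`, that reads the whole X register as a signed 64-bit value without an explicit CAST; none of these names exist in the JIT, and whether any real call site behaves this way is exactly what the review would need to establish.

```cpp
#include <cassert>
#include <cstdint>

// Register contents after moving a 32-bit int with each candidate instruction (models only).
inline uint64_t after_sxtw(int32_t value)
{
    return static_cast<uint64_t>(static_cast<int64_t>(value)); // upper bits copy the sign
}

inline uint64_t after_mov_w(int32_t value)
{
    return static_cast<uint64_t>(static_cast<uint32_t>(value)); // upper bits are zero
}

// Hypothetical consumer that uses all 64 bits, e.g. as an index in address arithmetic.
inline int64_t read_x_as_long(uint64_t reg)
{
    return static_cast<int64_t>(reg);
}
```

Under this model, a negative int moved with `sxtw` is still negative when read as 64 bits, while the same value moved with `mov` reads back as a large positive number. Non-negative values are unaffected either way.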
Related
- foreach Range case where this was first noticed.
- CINC/CSEL not emitted inside the loops instead of jumps over single instruction blocks #96380: separate if-conversion limitation that affects similar inner-loop scenarios.
cc @dotnet/jit-contrib