Hi!
Downstream in Gentoo, toralf reported a test failure for johntheripper-jumbo (at commit b27f951; I've checked subsequent commits and they shouldn't affect this) looking like:
Testing: skey, S/Key [MD4/MD5/SHA1/RMD160 32/64]... FAILED (cmp_all(1))
[...]
1 out of 758 tests have FAILED
* ERROR: app-crypt/johntheripper-jumbo-1.9.0_p20250703::gentoo failed (test phase):
* (no error message)
toralf hit this with trunk GCC (so to-be-16) with -O2 -pipe -march=native, where -march=native is whatever GCC resolves it to (vagueness for a reason, see below) on an AMD Ryzen 9 5950X.
I couldn't reproduce that with -march=znver2 or -march=znver3 (or -march=native) on my 3950X but decided not to worry about that, because I could reproduce with -O3 -flto and just debugged that instead. It also passes with -fno-strict-aliasing.
Running the failing test under Valgrind:
/var/tmp/portage/app-crypt/johntheripper-jumbo-1.9.0_p20250703/work/john-b27f951a8e191210685c8421c90ca610cdd39dce/test # valgrind ../run/john --config=john.conf --test --format=skey
==164563== Memcheck, a memory error detector
==164563== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==164563== Using Valgrind-3.26.0.GIT and LibVEX; rerun with -h for copyright info
==164563== Command: ../run/john --config=john.conf --test --format=skey
==164563==
--164563-- Warning: <14704> DW_TAG_subprogram with no DW_AT_name and no DW_AT_specification or DW_AT_abstract_origin in /var/tmp/portage/app-crypt/johntheripper-jumbo-1.9.0_p20250703/work/john-b27f951a8e191210685c8421c90ca610cdd39dce/run/john
--164563-- Warning: zero subprog, missing DW_AT_abstract_origin in DW_TAG_inlined_subroutine in /var/tmp/portage/app-crypt/johntheripper-jumbo-1.9.0_p20250703/work/john-b27f951a8e191210685c8421c90ca610cdd39dce/run/john
Benchmarking: skey, S/Key [MD4/MD5/SHA1/RMD160 32/64]... ==164563== Conditional jump or move depends on uninitialised value(s)
==164563== at 0x44C6C0A: is_key_right.part.0 (formats.c:596)
==164563== by 0x44CCFE9: UnknownInlinedFun (formats.c:1312)
==164563== by 0x44CCFE9: UnknownInlinedFun (formats.c:1329)
==164563== by 0x44CCFE9: fmt_self_test (formats.c:2079)
==164563== by 0x44C3673: benchmark_all (bench.c:882)
==164563== by 0x44DB627: UnknownInlinedFun (john.c:1684)
==164563== by 0x44DB627: main (john.c:2110)
==164563==
FAILED (cmp_all(1))
I won't bore you with the snaking through, but it ended up being that formats.o and SKEY_fmt_plug.o were entirely innocent and the issue was in ripemd.o. In the end, it was ripemd160_round, specifically its sph_dec32le_aligned call from sph_types.h (john/src/sph_types.h, lines 1621 to 1625 in cb0c337):
static SPH_INLINE sph_u32
sph_dec32le_aligned(const void *src)
{
#if SPH_LITTLE_ENDIAN
	return *(const sph_u32 *)src;
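sph_dec32le_aligned breaks C's aliasing rules by reading an unsigned char buffer through a sph_u32*. You can access any object through an unsigned char*, but not the other way around: an unsigned char object can't be read through a pointer to a different type such as sph_u32*.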
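For illustration only, here is a rough sketch of how such a load could be written without the aliasing violation. This is not the project's code: load32le_bytes and load32_native are made-up names, and I haven't measured whether either variant costs anything in the hot path (compilers typically turn both into a plain load at -O2, but that's an assumption).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a little-endian 32-bit value one byte at a
 * time. Only unsigned char accesses are used, so no aliasing rule is
 * broken, and the result does not depend on host endianness. */
static inline uint32_t load32le_bytes(const void *src)
{
	const unsigned char *p = (const unsigned char *)src;

	return (uint32_t)p[0]
	    | ((uint32_t)p[1] << 8)
	    | ((uint32_t)p[2] << 16)
	    | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: native-endian load via memcpy, which copies the
 * object representation and is also fine under strict aliasing. */
static inline uint32_t load32_native(const void *src)
{
	uint32_t v;

	memcpy(&v, src, sizeof v);
	return v;
}
```

On a little-endian host both helpers return the same value for the same buffer, which is what the SPH_LITTLE_ENDIAN branch wants.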
Marking sph_u32 (and friends) with __attribute__((may_alias)) fixes the issue for me, though that's not standard C.
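A minimal sketch of the may_alias approach, assuming GCC or Clang (the attribute is a GNU extension). The names u32_alias and load32_may_alias are made up for this example and are not what sph_types.h uses; the actual workaround touched the sph_u32 typedefs themselves.

```c
#include <stdint.h>
#include <string.h>

/* GNU extension: accesses through this type are treated like character
 * accesses for alias analysis, so they may alias any other object. */
typedef uint32_t u32_alias __attribute__((may_alias));

/* Hypothetical helper: native-endian load through the may_alias type.
 * src must still be suitably aligned for a 32-bit access. */
static inline uint32_t load32_may_alias(const void *src)
{
	return *(const u32_alias *)src;
}
```

Note that this only addresses aliasing; it does nothing about alignment, so it only suits callers like the _aligned variants that already guarantee it.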
(Returning to toralf's environment: given the specifics of the problem in the end and where the aliasing happens (in the same TU), I'm reasonably confident it's the same problem that he hit given he wasn't using LTO.)