Tagging subscribers to this area: @dotnet/area-system-io-hashing, @bartonjs, @vcsjones
Pull request overview
Adds support for computing CRC-32 and CRC-64 using caller-specified parameter sets, while keeping the existing optimized implementations for the default CRC-32 / CRC-64 variants (and CRC-32C via intrinsics where available). This expands System.IO.Hashing to cover additional CRC variants requested in #123164.
Changes:
- Introduce `Crc32ParameterSet` / `Crc64ParameterSet` types with factory creation and a small set of well-known presets (e.g., CRC-32, CRC-32C, CRC-64, CRC-64/NVME).
- Add parameterized overloads/constructors on `Crc32` and `Crc64` that accept a parameter set, and route hashing through the parameter set implementation.
- Add extensive new test coverage for parameterized CRC-32/CRC-64 variants and known-answer tests for well-known parameter sets.
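
For readers unfamiliar with the technique, the byte-by-byte table lookup described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `SketchCrc32`, its constructor parameters, and `Compute` are hypothetical names, and the sketch assumes a single `reflect` flag covers both input and output reflection.

```csharp
using System;

// Hypothetical illustration type; not the Crc32ParameterSet added by this PR.
public sealed class SketchCrc32
{
    private readonly uint[] _table = new uint[256];
    private readonly uint _initialValue;
    private readonly uint _finalXor;
    private readonly bool _reflect;

    public SketchCrc32(uint polynomial, uint initialValue, uint finalXor, bool reflect)
    {
        _initialValue = initialValue;
        _finalXor = finalXor;
        _reflect = reflect;

        // Reflected mode uses the bit-reversed polynomial and shifts right;
        // forward mode uses the polynomial as given and shifts left.
        uint poly = reflect ? ReverseBits(polynomial) : polynomial;

        for (uint i = 0; i < 256; i++)
        {
            uint value = reflect ? i : i << 24;
            for (int bit = 0; bit < 8; bit++)
            {
                if (reflect)
                    value = (value & 1) != 0 ? (value >> 1) ^ poly : value >> 1;
                else
                    value = (value & 0x80000000) != 0 ? (value << 1) ^ poly : value << 1;
            }
            _table[i] = value;
        }
    }

    public uint Compute(ReadOnlySpan<byte> data)
    {
        // The reflected register starts from the bit-reversed initial value.
        uint crc = _reflect ? ReverseBits(_initialValue) : _initialValue;

        foreach (byte b in data)
        {
            // One table lookup per input byte.
            crc = _reflect
                ? _table[(crc ^ b) & 0xFF] ^ (crc >> 8)
                : _table[((crc >> 24) ^ b) & 0xFF] ^ (crc << 8);
        }

        return crc ^ _finalXor;
    }

    private static uint ReverseBits(uint value)
    {
        uint result = 0;
        for (int i = 0; i < 32; i++)
        {
            result = (result << 1) | (value & 1);
            value >>= 1;
        }
        return result;
    }
}
```

With polynomial `0x04C11DB7`, initial value and final XOR of `0xFFFFFFFF`, and `reflect: true`, this sketch should reproduce the commonly published CRC-32 check value `0xCBF43926` for the ASCII input `123456789`. A per-parameter-set table like this is simple but processes one byte per step, which is why the description calls out vectorization as follow-up work.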
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc32.cs | Adds Crc32ParameterSet constructor/property and parameterized Hash/TryHash/HashToUInt32 overloads; routes computation through parameter set. |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc64.cs | Adds Crc64ParameterSet constructor/property and parameterized Hash/TryHash/HashToUInt64 overloads; routes computation through parameter set. |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc32ParameterSet.cs | Introduces core CRC-32 parameter set type (properties, Create, finalization, byte-order write). |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc32ParameterSet.Table.cs | Implements table-based CRC-32 update for reflected and forward modes. |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc32ParameterSet.WellKnown.cs | Adds well-known CRC-32 and CRC-32C parameter sets, including intrinsic-based CRC-32C path. |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc64ParameterSet.cs | Introduces core CRC-64 parameter set type (properties, Create, finalization, byte-order write). |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc64ParameterSet.Table.cs | Implements table-based CRC-64 update for reflected and forward modes. |
| src/libraries/System.IO.Hashing/src/System/IO/Hashing/Crc64ParameterSet.WellKnown.cs | Adds well-known CRC-64 (ECMA-182) and CRC-64/NVME parameter sets. |
| src/libraries/System.IO.Hashing/src/System.IO.Hashing.csproj | Adds new parameter set sources to the build. |
| src/libraries/System.IO.Hashing/ref/System.IO.Hashing.cs | Updates the public reference surface for new parameter set types and overloads. |
| src/libraries/System.IO.Hashing/tests/System.IO.Hashing.Tests.csproj | Includes new test sources for parameterized CRC testing. |
| src/libraries/System.IO.Hashing/tests/Crc32Tests.cs | Adds argument-null validation tests for new CRC-32 parameterSet overloads/ctor. |
| src/libraries/System.IO.Hashing/tests/Crc32ParameterSetTests.cs | Adds known-answer tests for several CRC-32 parameter sets. |
| src/libraries/System.IO.Hashing/tests/Crc32Tests_Parameterized.cs | Adds parameterized driver-based test framework for CRC-32 parameter sets. |
| src/libraries/System.IO.Hashing/tests/Crc32Tests_ParameterSet_Crc32.cs | Adds CRC-32 parameter set driver + singleton/value tests. |
| src/libraries/System.IO.Hashing/tests/Crc32Tests_ParameterSet_Crc32C.cs | Adds CRC-32C parameter set driver + singleton/value tests. |
| src/libraries/System.IO.Hashing/tests/Crc32Tests_ParameterSet_Custom.cs | Adds custom CRC-32 parameter set drivers using Create(...). |
| src/libraries/System.IO.Hashing/tests/Crc64ParameterSetTests.cs | Adds known-answer tests for several CRC-64 parameter sets. |
| src/libraries/System.IO.Hashing/tests/Crc64Tests_Parameterized.cs | Adds parameterized driver-based test framework for CRC-64 parameter sets. |
| src/libraries/System.IO.Hashing/tests/Crc64Tests_Parameterized_Crc64.cs | Adds CRC-64 parameter set driver + singleton/value tests. |
| src/libraries/System.IO.Hashing/tests/Crc64Tests_Parameterized_Custom.cs | Adds a custom CRC-64 parameter set driver using Create(...). |
| src/libraries/System.IO.Hashing/tests/Crc64Tests_Parameterized_Nvme.cs | Adds CRC-64/NVME parameter set driver + singleton/value tests. |
    [Theory]
    [MemberData(nameof(TestCases))]
    public void VerifyHashToUInt32(TestCase testCase)
    {
        var alg = new Crc64(s_parameterSet);
        alg.Append(testCase.Input);
        AssertEqualHashNumber(testCase.OutputHex, alg.GetCurrentHashAsUInt64(), littleEndian: s_parameterSet.ReflectValues);
        AssertEqualHashNumber(testCase.OutputHex, Crc64.HashToUInt64(s_parameterSet, testCase.Input), littleEndian: s_parameterSet.ReflectValues);
    }

Test name VerifyHashToUInt32 is misleading here: it validates GetCurrentHashAsUInt64() and Crc64.HashToUInt64(...). Rename the test method to VerifyHashToUInt64 to match what it's asserting.

    public Crc64(Crc64ParameterSet parameterSet)
        : base(Size)
    {
        ArgumentNullException.ThrowIfNull(parameterSet);

        ParameterSet = parameterSet;
        _crc = parameterSet.InitialValue;
    }

New APIs add ArgumentNullException.ThrowIfNull(parameterSet) in the Crc64(Crc64ParameterSet) ctor and parameterized Hash/TryHash/HashToUInt64 overloads, but there are no unit tests asserting these null checks (unlike Crc32Tests.ParameterSetIsRequired). Add coverage for Crc64 to ensure the exceptions (and bytesWritten behavior for TryHash) are validated.

    private static Crc32ParameterSet MakeCrc32CParameterSet()
    {
    #if NET
        if (System.Runtime.Intrinsics.X86.Sse.IsSupported || System.Runtime.Intrinsics.Arm.Crc32.IsSupported)

MakeCrc32CParameterSet checks X86.Sse.IsSupported, but the SSE CRC32C instruction requires SSE4.2 (X86.Sse42.IsSupported). On x86 machines with SSE but without SSE4.2, this will return Crc32CParameterSet, then UpdateIntrinsic falls into the ARM path and will assert/call unsupported intrinsics. Change the capability check to use Sse42.IsSupported (and keep the ARM check as-is).

    - if (System.Runtime.Intrinsics.X86.Sse.IsSupported || System.Runtime.Intrinsics.Arm.Crc32.IsSupported)
    + if (System.Runtime.Intrinsics.X86.Sse42.IsSupported || System.Runtime.Intrinsics.Arm.Crc32.IsSupported)
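
To make the dispatch concern concrete, here is a minimal sketch of a CRC-32C register update that guards each intrinsic with its own capability check and otherwise falls back to software. `Crc32CSketch`, `Update`, and `UpdateSoftware` are hypothetical names for illustration and are not the PR's `UpdateIntrinsic`; the sketch processes one byte at a time for clarity, not speed.

```csharp
using System;
using System.Runtime.Intrinsics.X86;
using ArmCrc32 = System.Runtime.Intrinsics.Arm.Crc32;

// Hypothetical helper for illustration; not the code under review.
public static class Crc32CSketch
{
    // Updates a raw CRC-32C register. The caller applies the initial value and final XOR.
    public static uint Update(uint crc, ReadOnlySpan<byte> data)
    {
        if (Sse42.IsSupported)
        {
            // The x86 CRC32 instruction is part of SSE4.2, so Sse42 is the flag to test.
            foreach (byte b in data)
                crc = Sse42.Crc32(crc, b);
            return crc;
        }

        if (ArmCrc32.IsSupported)
        {
            // The byte overload lives on Arm.Crc32 itself, not on Arm.Crc32.Arm64.
            foreach (byte b in data)
                crc = ArmCrc32.ComputeCrc32C(crc, b);
            return crc;
        }

        return UpdateSoftware(crc, data);
    }

    // Bitwise software fallback using the reflected CRC-32C polynomial.
    public static uint UpdateSoftware(uint crc, ReadOnlySpan<byte> data)
    {
        const uint ReflectedPolynomial = 0x82F63B78; // bit-reversed 0x1EDC6F41

        foreach (byte b in data)
        {
            crc ^= b;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ ReflectedPolynomial : crc >> 1;
        }

        return crc;
    }
}
```

Because each branch is entered only when its own `IsSupported` flag is true, a machine with SSE but without SSE4.2 ends up in the software path instead of reaching an unsupported intrinsic.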

    foreach (byte value in remainingBytes)
    {
        crc = System.Runtime.Intrinsics.Arm.Crc32.Arm64.ComputeCrc32C(crc, value);

In the ARM fallback, the per-byte loop calls Arm.Crc32.Arm64.ComputeCrc32C(crc, value) where value is a byte. This ends up using the 64-bit (8-byte) CRC instruction with an implicit conversion to ulong, which produces an incorrect CRC (it effectively processes 8 bytes, not 1). Use Arm.Crc32.ComputeCrc32C(crc, value) for the remaining bytes.

    - crc = System.Runtime.Intrinsics.Arm.Crc32.Arm64.ComputeCrc32C(crc, value);
    + crc = System.Runtime.Intrinsics.Arm.Crc32.ComputeCrc32C(crc, value);
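
The effect of the widening can be reproduced without ARM hardware. The sketch below is a software model, assuming the 64-bit instruction consumes the zero-extended value as eight little-endian bytes; `WideningDemo`, `UpdateCrc32C`, and `Compare` are hypothetical names used only for this illustration.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical software model for illustration; it does not call the ARM intrinsics.
public static class WideningDemo
{
    // Bitwise CRC-32C register update using the reflected polynomial.
    public static uint UpdateCrc32C(uint crc, ReadOnlySpan<byte> data)
    {
        const uint ReflectedPolynomial = 0x82F63B78; // bit-reversed 0x1EDC6F41

        foreach (byte b in data)
        {
            crc ^= b;
            for (int bit = 0; bit < 8; bit++)
                crc = (crc & 1) != 0 ? (crc >> 1) ^ ReflectedPolynomial : crc >> 1;
        }

        return crc;
    }

    // Returns the register after consuming one byte, and after consuming the same
    // byte zero-extended to a ulong (eight bytes, little-endian).
    public static (uint OneByte, uint EightBytes) Compare(uint crc, byte value)
    {
        Span<byte> widened = stackalloc byte[8];
        BinaryPrimitives.WriteUInt64LittleEndian(widened, value); // byte converts implicitly to ulong

        uint oneByte = UpdateCrc32C(crc, new[] { value });
        uint eightBytes = UpdateCrc32C(crc, widened);
        return (oneByte, eightBytes);
    }
}
```

For a typical input such as `Compare(0xFFFFFFFF, (byte)'a')`, the two results differ, because the widened form feeds seven extra zero bytes through the register. That is the discrepancy the suggested change avoids by calling the byte overload.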

    /// <summary>Computes the CRC-32 hash of the provided data, using the ITU-T V.42 / IEEE 802.3 parameters.</summary>
    /// <param name="parameterSet">The parameters to use for the CRC computation.</param>
    /// <param name="source">The data to hash.</param>
    /// <returns>The computed CRC-32 hash.</returns>
    /// <exception cref="ArgumentNullException">
    /// <paramref name="parameterSet"/> is <see langword="null"/>.
    /// </exception>

The XML doc for HashToUInt32(Crc32ParameterSet parameterSet, ReadOnlySpan<byte> source) says it uses the ITU-T V.42 / IEEE 802.3 parameters, but this overload computes using the provided parameterSet. Update the <summary> to describe that it uses the specified parameter set (the default-parameter summary above is already correct).

Add parameterized CRC-32 and CRC-64. Currently, they are implemented as a byte-by-byte table lookup, except for the `Crc32` and `Crc64` parameter sets, which defer to the already-optimized implementations, and CRC-32/C, which uses x86 and ARM intrinsics.
Generalized vectorization will be done as a followup.
Fixes #123164.