Add comprehensive unit and GPU parity tests for core ops #1
Merged
Closes the largest coverage gaps identified in the recent analysis:
- src/ops/{binary,unary,reduce,matmul}.rs had no direct unit tests
- src/nn/layers/{linear,dropout,batchnorm,relu,sigmoid,tanh}.rs had no
isolated tests; bugs would only surface through Sequential integration
- there was no shared CPU/GPU parity harness; every device comparison was
hand-written, so most ops had never been verified to match across devices
Adds 8 new test modules in src/lib.rs (binary_ops_tests, unary_ops_tests,
reduce_ops_tests, matmul_tests, linear_tests, activation_tests, dropout_tests,
batchnorm_tests) covering forward correctness, gradient correctness via
check_gradients_simple, BatchNorm running-stat updates, train/eval mode
switching, and Dropout's inverted-scaling invariant. Lib test count goes
from 482 to 519 (+37).
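The gradient checks mentioned above rest on finite differences. The following is a minimal, standalone sketch of that idea, not the library's `check_gradients_simple`: the names `numeric_grad` and `grads_match`, the use of plain `f64` slices instead of tensors, and the tolerance formula are all assumptions made for illustration.

```rust
/// Central-difference estimate of the gradient of a scalar function `f` at `x`.
/// Hypothetical helper for illustration; `eps` is the perturbation size.
pub fn numeric_grad<F>(f: F, x: &[f64], eps: f64) -> Vec<f64>
where
    F: Fn(&[f64]) -> f64,
{
    let mut probe = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let orig = probe[i];
        probe[i] = orig + eps;
        let plus = f(&probe);
        probe[i] = orig - eps;
        let minus = f(&probe);
        probe[i] = orig; // restore the coordinate before moving on
        grad.push((plus - minus) / (2.0 * eps));
    }
    grad
}

/// Returns true when every analytic entry matches the numeric estimate
/// within `tol`, scaled by the magnitude of the larger of the two values.
pub fn grads_match(analytic: &[f64], numeric: &[f64], tol: f64) -> bool {
    analytic.len() == numeric.len()
        && analytic
            .iter()
            .zip(numeric)
            .all(|(a, n)| (a - n).abs() <= tol * (1.0 + a.abs().max(n.abs())))
}
```

A test built on this sketch would compute the analytic gradient from the backward pass, call `numeric_grad` on the same forward function, and assert `grads_match`. For example, the gradient of a sum of squares should match `2 * x` element-wise.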
Adds a new shared parity helper at tests/common/mod.rs plus four table-driven
integration test files (tests/parity_{unary,binary,reduce,matmul}.rs) that
run the same op on CPU and GPU copies and compare results within tolerance.
The helpers avoid GPU broadcasting, which is unsupported (matched shapes only),
use domain-restricted inputs for log/sqrt/recip/exp, and exit early and cleanly
when no GPU is available.
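A table-driven parity check of the kind described above can be sketched as follows. This is a standalone illustration under stated assumptions, not the contents of `tests/common/mod.rs`: `ParityCase`, `within_tol`, and `run_parity` are hypothetical names, plain `f32` slices stand in for tensors, the "candidate" function stands in for the GPU path, and the `device_available` flag stands in for a real device probe.

```rust
/// One row of the parity table: an op name, a reference ("CPU")
/// implementation, and a candidate standing in for the GPU path.
pub struct ParityCase {
    pub name: &'static str,
    pub reference: fn(&[f32]) -> Vec<f32>,
    pub candidate: fn(&[f32]) -> Vec<f32>,
}

/// Element-wise comparison with an absolute tolerance. A length mismatch
/// or a NaN on either side counts as a failure.
pub fn within_tol(a: &[f32], b: &[f32], tol: f32) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| (x - y).abs() <= tol)
}

/// Runs every case on the same input and returns the names of the ops whose
/// outputs disagree. Returns `None` (skipped) when no device is available.
pub fn run_parity(
    cases: &[ParityCase],
    input: &[f32],
    tol: f32,
    device_available: bool,
) -> Option<Vec<&'static str>> {
    if !device_available {
        return None; // early exit, mirroring a skip on CPU-only machines
    }
    let mut failures = Vec::new();
    for case in cases {
        let want = (case.reference)(input);
        let got = (case.candidate)(input);
        if !within_tol(&want, &got, tol) {
            failures.push(case.name);
        }
    }
    Some(failures)
}
```

Collecting failures by name, rather than panicking on the first mismatch, lets one test run report every diverging op. Inputs for ops such as `log` or `sqrt` would need to be kept positive so that both sides are compared on valid values.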
https://claude.ai/code/session_01BPobYZJo4z4VbKFpVFRNpj
Summary
This PR significantly expands test coverage for the Volta tensor library by adding ~1000 lines of direct unit tests for foundational operations and neural network layers, plus a new GPU parity test suite to validate CPU/GPU numerical equivalence.
Key Changes
Unit Tests (in `src/lib.rs`)
- `binary_ops_tests`: Forward value/shape and gradient checks for `add`, `sub`, `elem_mul`, `div`, `max_elem`, `modulo`, `cmplt`, and chained operations
- `unary_ops_tests`: Forward correctness and gradient validation for 15+ operations including `neg`, `recip`, `sqrt`, `exp`, `log`, `sin`, `cos`, `tanh`, `sigmoid`, `relu`, `erf`, and exponential/logarithmic variants
- `reduce_ops_tests`: Tests for `sum`, `mean`, `max_reduce` and their axis-aware variants (`sum_dim`, `mean_dim`, `max_dim`) with shape and gradient correctness
- `matmul_tests`: Coverage for 2D×2D, 2D×1D, 1D×2D, 1D×1D (dot product), and batched matmul with forward value and gradient checks; includes `transpose` validation
- `linear_tests`: Weight/bias shape validation, forward computation with known weights, and gradient checks
- `activation_tests`: Forward and backward parity for `ReLU`, `Sigmoid`, and `Tanh` modules
- `dropout_tests`: Identity behavior at p=0, zeroing at p=1, eval-mode passthrough, shape preservation, and statistical mean preservation
- `batchnorm_tests`: Normalization correctness, running statistics updates, eval-mode behavior, and gradient flow for both `BatchNorm1d` and `BatchNorm2d`
GPU Parity Test Suite (new files)
- `tests/common/mod.rs`: Shared helpers for CPU/GPU parity testing with configurable tolerances and early exit when GPU is unavailable
- `tests/parity_unary.rs`: CPU/GPU forward+backward parity for 15+ unary operations
- `tests/parity_binary.rs`: CPU/GPU parity for binary ops (add, sub, mul, div, max_elem, modulo, cmplt)
- `tests/parity_reduce.rs`: CPU/GPU forward parity for reductions (sum, mean, max with axis variants)
- `tests/parity_matmul.rs`: CPU/GPU parity across the matmul shape matrix (2D×2D, 2D×1D, 1D×2D, 1D×1D, batched)
Implementation Details
- Gradient tests use `RawTensor::check_gradients_simple` for finite-difference gradient validation, ensuring backward-pass correctness
- GPU parity tests exit early when no GPU is present (`skip_if_no_gpu()` guard), allowing compilation and CI to pass on CPU-only environments
- Tests follow the `#[cfg(test)] mod` convention and use `approx::assert_relative_eq` for floating-point comparisons where appropriate
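The Dropout properties listed for `dropout_tests` (identity at p=0, zeroing at p=1, eval-mode passthrough, mean preservation) all follow from inverted scaling, where surviving elements are multiplied by 1/(1-p). The sketch below illustrates that invariant on plain slices. It is not the library's `Dropout` layer: the `dropout` function, its signature, and the small `XorShift` generator (used only to avoid external crates) are assumptions for illustration.

```rust
/// Tiny xorshift64* generator so the sketch needs no external crates.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        XorShift(seed.max(1)) // the state must never be zero
    }

    /// Uniform sample in [0, 1), built from the top 24 bits of the output.
    pub fn next_f32(&mut self) -> f32 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D) >> 40;
        bits as f32 / (1u32 << 24) as f32
    }
}

/// Inverted dropout: in training mode each element is zeroed with
/// probability `p` and survivors are scaled by 1/(1-p), so the expected
/// value of the output equals the input. Eval mode is a passthrough.
pub fn dropout(x: &[f32], p: f32, training: bool, rng: &mut XorShift) -> Vec<f32> {
    if !training || p <= 0.0 {
        return x.to_vec();
    }
    if p >= 1.0 {
        return vec![0.0; x.len()];
    }
    let scale = 1.0 / (1.0 - p);
    x.iter()
        .map(|&v| if rng.next_f32() < p { 0.0 } else { v * scale })
        .collect()
}
```

A statistical test on this sketch would feed a large constant input through `dropout` in training mode and assert that the output mean stays close to the input mean, with a tolerance loose enough to absorb sampling noise.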