Add comprehensive unit and GPU parity tests for core ops #1

Merged
rlarson20 merged 1 commit into main from claude/analyze-test-coverage-gMiYi on Apr 18, 2026

Conversation

@rlarson20

Owner

Summary

This PR significantly expands test coverage for the Volta tensor library by adding ~1000 lines of direct unit tests for foundational operations and neural network layers, plus a new GPU parity test suite to validate CPU/GPU numerical equivalence.

Key Changes

Unit Tests (in src/lib.rs)

  • Binary ops tests (binary_ops_tests): Forward value/shape and gradient checks for add, sub, elem_mul, div, max_elem, modulo, cmplt, and chained operations
  • Unary ops tests (unary_ops_tests): Forward correctness and gradient validation for 15+ operations including neg, recip, sqrt, exp, log, sin, cos, tanh, sigmoid, relu, erf, and exponential/logarithmic variants
  • Reduction ops tests (reduce_ops_tests): Tests for sum, mean, max_reduce and their axis-aware variants (sum_dim, mean_dim, max_dim) with shape and gradient correctness
  • Matmul tests (matmul_tests): Coverage for 2D×2D, 2D×1D, 1D×2D, 1D×1D (dot product), and batched matmul with forward value and gradient checks; includes transpose validation
  • Linear layer tests (linear_tests): Weight/bias shape validation, forward computation with known weights, and gradient checks
  • Activation layer tests (activation_tests): Forward and backward parity for ReLU, Sigmoid, and Tanh modules
  • Dropout tests (dropout_tests): Identity behavior at p=0, zeroing at p=1, eval-mode passthrough, shape preservation, and statistical mean preservation
  • BatchNorm tests (batchnorm_tests): Normalization correctness, running statistics updates, eval-mode behavior, and gradient flow for both BatchNorm1d and BatchNorm2d
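
The Dropout properties listed above (identity at p=0, zeroing at p=1, eval-mode passthrough, shape preservation, and mean preservation) can be illustrated with a minimal standalone sketch. It does not use Volta's Dropout module: `Lcg`, `dropout`, and `mean` are hypothetical names introduced here, and the generator is a plain 64-bit LCG chosen only so the example needs no external crates.

```rust
/// Tiny deterministic generator (64-bit LCG) so the sketch needs no external crates.
/// Hypothetical helper, not part of Volta.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Keep the top 24 bits so the value is exactly representable in f32.
        ((self.0 >> 40) as f32) / ((1u64 << 24) as f32)
    }
}

/// Inverted dropout: zero each element with probability `p` and scale the
/// survivors by 1 / (1 - p). In eval mode the input passes through unchanged.
fn dropout(input: &[f32], p: f32, training: bool, rng: &mut Lcg) -> Vec<f32> {
    if !training || p == 0.0 {
        return input.to_vec();
    }
    if p >= 1.0 {
        return vec![0.0; input.len()];
    }
    let scale = 1.0 / (1.0 - p);
    input
        .iter()
        .map(|&x| if rng.next_f32() < p { 0.0 } else { x * scale })
        .collect()
}

/// Arithmetic mean of a slice.
fn mean(xs: &[f32]) -> f32 {
    xs.iter().sum::<f32>() / xs.len() as f32
}
```

Because survivors are scaled by 1 / (1 - p), the expected value of each output element equals the input, so a statistical test can compare `mean(&output)` with `mean(&input)` over a large tensor using a loose tolerance.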

GPU Parity Test Suite (new files)

  • tests/common/mod.rs: Shared helpers for CPU/GPU parity testing with configurable tolerances and early exit when GPU is unavailable
  • tests/parity_unary.rs: CPU/GPU forward+backward parity for 15+ unary operations
  • tests/parity_binary.rs: CPU/GPU parity for binary ops (add, sub, mul, div, max_elem, modulo, cmplt)
  • tests/parity_reduce.rs: CPU/GPU forward parity for reductions (sum, mean, max with axis variants)
  • tests/parity_matmul.rs: CPU/GPU parity across the matmul shape matrix (2D×2D, 2D×1D, 1D×2D, 1D×1D, batched)
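
A table-driven parity harness of the kind these files describe might look roughly like the sketch below. It is not the code in tests/common/mod.rs: two scalar formulations of the same function stand in for the CPU and GPU paths, and `Case`, `close`, `run_parity`, and the `device_available` flag are hypothetical names used only for illustration.

```rust
/// One row of the parity table: an op name plus the two implementations to compare.
/// Hypothetical type; the two function pointers stand in for CPU and GPU paths.
struct Case {
    name: &'static str,
    reference: fn(f32) -> f32,
    candidate: fn(f32) -> f32,
}

/// Mixed absolute/relative closeness check (a design choice of this sketch).
fn close(a: f32, b: f32, tol: f32) -> bool {
    (a - b).abs() <= tol * (1.0 + a.abs().max(b.abs()))
}

/// Runs every case over every input. Returns Ok(0) when the device is
/// unavailable (the skip path), Ok(number of cases) on success, or the first
/// mismatch as an error message.
fn run_parity(
    cases: &[Case],
    inputs: &[f32],
    tol: f32,
    device_available: bool,
) -> Result<usize, String> {
    if !device_available {
        return Ok(0);
    }
    for case in cases {
        for &x in inputs {
            let r = (case.reference)(x);
            let c = (case.candidate)(x);
            if !close(r, c, tol) {
                return Err(format!(
                    "{}: x={}, reference={}, candidate={}",
                    case.name, x, r, c
                ));
            }
        }
    }
    Ok(cases.len())
}

fn sigmoid_ref(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Same function written via tanh, standing in for a second backend.
fn sigmoid_alt(x: f32) -> f32 {
    0.5 * ((0.5 * x).tanh() + 1.0)
}

fn tanh_ref(x: f32) -> f32 {
    x.tanh()
}

/// Same function written via exp, standing in for a second backend.
fn tanh_alt(x: f32) -> f32 {
    let e = (2.0 * x).exp();
    (e - 1.0) / (e + 1.0)
}

/// The table itself: adding an op means adding one row.
fn default_cases() -> Vec<Case> {
    vec![
        Case { name: "sigmoid", reference: sigmoid_ref, candidate: sigmoid_alt },
        Case { name: "tanh", reference: tanh_ref, candidate: tanh_alt },
    ]
}
```

Keeping the ops in a table means the comparison and skip logic are written once, which is the main advantage over hand-written per-op device comparisons.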

Implementation Details

  • All unit tests use RawTensor::check_gradients_simple for finite-difference gradient validation, ensuring backward-pass correctness
  • GPU parity tests use deterministic input generation with configurable value ranges to avoid degenerate cases (e.g., log of zero, division by zero)
  • Tests gracefully skip when GPU is unavailable (skip_if_no_gpu() guard), allowing compilation and CI to pass on CPU-only environments
  • Tolerances are tuned per operation class: 1e-5 (forward) and 1e-4 (backward) for most ops, loosened to 1e-4 (forward) and 1e-3 (backward) for matmul because of floating-point accumulation error
  • Tests follow the existing inline #[cfg(test)] mod convention and use approx::assert_relative_eq for floating-point comparisons where appropriate
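
For readers unfamiliar with finite-difference gradient validation, the idea can be sketched in plain Rust as follows. This is not the implementation of RawTensor::check_gradients_simple, whose internals are not shown in this PR; `numeric_grad`, `max_rel_error`, `loss`, and `loss_grad` are hypothetical names, and the central-difference scheme and f64 precision are assumptions of the sketch.

```rust
/// Central-difference estimate of the gradient of `f` at `x`:
/// df/dx_i is approximated by (f(x + eps * e_i) - f(x - eps * e_i)) / (2 * eps).
fn numeric_grad<F: Fn(&[f64]) -> f64>(f: F, x: &[f64], eps: f64) -> Vec<f64> {
    let mut probe = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        probe[i] = x[i] + eps;
        let up = f(&probe);
        probe[i] = x[i] - eps;
        let down = f(&probe);
        probe[i] = x[i]; // restore before perturbing the next coordinate
        grad.push((up - down) / (2.0 * eps));
    }
    grad
}

/// Largest elementwise error between two gradients, scaled so that the check
/// behaves like an absolute test near zero and a relative test elsewhere.
fn max_rel_error(analytic: &[f64], numeric: &[f64]) -> f64 {
    analytic
        .iter()
        .zip(numeric)
        .map(|(a, n)| (a - n).abs() / (1.0 + a.abs().max(n.abs())))
        .fold(0.0, f64::max)
}

/// Example scalar loss: sum over i of x_i^2 * sin(x_i).
fn loss(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v * v.sin()).sum()
}

/// Hand-derived gradient of `loss`: 2 * x * sin(x) + x^2 * cos(x) per element.
fn loss_grad(x: &[f64]) -> Vec<f64> {
    x.iter()
        .map(|v| 2.0 * v * v.sin() + v * v * v.cos())
        .collect()
}
```

A backward pass is accepted when `max_rel_error(&loss_grad(&x), &numeric_grad(loss, &x, 1e-5))` falls below the chosen tolerance; a gradient with a missing term fails the same check by several orders of magnitude.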

https://claude.ai/code/session_01BPobYZJo4z4VbKFpVFRNpj

Closes the largest coverage gaps identified in the recent analysis:
- src/ops/{binary,unary,reduce,matmul}.rs had no direct unit tests
- src/nn/layers/{linear,dropout,batchnorm,relu,sigmoid,tanh}.rs had no
  isolated tests; bugs would only surface through Sequential integration
- there was no shared CPU/GPU parity harness; every device comparison was
  hand-written, so most ops had never been verified to match across devices

Adds 8 new test modules in src/lib.rs (binary_ops_tests, unary_ops_tests,
reduce_ops_tests, matmul_tests, linear_tests, activation_tests, dropout_tests,
batchnorm_tests) covering forward correctness, gradient correctness via
check_gradients_simple, BatchNorm running-stat updates, train/eval mode
switching, and Dropout's inverted-scaling invariant. Lib test count goes
from 482 to 519 (+37).
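
The BatchNorm properties mentioned above (normalization, running-stat updates,
and train/eval switching) can be illustrated with a single-feature sketch. It is
not Volta's BatchNorm1d: `BatchNormFeature` is a hypothetical type, and the
momentum convention (new = (1 - momentum) * old + momentum * batch), the default
values, and the use of biased variance for both normalization and running
statistics are assumptions made for this example. Affine scale and shift are
omitted.

```rust
/// Batch normalization over one feature, without affine parameters.
/// Hypothetical type for illustration; conventions are assumptions.
struct BatchNormFeature {
    running_mean: f32,
    running_var: f32,
    momentum: f32,
    eps: f32,
}

impl BatchNormFeature {
    fn new() -> Self {
        Self { running_mean: 0.0, running_var: 1.0, momentum: 0.1, eps: 1e-5 }
    }

    /// Training mode normalizes with batch statistics and updates the running
    /// statistics; eval mode normalizes with the stored running statistics
    /// and leaves them untouched.
    fn forward(&mut self, batch: &[f32], training: bool) -> Vec<f32> {
        let (mean, var) = if training {
            let n = batch.len() as f32;
            let mean = batch.iter().sum::<f32>() / n;
            // Biased (population) variance of the batch.
            let var = batch.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
            self.running_mean =
                (1.0 - self.momentum) * self.running_mean + self.momentum * mean;
            self.running_var =
                (1.0 - self.momentum) * self.running_var + self.momentum * var;
            (mean, var)
        } else {
            (self.running_mean, self.running_var)
        };
        let inv_std = 1.0 / (var + self.eps).sqrt();
        batch.iter().map(|x| (x - mean) * inv_std).collect()
    }
}
```

A test in this style checks that a training-mode output has mean close to 0 and
variance close to 1, that the running statistics moved by the expected amount,
and that an eval-mode call does not change them.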

Adds a new shared parity helper at tests/common/mod.rs plus four table-driven
integration test files (tests/parity_{unary,binary,reduce,matmul}.rs) that
run the same op on CPU and GPU copies and compare results within tolerance.
The helpers work around the lack of GPU broadcasting support (inputs use
matched shapes only), restrict inputs to valid domains for log/sqrt/recip/exp,
and exit early and cleanly when no GPU is available.
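
Deterministic, domain-restricted input generation could be sketched as below.
The helper names `linspace` and `input_range` and the specific ranges are
hypothetical and chosen for illustration; they are not the values used in
tests/common/mod.rs.

```rust
/// Deterministic inputs evenly spaced from `lo` to `hi`.
/// The same arguments always produce the same vector, so failures reproduce.
fn linspace(lo: f32, hi: f32, n: usize) -> Vec<f32> {
    assert!(n >= 2 && lo < hi);
    (0..n)
        .map(|i| lo + (hi - lo) * (i as f32) / ((n - 1) as f32))
        .collect()
}

/// Hypothetical per-op input ranges that keep each op away from singular
/// points (log or recip at zero) and from overflow (exp of large values).
fn input_range(op: &str) -> (f32, f32) {
    match op {
        "log" | "sqrt" => (0.1, 4.0),
        "recip" => (0.5, 4.0),
        "exp" => (-4.0, 4.0),
        _ => (-2.0, 2.0),
    }
}
```

Restricting the domain up front means a parity failure reflects a genuine
difference between devices and not a NaN or infinity produced by a degenerate
input.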

@rlarson20 rlarson20 merged commit 1db2570 into main Apr 18, 2026
0 of 4 checks passed
2 participants