Add comprehensive unit and GPU parity tests for core ops by rlarson20 · Pull Request #1 · rlarson20/Volta

rlarson20 · 2026-04-18T23:55:39Z

Summary

This PR significantly expands test coverage for the Volta tensor library by adding ~1000 lines of direct unit tests for foundational operations and neural network layers, plus a new GPU parity test suite to validate CPU/GPU numerical equivalence.

Key Changes

Unit Tests (in `src/lib.rs`)

- Binary ops tests (`binary_ops_tests`): Forward value/shape and gradient checks for add, sub, elem_mul, div, max_elem, modulo, cmplt, and chained operations
- Unary ops tests (`unary_ops_tests`): Forward correctness and gradient validation for 15+ operations including neg, recip, sqrt, exp, log, sin, cos, tanh, sigmoid, relu, erf, and exponential/logarithmic variants
- Reduction ops tests (`reduce_ops_tests`): Tests for sum, mean, max_reduce and their axis-aware variants (sum_dim, mean_dim, max_dim) with shape and gradient correctness
- Matmul tests (`matmul_tests`): Coverage for 2D×2D, 2D×1D, 1D×2D, 1D×1D (dot product), and batched matmul with forward value and gradient checks; includes transpose validation
- Linear layer tests (`linear_tests`): Weight/bias shape validation, forward computation with known weights, and gradient checks
- Activation layer tests (`activation_tests`): Forward and backward parity for ReLU, Sigmoid, and Tanh modules
- Dropout tests (`dropout_tests`): Identity behavior at p=0, zeroing at p=1, eval-mode passthrough, shape preservation, and statistical mean preservation
- BatchNorm tests (`batchnorm_tests`): Normalization correctness, running statistics updates, eval-mode behavior, and gradient flow for both BatchNorm1d and BatchNorm2d
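
To make the BatchNorm checks above concrete, here is a minimal standalone sketch of the properties such a test would assert: zero mean and unit variance after normalization, plus a running-statistic update. It does not use Volta's API. `batch_norm_1d` and `update_running` are hypothetical names, and both the biased variance and the momentum convention (new = (1 - momentum) * old + momentum * batch) are assumptions rather than confirmed Volta behavior.

```rust
/// Normalizes a single feature column over the batch.
/// Assumption: biased variance (divide by n), as is common in training mode.
/// Returns (normalized values, batch mean, batch variance).
fn batch_norm_1d(x: &[f32], eps: f32) -> (Vec<f32>, f32, f32) {
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    let inv_std = 1.0 / (var + eps).sqrt();
    let y: Vec<f32> = x.iter().map(|v| (v - mean) * inv_std).collect();
    (y, mean, var)
}

/// Exponential moving average update for a running statistic.
/// Assumption: PyTorch-style momentum; Volta's convention may differ.
fn update_running(running: f32, batch_stat: f32, momentum: f32) -> f32 {
    (1.0 - momentum) * running + momentum * batch_stat
}
```

A test built on this idea would normalize a known batch, then assert that the output mean is close to 0, the output variance is close to 1, and the running statistic has moved toward the batch statistic by the expected amount.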

GPU Parity Test Suite (new files)

- `tests/common/mod.rs`: Shared helpers for CPU/GPU parity testing with configurable tolerances and early exit when GPU is unavailable
- `tests/parity_unary.rs`: CPU/GPU forward+backward parity for 15+ unary operations
- `tests/parity_binary.rs`: CPU/GPU parity for binary ops (add, sub, mul, div, max_elem, modulo, cmplt)
- `tests/parity_reduce.rs`: CPU/GPU forward parity for reductions (sum, mean, max with axis variants)
- `tests/parity_matmul.rs`: CPU/GPU parity across the matmul shape matrix (2D×2D, 2D×1D, 1D×2D, 1D×1D, batched)
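
The contents of `tests/common/mod.rs` are not shown here, so the following is only a sketch of what a shared parity helper of this kind might look like. `Tolerance`, `check_parity`, `Outcome`, and `run_case` are hypothetical names, the combined absolute/relative tolerance formula is an assumption, and plain `f32` slices stand in for tensors copied back from each device.

```rust
/// Hypothetical tolerance pair; the PR's actual helper types are not shown.
#[derive(Clone, Copy)]
struct Tolerance {
    abs: f32,
    rel: f32,
}

/// Compares CPU and GPU results element by element.
/// Returns a description of the first mismatch, if any.
fn check_parity(op: &str, cpu: &[f32], gpu: &[f32], tol: Tolerance) -> Result<(), String> {
    if cpu.len() != gpu.len() {
        return Err(format!("{op}: length mismatch {} vs {}", cpu.len(), gpu.len()));
    }
    for (i, (&c, &g)) in cpu.iter().zip(gpu).enumerate() {
        let limit = tol.abs + tol.rel * c.abs().max(g.abs());
        // Written as `!(diff <= limit)` so that a NaN on either side is a failure.
        if !((c - g).abs() <= limit) {
            return Err(format!("{op}[{i}]: cpu={c} gpu={g} exceeds tolerance {limit}"));
        }
    }
    Ok(())
}

/// Result of one parity case.
#[derive(Debug, PartialEq)]
enum Outcome {
    Skipped,
    Passed,
    Failed(String),
}

/// Runs the GPU side only when a device is available, otherwise exits early.
fn run_case<F: FnOnce() -> Vec<f32>>(
    gpu_available: bool,
    op: &str,
    cpu: &[f32],
    gpu_run: F,
    tol: Tolerance,
) -> Outcome {
    if !gpu_available {
        return Outcome::Skipped;
    }
    match check_parity(op, cpu, &gpu_run(), tol) {
        Ok(()) => Outcome::Passed,
        Err(msg) => Outcome::Failed(msg),
    }
}
```

Passing the GPU computation as a closure keeps the early exit cheap: on a CPU-only machine nothing device-specific is ever executed, which is what lets the same test file compile and pass in both environments.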

Implementation Details

- All unit tests use `RawTensor::check_gradients_simple` for finite-difference gradient validation, ensuring backward-pass correctness
- GPU parity tests use deterministic input generation with configurable value ranges to avoid degenerate cases (e.g., log of zero, division by zero)
- Tests gracefully skip when GPU is unavailable (`skip_if_no_gpu()` guard), allowing compilation and CI to pass on CPU-only environments
- Tolerances are tuned per operation class (forward: 1e-5, backward: 1e-4 for most ops; matmul: 1e-4/1e-3 due to accumulation)
- Tests follow existing inline `#[cfg(test)] mod` convention and use `approx::assert_relative_eq` for floating-point comparisons where appropriate
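
The idea behind finite-difference gradient validation can be illustrated independently of Volta. The sketch below is not the implementation of `RawTensor::check_gradients_simple`; `numerical_grad` and `max_grad_error` are hypothetical names, and the central-difference scheme, `f64` precision, and step size are assumptions chosen for clarity.

```rust
/// Central-difference estimate of the gradient of a scalar function `f` at `x`:
/// df/dx_i ≈ (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps).
fn numerical_grad<F: Fn(&[f64]) -> f64>(f: F, x: &[f64], eps: f64) -> Vec<f64> {
    let mut probe = x.to_vec();
    let mut grad = Vec::with_capacity(x.len());
    for i in 0..x.len() {
        let orig = probe[i];
        probe[i] = orig + eps;
        let plus = f(&probe);
        probe[i] = orig - eps;
        let minus = f(&probe);
        probe[i] = orig; // restore before perturbing the next coordinate
        grad.push((plus - minus) / (2.0 * eps));
    }
    grad
}

/// Largest absolute difference between an analytic and a numerical gradient.
fn max_grad_error(analytic: &[f64], numeric: &[f64]) -> f64 {
    analytic
        .iter()
        .zip(numeric)
        .map(|(a, n)| (a - n).abs())
        .fold(0.0, f64::max)
}
```

A gradient test then computes the analytic gradient through the backward pass, the numerical one through `numerical_grad`, and asserts that `max_grad_error` stays below a tolerance. In single precision the achievable agreement is much looser than in this `f64` sketch, which is consistent with the backward tolerances quoted above being larger than the forward ones.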

https://claude.ai/code/session_01BPobYZJo4z4VbKFpVFRNpj

Closes the largest coverage gaps identified in the recent analysis:

- src/ops/{binary,unary,reduce,matmul}.rs had no direct unit tests
- src/nn/layers/{linear,dropout,batchnorm,relu,sigmoid,tanh}.rs had no isolated tests; bugs would only surface through Sequential integration
- there was no shared CPU/GPU parity harness; every device comparison was hand-written, so most ops had never been verified to match across devices

Adds 8 new test modules in src/lib.rs (binary_ops_tests, unary_ops_tests, reduce_ops_tests, matmul_tests, linear_tests, activation_tests, dropout_tests, batchnorm_tests) covering forward correctness, gradient correctness via check_gradients_simple, BatchNorm running-stat updates, train/eval mode switching, and Dropout's inverted-scaling invariant. Lib test count goes from 482 to 519 (+37).

Adds a new shared parity helper at tests/common/mod.rs plus four table-driven integration test files (tests/parity_{unary,binary,reduce,matmul}.rs) that run the same op on CPU and GPU copies and compare results within tolerance. The helpers work around the lack of GPU broadcast support (matched shapes only), restrict inputs to the valid domain for log/sqrt/recip/exp, and exit early and cleanly when no GPU is available.

https://claude.ai/code/session_01BPobYZJo4z4VbKFpVFRNpj
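
The inverted-scaling invariant for Dropout mentioned in the commit message can be sketched as follows. This is an illustration under stated assumptions, not Volta's Dropout layer: `Lcg` and `dropout` are hypothetical names, and a small deterministic generator replaces whatever random source the library actually uses so that the example needs no external crates.

```rust
/// Small deterministic linear congruential generator
/// (multiplier and increment are the constants from Knuth's MMIX).
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random f32 in [0, 1).
    fn next_f32(&mut self) -> f32 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // Use the top 24 bits, which are the best-distributed bits of an LCG.
        (self.0 >> 40) as f32 / (1u32 << 24) as f32
    }
}

/// Inverted dropout: in training mode, zero each element with probability `p`
/// and scale the survivors by 1 / (1 - p) so the expected value is unchanged.
fn dropout(x: &[f32], p: f32, training: bool, rng: &mut Lcg) -> Vec<f32> {
    if !training || p == 0.0 {
        return x.to_vec(); // eval mode and p = 0 are both the identity
    }
    if p >= 1.0 {
        return vec![0.0; x.len()]; // everything is dropped
    }
    let scale = 1.0 / (1.0 - p);
    x.iter()
        .map(|&v| if rng.next_f32() < p { 0.0 } else { v * scale })
        .collect()
}
```

A statistical test on this sketch feeds a large constant input through `dropout` and asserts that the output mean stays close to the input mean, alongside the exact checks for p=0, p=1, and eval mode.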

rlarson20 merged commit 1db2570 into main Apr 18, 2026
0 of 4 checks passed

rlarson20 merged 1 commit into main from claude/analyze-test-coverage-gMiYi

