Add GraND and GREAT batch selection strategies#32

Merged
jainarchita merged 7 commits into main from batching_methods on Feb 26, 2026

Conversation

@jainarchita
Collaborator

Summary

  • Implement GraND (Gradient Normed Distance) batch selection strategy that selects training samples based on gradient norm magnitudes, combining exploration (random sampling) with exploitation (high gradient norm samples)
  • Implement GREAT (GREedy Approximation Taylor) batch selection strategy that greedily selects samples with maximum orthogonal gradient contribution to maximize expected loss reduction
  • Unify the batch sampler interface so all strategies (including existing ones) use a consistent **kwargs pattern for passing model, loss function, device, and loss history
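To make the exploration/exploitation split in GraND concrete, here is a minimal sketch of that style of selection. This is illustrative only: the function name `select_grand_batch`, the `explore_frac` parameter, and the assumption that per-sample gradient norms are already computed over the candidate pool are not taken from the PR's actual code.

```python
import numpy as np

def select_grand_batch(grad_norms, batch_size, explore_frac=0.2, rng=None):
    """Pick a batch mixing random exploration with high-gradient-norm
    exploitation. `grad_norms` is a 1-D array of per-sample gradient
    norms over the candidate pool."""
    rng = rng or np.random.default_rng()
    n = len(grad_norms)
    n_explore = int(batch_size * explore_frac)
    n_exploit = batch_size - n_explore

    # Exploitation: take the samples with the largest gradient norms.
    exploit_idx = np.argsort(grad_norms)[-n_exploit:]

    # Exploration: uniform random draw from the remaining candidates.
    remaining = np.setdiff1d(np.arange(n), exploit_idx)
    explore_idx = rng.choice(remaining, size=n_explore, replace=False)

    return np.concatenate([exploit_idx, explore_idx])
```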

Changes

  • New files: trainer/batching/vision_batching/grand.py, trainer/batching/vision_batching/great.py
  • Modified: trainer/pipelines/vision/vision.py: unified batch sampler call interface, added GPU support and device info logging
  • Modified: trainer/constants.py: increased default runs to 5, added automatic CUDA device detection
  • Modified: trainer/constants_batch_strategy.py: registered GraND and GREAT strategies
  • Both strategies use a candidate pool approach (similar to MILO/CORESET) for computational efficiency, with a reduced pool size (500) for GREAT given its higher per-sample cost
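The greedy orthogonal-contribution step in GREAT can be sketched as below, assuming per-sample gradients for the candidate pool are stacked into a matrix. The function name `select_great_batch` and the Gram-Schmidt formulation are illustrative, not the PR's actual implementation.

```python
import numpy as np

def select_great_batch(grads, batch_size):
    """Greedily pick samples whose gradient has the largest component
    orthogonal to the span of already-selected gradients.
    `grads` is an (n_candidates, d) array of per-sample gradients."""
    n, d = grads.shape
    residual = grads.astype(float).copy()  # parts orthogonal to picks so far
    selected = []
    for _ in range(min(batch_size, n)):
        norms = np.linalg.norm(residual, axis=1)
        norms[selected] = -np.inf  # never re-pick a sample
        i = int(np.argmax(norms))
        selected.append(i)
        v = residual[i]
        v_norm = np.linalg.norm(v)
        if v_norm < 1e-12:
            break  # remaining gradients lie in the selected span
        u = v / v_norm
        # Gram-Schmidt step: remove the chosen direction from all residuals.
        residual -= np.outer(residual @ u, u)
    return selected
```

Each iteration costs a full pass over the pool, which is why a smaller candidate pool (500) is used for GREAT than for the other strategies.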

@jainarchita jainarchita merged commit 9f2a5f3 into main Feb 26, 2026