Add quantile_uni_extrapolate preprocessor preset #971
Port of internal TabPFN-private #599. Linearly extrapolates quantile-transformed values past [0, 1] for inputs outside the training range instead of clamping at the boundary, to better preserve out-of-distribution information. Wired through both the CPU AdaptiveQuantileTransformer and the GPU TorchQuantileTransformer paths so the preset is GPU-eligible. CPU/GPU consistency tests parametrise the new preset across f16/f32/f64. Needed so v3 OOD-preprocessing checkpoints (which use quantile_uni_extrapolate in the regressor preproc recipe) load with the public package. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review
This pull request adds a new quantile_uni_extrapolate preset to the preprocessing pipeline, enabling linear extrapolation for values outside the training range in both AdaptiveQuantileTransformer (CPU) and TorchQuantileTransformer (GPU). This is achieved by adding an extrapolate_ratio parameter and corresponding logic to the transform methods. Reviewers suggested adding input validation to ensure the extrapolate_ratio is non-negative and only used with uniform output distributions to avoid incorrect results.
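To make the clamp-versus-extrapolate distinction concrete, here is a minimal pure-Python sketch of a uniform-output quantile transform for a single feature. It is not the code from this PR: `fit_quantiles`, `transform`, and the `extrapolate` flag are hypothetical names, and the `extrapolate_ratio` parameter is deliberately left out because the thread does not spell out its exact semantics. The out-of-range branch uses the `(x - x_min) / range` normalisation mentioned in the commit messages.

```python
from bisect import bisect_right


def fit_quantiles(train, n_quantiles=5):
    """Return evenly spaced levels in [0, 1] and the matching training quantiles."""
    s = sorted(train)
    levels = [i / (n_quantiles - 1) for i in range(n_quantiles)]
    quantiles = []
    for p in levels:
        pos = p * (len(s) - 1)           # fractional index into the sorted sample
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        quantiles.append(s[lo] + (s[hi] - s[lo]) * (pos - lo))
    return levels, quantiles


def transform(x, levels, quantiles, extrapolate=False):
    """Map x to [0, 1]; out-of-range inputs are clamped or linearly extrapolated."""
    x_min, x_max = quantiles[0], quantiles[-1]
    span = x_max - x_min
    if x < x_min or x > x_max:
        # Assumption: constant features (span == 0) fall back to clamping.
        if extrapolate and span > 0:
            return (x - x_min) / span    # linear continuation past 0 or 1
        return 0.0 if x < x_min else 1.0
    # In range: piecewise-linear interpolation between neighbouring quantiles.
    i = min(bisect_right(quantiles, x) - 1, len(quantiles) - 2)
    q0, q1 = quantiles[i], quantiles[i + 1]
    if q1 == q0:
        return levels[i + 1]
    return levels[i] + (levels[i + 1] - levels[i]) * (x - q0) / (q1 - q0)


levels, quantiles = fit_quantiles([0.0, 1.0, 2.0, 3.0, 4.0])
clamped = transform(6.0, levels, quantiles)                        # 1.0
extrapolated = transform(6.0, levels, quantiles, extrapolate=True)  # 1.5
```

With clamping, 6.0 and 60.0 both map to 1.0 and become indistinguishable. With extrapolation they map to different values above 1, which is the out-of-distribution information the preset aims to keep.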
| self._user_n_quantiles = n_quantiles | ||
| # Initialize parent with this, but it will be adapted in fit | ||
| super().__init__(n_quantiles=n_quantiles, subsample=subsample, **kwargs) | ||
| self.extrapolate_ratio = extrapolate_ratio |
It is recommended to validate that extrapolate_ratio is non-negative and that it is only used when output_distribution is set to "uniform". Linear extrapolation as implemented here is not mathematically appropriate for a normal output distribution and would lead to incorrect results if accidentally configured that way.
self.extrapolate_ratio = extrapolate_ratio
if extrapolate_ratio is not None:
    if extrapolate_ratio < 0:
        raise ValueError("extrapolate_ratio must be non-negative.")
    if kwargs.get("output_distribution", "uniform") != "uniform":
        raise ValueError("extrapolate_ratio is only supported for output_distribution='uniform'.")
| """ | ||
| super().__init__() | ||
| self.n_quantiles = n_quantiles | ||
| self.extrapolate_ratio = extrapolate_ratio |
Consider adding a validation check to ensure extrapolate_ratio is non-negative. While the current presets use 1.0, explicit validation prevents potential issues if the class is used with custom configurations in the future.
| self.extrapolate_ratio = extrapolate_ratio | |
| self.extrapolate_ratio = extrapolate_ratio | |
| if extrapolate_ratio is not None and extrapolate_ratio < 0: | |
| raise ValueError("extrapolate_ratio must be non-negative.") |
Deliberate deviation from internal #599 (which placed it in the coarse n_samples//10 tier alongside quantile_uni_coarse). The preset is meant to be "the default quantile transform plus boundary extrapolation" — it should differ from quantile_uni by extrapolation ONLY, not also by a coarser quantile grid. Moving it to the n_samples//5 tier makes the default -> extrapolate swap a clean one-variable change. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- torch_quantile_transformer: compute the normalised tensor once and reuse it for both branches. (x - x_max)/range + 1 == (x - x_min)/range, so the separate norm_above was a redundant full-array recompute. Mirrors the consolidation already done on the CPU side in #599. Output-identical (consistency tests unchanged), ~halves extrapolation arithmetic on GPU.
- Validation guards (per review): extrapolate_ratio must be non-negative (CPU + GPU); and only valid with output_distribution="uniform" (CPU), since linear extrapolation is not meaningful for a normal output.
- Tests for both guard paths.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
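The identity that justifies dropping the separate norm_above computation can be checked in one line. The derivation below only assumes that range means x_max - x_min, which the commit message implies but does not state outright; the symbol r is introduced here purely for brevity.

```latex
\[
\frac{x - x_{\max}}{r} + 1
  = \frac{x - x_{\max} + r}{r}
  = \frac{x - x_{\max} + (x_{\max} - x_{\min})}{r}
  = \frac{x - x_{\min}}{r},
\qquad r = x_{\max} - x_{\min} > 0 .
\]
```

So a single normalised tensor serves both the below-range and above-range branches, provided r is non-zero (constant features have to be handled separately).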
Thanks @gemini-code-assist — both addressed in
Added tests for both guard paths ( Also in this push, unrelated to the bot comments but worth flagging for review:
Thanks for the update, @LeoGrin. The added validation guards and the consolidation of the GPU |
- test__torch_extrapolate_ratio__rejects_negative used function-level imports (PLC0415); pytest + TorchQuantileTransformer are already imported at module top, so just use those.
- ruff format on the two touched files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| extrapolate_ratio = ( | ||
| 1.0 if pconfig.name == "quantile_uni_extrapolate" else None | ||
| ) |
Do we ever want to set this to something other than 1.0? I think it's worth keeping this parameter around, because it's a pretty important one.
Summary
Adds the quantile_uni_extrapolate preprocessor preset: instead of clamping quantile-transformed values at the [0, 1] boundary, it linearly extrapolates past the boundary for inputs outside the training range, better preserving out-of-distribution information.

Wired through both the CPU AdaptiveQuantileTransformer and the GPU TorchQuantileTransformer paths so the preset is GPU-eligible. CPU/GPU consistency tests parametrise the new preset across f16/f32/f64.

Motivation
v3 OOD-preprocessing checkpoints use quantile_uni_extrapolate in the regressor preprocessing recipe. Without this preset in the public package those checkpoints fail to load (pydantic ValidationError on the PreprocessorConfig.name literal). This unblocks loading them with the public release.

Changes (7 source + 3 test files, mirrors private #599)
- preprocessing/configs.py: register quantile_uni_extrapolate in the name literal
- preprocessing/steps/adaptive_quantile_transformer.py: extrapolate_ratio logic (CPU)
- preprocessing/steps/reshape_feature_distribution_step.py: preset wiring
- preprocessing/torch/{factory,steps,gpu_preprocessing_metadata,torch_quantile_transformer}.py: GPU path

Test plan
- test_adaptive_quantile_transformer.py: 5/5 pass (incl. NaN-column & constant-feature edge cases)
- test_torch_quantile_transformer.py extrapolation suite: 3/3 pass (incl. matches_cpu_on_out_of_range_inputs)

Note: repo policy is "open an issue first". Happy to file/link one; opening this so the diff is reviewable.
🤖 Generated with Claude Code
Note
Medium Risk
Touches core preprocessing behavior for a new quantile preset and changes both CPU (sklearn) and GPU (torch) quantile paths, which can affect model inputs and CPU/GPU parity if edge cases slip through.
Overview
Adds a new preprocessor preset, quantile_uni_extrapolate, that preserves out-of-distribution signal by linearly extrapolating quantile-transformed values beyond the usual [0, 1] clamp (with configurable extrapolate_ratio, defaulted to 1.0 for the preset).

Wires this behavior through both the CPU AdaptiveQuantileTransformer (stores per-feature train min/max and extrapolates at transform time, skipping constant features) and the GPU pipeline (TorchQuantileTransformer + Torch*QuantileTransformerStep/factory), and marks the preset as GPU-eligible. Updates tests to cover extrapolation semantics, validation guards, and CPU/GPU consistency for the new preset.

Reviewed by Cursor Bugbot for commit 1569371. Bugbot is set up for automated code reviews on this repo.