REF cleaner BaseTSFMSolver with richer default adaptation strategies#42
Conversation
- Add abstract model_id property; base class now caches the loaded model
across set_objective calls (no reload unless model_id changes).
- Replace abstract build_adapter with a non-abstract default that picks
the first available capability per task:
forecasting forecast_batch → windowed-embed ridge regression
classification embed_batch → pooled time_embed + LinearProbe
anomaly_detection embed_batch → forecast residuals
event_detection time_embed → causal-windowed embed + LogReg
- Add event_detection to the _Task type alias.
- Add embed() and time_embed() public methods (call embed_batch /
time_embed_batch and convert to numpy).
- Fix forecast() to convert numpy inputs to tensors before forecast_batch
and tensor outputs back to numpy before assembly.
- Fix swapped docstrings on embed_batch / time_embed_batch.
- Drop unused Callable import.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reconciles the shape unification from main (benchopt#28) with the BaseTSFMSolver refactor on this branch: - Adopt TaskType alias (renamed from _Task) and ForecastOutput shape (n_cutoffs, H, C, Q) from main; update _WindowedForecastAdapter accordingly. - Add covariates parameter to forecast_batch / forecast() from main. - Import Covariates from benchmark_utils.covariates. - Carry forward all branch additions: model_id caching, non-abstract build_adapter with default strategies, embed/time_embed methods, and event_detection task type. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
felixdivo
left a comment
There was a problem hiding this comment.
Overall, this is a super nice set of changes!
I LOVE the simplicity of the concrete solvers 👌
| forecasting | ||
| 1. forecast_batch (zero-shot, via _SolverForecastAdapter) | ||
| 2. embed_batch (windowed ridge regression, via _WindowedForecastAdapter) | ||
| classification | ||
| 1. embed_batch (flat embedding + LinearProbeAdapter) | ||
| 2. time_embed_batch (mean-pooled temporal embedding + LinearProbeAdapter) | ||
| anomaly_detection | ||
| 1. embed_batch (distance-from-mean score, via LinearProbeAdapter) | ||
| 2. forecast_batch (forecast-error score, via ForecastResidualAdapter) | ||
| event_detection | ||
| 1. time_embed_batch (per-position LogReg, via _TimeEmbedEventAdapter) | ||
| 2. embed_batch (causal-windowed LogReg, via _WindowedEventAdapter) |
There was a problem hiding this comment.
The strategy makes sense! However, this is very important behavioral documentation that is somewhat hidden. We should at least add a TODO to move this to a "how to add a model" docs at some point.
Maybe we should add such a documentation page anyway soon? Also, AGENTS.md could point to that docs page to make it easier for coding agents to add models.
There was a problem hiding this comment.
100% agree
It's a TODO at the moment, but it would definitely make sense to cover it in this PR.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…vels, docstrings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Toto2 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ertsfm # Conflicts: # solvers/mantis.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| # it overrides the `forecast_batch` method | ||
| return type(self).forecast_batch is not BaseTSFMSolver.forecast_batch | ||
|
|
||
| def get_quantile_levels(self) -> tuple[float, ...]: |
There was a problem hiding this comment.
Would it make sense to have this be a property? I think so, and in case it is a bit of work to look this up in the model config, it could then be easily cached with @functools.cached_property.
…tion) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
felixdivo
left a comment
There was a problem hiding this comment.
Else, looks good to merge. Wonderful improvement!
Chronos2's lookup depends on self.model.quantiles (only available post-load_model), so it uses functools.cached_property; the other solvers expose static tuples via a plain property.
Summary (cf #39 )
model_idproperty + base-class model caching: the loaded model is now reused acrossset_objectivecalls and only reloaded whenmodel_idchanges (previously each call reloaded).build_adapteris no longer abstract: a non-abstract default covers all four tasks using the first available capability (see table below). Solvers only need to override it for genuinely custom adapter logic.embed()/time_embed()public methods: wrapembed_batch/time_embed_batchwith numpy conversion, making them directly usable in adapters and subclasses.event_detectionadded to the_Tasktype alias.forecast()type fix: numpy inputs are now converted to tensors beforeforecast_batch, and tensor outputs converted back to numpy before assembly (was a silent type mismatch).embed_batch/time_embed_batchcorrected;build_adapterdocstring now lists numbered fallback options instead of ambiguous→.Default adaptation strategies in
build_adapter(discussed in #31 )forecastingforecast_batch(zero-shot)embed_batch→ windowed ridge regressionclassificationembed_batch→ LinearProbetime_embed_batch→ mean-pooled + LinearProbeanomaly_detectionembed_batch→ distance-from-meanforecast_batch→ forecast-error residualsevent_detectiontime_embed_batch→ per-position LogRegembed_batch→ causal-windowed LogRegTest plan
BaseSolverdirectly and are unaffected by this PRforecast_batch+supported_tasks+model_id+load_modelshould work end-to-end for forecasting and anomaly detection without any further codemodel_idis unchanged🤖 Generated with Claude Code