Description
Problem
Is your proposal tackling an existing problem or limitation?
- No, it's an addition
Proposal
Add regression tests for the lexical baseline models (BM25Model, TfIdfModel, EditDistanceModel, and RandomRankingModel) that run each model on a small, known benchmark task and assert that the resulting metrics match expected values (within a small tolerance for floating-point metrics).
The one non-deterministic baseline, RandomRankingModel, should use a fixed random seed so that its results are reproducible across test runs.
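One possible shape for such a test, sketched below with a seeded stand-in ranking function and a toy metric in place of the real models and benchmark (the function names, candidate documents, and metric here are illustrative placeholders, not taken from the codebase):

```python
import random


def random_ranking(candidates, seed=42):
    # Stand-in for RandomRankingModel: shuffle the candidates with a
    # fixed seed so the "model" output is reproducible across runs.
    rng = random.Random(seed)
    ranked = list(candidates)
    rng.shuffle(ranked)
    return ranked


def precision_at_1(ranked, relevant):
    # Toy metric: 1.0 if the top-ranked item is relevant, else 0.0.
    return 1.0 if ranked[0] in relevant else 0.0


def test_random_ranking_is_reproducible():
    candidates = ["doc_a", "doc_b", "doc_c", "doc_d"]
    relevant = {"doc_c"}
    first = random_ranking(candidates, seed=42)
    second = random_ranking(candidates, seed=42)
    # Same seed => identical ranking, so the metric is stable too and
    # can be pinned to an expected value in a regression test.
    assert first == second
    metric = precision_at_1(first, relevant)
    assert metric in (0.0, 1.0)
    # For real-valued metrics, pytest.approx(expected, abs=1e-6) gives
    # the tolerance band mentioned in the proposal.
```

The deterministic models (BM25Model, TfIdfModel, EditDistanceModel) would follow the same pattern without the seed: run on the fixed benchmark, then compare each metric against a stored expected value.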
Proposal Characteristics
Type:
- New Ontology (data source for multiple tasks)
- New Task(s)
- New Model(s)
- New Metric(s)
- Other
Area(s) of code: paths, modules, or APIs you expect to touch
tests/
Additional Context
The current unit tests in #36 verify that the models produce outputs with the correct shapes and types. The proposed regression tests would complement them by verifying that the results are numerically correct on a known evaluation scenario. This issue was identified during the review of #36 by @Mattdl.
Implementation
- I plan to implement this in a PR
- I am proposing the idea and would like someone else to pick it up