
Fix non-existent evaluation splits in lextreme #1150

@pjavanrood

Description


Describe the bug

The lextreme benchmark fails with a KeyError (or related configuration errors) because the evaluation_splits defined in the configuration (["validation", "test"]) do not match the splits actually available for several subsets of the dataset.

To Reproduce

task = "lextreme:multi_eurlex_level_1|5"

pipeline = Pipeline(
    tasks=task,
    pipeline_parameters=pipeline_params,
    evaluation_tracker=evaluation_tracker,
    model_config=model_config,
)

pipeline.evaluate()
pipeline.save_and_push_results()
pipeline.show_results()
Truncated traceback:

    141 self._init_random_seeds()
--> 142 self._init_tasks_and_requests(tasks=tasks)
    144 self.model_config = model_config
    145 self.accelerator, self.parallel_context = self._init_parallelism_manager()
...
     88         available_suggested_splits = [
     89             split for split in (Split.TRAIN, Split.TEST, Split.VALIDATION) if split in self
     90         ]

KeyError: 'validation'
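The failure mode can be reproduced in isolation. A minimal sketch, assuming the subset is loaded as a dict-like DatasetDict keyed by split name (as the traceback's membership test suggests) and that multi_eurlex_level_1 ships only "train" and "test":

```python
# Assumed simplification: a dict standing in for the subset's DatasetDict;
# "validation" is absent, matching the report for multi_eurlex_level_1.
dataset = {"train": ["doc1", "doc2"], "test": ["doc3"]}

# The filter from the traceback only keeps splits that really exist ...
available_suggested_splits = [
    split for split in ("train", "test", "validation") if split in dataset
]
print(available_suggested_splits)  # → ['train', 'test']

# ... but the task config still requests "validation", so a direct lookup fails:
try:
    dataset["validation"]
except KeyError as exc:
    print(f"KeyError: {exc}")  # → KeyError: 'validation'
```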

Expected behavior

The configuration should only reference splits that are actually available on the Hugging Face Hub for each subset.
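One way to enforce that defensively, as a hypothetical sketch (SUBSET_SPLITS and usable_evaluation_splits are illustrative names, not lighteval's actual API, and the availability shown is assumed from this report):

```python
# Hypothetical sketch: intersect a subset's configured evaluation_splits with
# the splits actually published for it, instead of assuming
# ["validation", "test"] everywhere. The table below is illustrative.
SUBSET_SPLITS = {
    "multi_eurlex_level_1": {"train", "test"},  # no "validation" available
}

def usable_evaluation_splits(subset: str, configured: list[str]) -> list[str]:
    available = SUBSET_SPLITS.get(subset, set())
    kept = [s for s in configured if s in available]
    if not kept:
        raise ValueError(f"no configured split exists for subset {subset!r}")
    return kept

print(usable_evaluation_splits("multi_eurlex_level_1", ["validation", "test"]))
# → ['test']
```

In practice the availability table would not be hard-coded; it could be populated from the Hub, e.g. with datasets.get_dataset_split_names.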

Version info

  • OS: macOS
  • Lighteval version: main (local development)
