
@timurcarstensen
Collaborator

This PR removes two steps that were previously automated:

  • disk caching of Hugging Face Hub download calls
  • pre-downloading datasets for all tasks the user wants to evaluate

This lets us remove a bunch of code and places more responsibility on the user: if they want to evaluate a task that is not in task-groups.yaml, they must now make sure the dataset is present in HF_HOME themselves.
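As a rough illustration of what "ensuring the dataset is present in HF_HOME" could look like on the user side, here is a minimal sketch. It assumes the `datasets` library is installed and that the default cache location (derived from HF_HOME) is in use; `prefetch_datasets` and the dataset names in the usage note are hypothetical and not part of this repository.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Spec = Tuple[str, Optional[str]]  # (dataset repo id, config/subset name or None)


def prefetch_datasets(
    specs: Iterable[Spec],
    loader: Optional[Callable[[str, Optional[str]], object]] = None,
) -> List[Spec]:
    """Load each (dataset, config) pair once so it ends up in the local HF cache.

    `loader` is injectable so the function can be exercised without network
    access; by default it falls back to `datasets.load_dataset`.
    """
    if loader is None:
        # Imported lazily so this module can be imported without `datasets`.
        from datasets import load_dataset

        loader = load_dataset

    fetched: List[Spec] = []
    for name, config in specs:
        # load_dataset(path, name) downloads and caches the requested config.
        loader(name, config)
        fetched.append((name, config))
    return fetched
```

On a node with network access one might call `prefetch_datasets([("some-org/some-dataset", "some-config")])` (placeholder names) with HF_HOME pointing at the shared cache directory, then run the evaluation offline against the same HF_HOME.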

We can now also remove several dependencies, such as numpy, torch, lighteval and lm-eval, which makes the tool significantly leaner and reduces import times.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +103 to 105
dataset: facebook/flores
tasks:
- task: flores200:bul_Cyrl-eng_Latn


P1: Cache Flores datasets per language config

The Flores task groups declare only dataset: facebook/flores at group level and no per-task subset, so _pre_download_datasets_from_specs ends up calling load_dataset without a config. For facebook/flores that means either a failure (config required) or only the default/first config is cached, leaving the language-specific configs needed by tasks such as flores200:bul_Cyrl-eng_Latn unavailable offline. Running these task groups with skip_checks=False will therefore halt during pre-download or force compute nodes to hit the network. Please specify the language-specific subsets for these tasks so each translation pair is cached ahead of time.
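To make the concern concrete, the sketch below shows how a pre-download step might resolve (dataset, subset) pairs from a task group shaped like the snippet above. The `subset` key, the group-level fallback, and the helper name `download_targets` are assumptions based on the wording of this comment, not the repository's actual schema or code; the subset value is likewise illustrative.

```python
from typing import Any, Dict, List, Optional, Tuple


def download_targets(group: Dict[str, Any]) -> List[Tuple[Optional[str], Optional[str]]]:
    """Resolve one (dataset, subset) pair per task, falling back to group-level values."""
    targets = []
    for entry in group.get("tasks", []):
        dataset = entry.get("dataset", group.get("dataset"))
        subset = entry.get("subset", group.get("subset"))  # None means "no config given"
        targets.append((dataset, subset))
    return targets


# Group as currently declared: dataset at group level, no per-task subset.
current_group = {
    "dataset": "facebook/flores",
    "tasks": [{"task": "flores200:bul_Cyrl-eng_Latn"}],
}

# Same group with a per-task subset added (value is an assumed config name).
fixed_group = {
    "dataset": "facebook/flores",
    "tasks": [{"task": "flores200:bul_Cyrl-eng_Latn", "subset": "bul_Cyrl-eng_Latn"}],
}

print(download_targets(current_group))
print(download_targets(fixed_group))
```

With the current declaration the resolved subset is `None`, which is the case described above where the dataset would be loaded without a config; with a per-task subset, each translation pair gets its own cache entry.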


@geoalgo geoalgo merged commit f360643 into main Dec 18, 2025
2 checks passed
@geoalgo geoalgo deleted the remove-task-cache branch December 18, 2025 13:55