fix: delegate offline mode control from container template to train.py by Neonkraft · Pull Request #16 · OpenEuroLLM/post-training

Neonkraft · 2026-04-29T10:10:23Z

Summary

The containerised TRL SLURM template (job_trl_container.sh.jinja) hardcoded HF_HUB_OFFLINE=1, HF_DATASETS_OFFLINE=1, and TRANSFORMERS_OFFLINE=1 inside the container regardless of config.offline. As a result, offline: false had no effect at runtime: the container always ran in offline mode and would stall if models weren't already cached.

This PR removes those hardcoded flags from the template and delegates ownership to train.py, which already reads config.offline to set these vars. An else branch is added to explicitly set the flags to 0 when offline: false, guarding against any stale values inherited from the container environment.
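For illustration, the if/else behaviour described above could look roughly like the sketch below. The actual train.py code is not shown on this page, so the function name apply_offline_mode and the OFFLINE_VARS tuple are hypothetical; only the three variable names and the 1/0 values come from the description. The Hugging Face libraries likely read these variables when first imported, so a real version would presumably need to run before those imports.

```python
import os
from typing import MutableMapping, Optional

# The three flags named in the PR description.
OFFLINE_VARS = ("HF_HUB_OFFLINE", "HF_DATASETS_OFFLINE", "TRANSFORMERS_OFFLINE")


def apply_offline_mode(
    offline: bool, env: Optional[MutableMapping[str, str]] = None
) -> None:
    """Hypothetical helper: set all three flags from a single boolean.

    Writing "0" explicitly (rather than leaving the variables alone)
    overrides any "1" inherited from the surrounding environment.
    """
    if env is None:
        env = os.environ
    value = "1" if offline else "0"
    for name in OFFLINE_VARS:
        env[name] = value


if __name__ == "__main__":
    # Simulate a container environment that still carries stale offline flags.
    inherited = {name: "1" for name in OFFLINE_VARS}
    apply_offline_mode(False, inherited)
    print(inherited)
```

Passing a plain dict instead of os.environ, as in the usage above, makes it easy to check that stale values are overwritten without touching the real process environment.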

Type of change

The container SLURM template was hardcoding HF_HUB_OFFLINE=1 etc. regardless of config.offline, so offline: false had no effect inside the container and jobs would stall trying to reach the Hub at runtime. Remove the three hardcoded offline flags from job_trl_container.sh.jinja and let train.py own this: the existing if config.offline block sets them to 1, and a new else block explicitly sets them to 0 to clear any value inherited from the container environment.

Neonkraft merged commit 1787551 into main Apr 29, 2026
2 checks passed

Neonkraft merged 1 commit into main from fix/offline-env-vars
