Add a Llama-like 7B training script for data ablations #505

Open

epwalsh wants to merge 2 commits into main from epwalsh/data-ablation-script

Conversation

@epwalsh (Contributor) commented Dec 16, 2025

This is a basic Llama-like 7B model with some changes to optimize for throughput, including:

  • Sliding-window attention (SWA) on 3 out of every 4 layers, as in Olmo 3 (see the sketch after this list).
  • FP8 linear layers with tensor-wise scaling.
  • FlashAttention-3.

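For context, the SWA layout amounts to giving every fourth layer full attention while the remaining layers use a fixed sliding window. Here's a minimal Python sketch of that pattern; the window size, layer count, and which layer in each group of four keeps full attention are illustrative assumptions, not the actual values used by the script:

from typing import List, Optional

def swa_layout(num_layers: int, window: int = 4096) -> List[Optional[int]]:
    """Per-layer attention window; ``None`` means full (global) attention.

    Assumes the last layer in each group of four is the full-attention
    one, matching the 3-out-of-4 SWA ratio described above.
    """
    return [None if (i + 1) % 4 == 0 else window for i in range(num_layers)]

# e.g. a 32-layer 7B-class model: 24 SWA layers, 8 full-attention layers
print(swa_layout(32))
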
Here's a bash script to launch a quick benchmarking run:

#!/usr/bin/env bash

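# Positional args (all optional): subcommand, Beaker cluster, run name.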
cmd=${1:-launch}
cluster=${2:-ai2/augusta}
run_name=${3:-speed-test-for-data-ablations}

python src/scripts/train/Llama-like-7B.py "$cmd" "$run_name" "$cluster" \
  --launch.num_nodes=1 \
  --launch.num_gpus=8 \
  --launch.priority=high \
  --launch.workspace=ai2/OLMo-pretraining-stability \
  --trainer.callbacks.wandb.enabled=false \
  --trainer.callbacks.lm_evaluator.enabled=false \
  --trainer.callbacks.downstream_evaluator.enabled=false \
  --trainer.no_checkpoints \
  --trainer.no_evals \
  --trainer.hard_stop='{unit: steps, value: 100}'
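
With no arguments this launches a single-node, 8-GPU run that hard-stops after 100 steps, with checkpointing, evals, and W&B logging all disabled, so the run measures raw training throughput.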

See https://beaker.org/ex/01KCM6Y89ZAD3WF4M058XEFE8Y.

@tyler-romero (Contributor) left a comment


LGTM
