Pull requests: wandb/senpai

Pull requests list

Regime L: lr=2e-3, n_hidden=224, mlp_ratio=1, no noise, EMA from ep30 (radical combo)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:kohaku (Assigned to kohaku)
#1254 opened Mar 19, 2026 by tcapelle (Draft)

Regime K: n_head=8, slice_num=64 (more attention heads + slices)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:senku (Assigned to senku)
#1253 opened Mar 19, 2026 by tcapelle (Draft)

Regime J: mlp_ratio=4 (wider FFN at same hidden dim)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:gilbert (Assigned to gilbert)
#1252 opened Mar 19, 2026 by tcapelle (Draft)

Regime I: warmup=5, cosine restart at ep40 (fast start + second phase)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:violet (Assigned to violet)
#1251 opened Mar 19, 2026 by tcapelle (Draft)

Regime H: slice_num=48, n_hidden=160 (finer routing, narrower)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:askeladd (Assigned to askeladd)
#1250 opened Mar 19, 2026 by tcapelle (Draft)

Regime G: Remove hard-mining, no noise, surf_weight fixed at 30 (clean training)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:thorfinn (Assigned to thorfinn)
#1249 opened Mar 19, 2026 by tcapelle (Draft)

Regime F: n_hidden=160, n_layers=2 (trade width for depth)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:edward (Assigned to edward)
#1248 opened Mar 19, 2026 by tcapelle (Draft)

Regime E: EMA from ep25 + decay=0.997 + T_max=72 (longer EMA window)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:alphonse (Assigned to alphonse)
#1247 opened Mar 19, 2026 by tcapelle (Draft)

Regime D: batch_size=8, lr=2e-3 (larger batch, adjusted LR)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:nezuko (Assigned to nezuko)
#1246 opened Mar 19, 2026 by tcapelle (Draft)

Regime C: Remove Lookahead, lr=4e-3 (pure AdamW)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:tanjiro (Assigned to tanjiro)
#1245 opened Mar 19, 2026 by tcapelle (Draft)

Regime B: lr=5e-3, temp clamp 0.15 at ep40 (aggressive exploration)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:fern (Assigned to fern)
#1244 opened Mar 19, 2026 by tcapelle (Draft)

Regime A: lr=1.5e-3, T_max=80, warmup=15 (slower deeper convergence)
Labels: noam (Noam advisor branch experiments); status:wip (Student is working on it); student:frieren (Assigned to frieren)
#1243 opened Mar 19, 2026 by tcapelle (Draft)

Seed sweep: seed=137 (frontier verification)
Labels: status:review (Ready for advisor review)
#1220 opened Mar 19, 2026 by tcapelle

Noam lab's work
#789 opened Mar 17, 2026 by tcapelle

Skillz based flow
#457 opened Mar 16, 2026 by tcapelle
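
Most of the regime titles above carry their hyperparameter overrides as explicit `key=value` pairs. As a minimal sketch, assuming one wants those overrides in machine-readable form, the hypothetical helper below (`parse_overrides` is not part of the repository) pulls the numeric pairs out of a title string. Settings written as free text, such as "no noise", "EMA from ep30", or "temp clamp 0.15 at ep40", are deliberately left out because the titles give them no key name.

```python
import re

# Matches identifier=number, where number is an int, a decimal,
# or scientific notation (e.g. 224, 0.997, 2e-3, 1.5e-3).
PAIR = re.compile(r"\b([A-Za-z_]\w*)=(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def parse_overrides(title: str) -> dict:
    """Extract numeric key=value overrides from a PR title (illustrative helper)."""
    overrides = {}
    for key, raw in PAIR.findall(title):
        # Treat values with a decimal point or exponent as floats, the rest as ints.
        is_float = any(ch in raw for ch in ".eE")
        overrides[key] = float(raw) if is_float else int(raw)
    return overrides


if __name__ == "__main__":
    title = ("Regime L: lr=2e-3, n_hidden=224, mlp_ratio=1, "
             "no noise, EMA from ep30 (radical combo)")
    print(parse_overrides(title))
```

Applied to the Regime E title, the same helper would pick up only `decay` and `T_max`; the "EMA from ep25" part would still need to be read by hand.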