Regime G: Remove hard-mining, no noise, surf_weight fixed at 30 (clean training) by tcapelle · Pull Request #1249 · wandb/senpai

tcapelle · 2026-03-19T08:46:26Z

Hypothesis

Ablations showed that each component helps individually, but the combination of hard-mining, noise annealing, and adaptive surf_weight may create an unnecessarily complex training landscape. Simplify: remove hard-mining, turn off noise entirely, and use a fixed high surf_weight=30.

Instructions

Remove vectorized hard-mining block. Set noise to zero always. Replace adaptive surf_weight with fixed 30.0. Run with --wandb_group regime-g.
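A minimal sketch of what the three changes amount to, assuming the total loss is a volume term plus surf_weight times a surface term and that noise is Gaussian noise added to inputs. The PR does not show the training code, so the function names, the MAE-style loss and the noise model below are hypothetical and only illustrate the shape of the "clean" regime.

```python
import random

SURF_WEIGHT = 30.0  # fixed value from the instructions (replaces the adaptive weight)
NOISE_STD = 0.0     # noise is set to zero always


def mean_abs(errors):
    """Mean absolute value of a list of per-node errors."""
    return sum(abs(e) for e in errors) / len(errors)


def clean_loss(vol_errors, surf_errors, surf_weight=SURF_WEIGHT):
    """Hypothetical combined loss: every node counts equally (no hard-mining)."""
    return mean_abs(vol_errors) + surf_weight * mean_abs(surf_errors)


def add_noise(values, std=NOISE_STD, rng=random):
    """Hypothetical input-noise hook; with std == 0 the inputs pass through unchanged."""
    if std == 0.0:
        return list(values)
    return [v + rng.gauss(0.0, std) for v in values]
```

With these assumptions, the only remaining knob is the constant surf_weight, which is what makes the regime "clean".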

Baseline (verified frontier, 4 consecutive plateau rounds)

mean3=23.2 (in=17.5, ood=14.3, re=27.7, tan=37.7)
50 single-variable experiments failed to improve on this baseline. This round tests MULTI-VARIABLE regime changes.

Results

W&B run: 1uk74twg (thorfinn/regime-g-clean, group: regime-g)
Peak memory: 14.7 GB
Training: 61 epochs, 30.1 min

Surface MAE (mae_surf_p, primary metric)

Split	This run	Baseline	Delta
val_in_dist	17.7	17.5	+0.2
val_ood_cond	14.7	14.3	+0.4
val_ood_re	28.2	27.7	+0.5
val_tandem_transfer	40.9	37.7	+3.2
mean3 (in+ood+tan)/3	24.4	23.2	+1.2
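The mean3 row can be reproduced from the table with a few lines of Python. This is only a check of the arithmetic; the dictionary and function names are illustrative, and "ood" in the mean3 label is read as val_ood_cond because that is the reading that matches both reported means.

```python
this_run = {
    "val_in_dist": 17.7,
    "val_ood_cond": 14.7,
    "val_ood_re": 28.2,
    "val_tandem_transfer": 40.9,
}
baseline = {
    "val_in_dist": 17.5,
    "val_ood_cond": 14.3,
    "val_ood_re": 27.7,
    "val_tandem_transfer": 37.7,
}

# mean3 averages in-dist, ood_cond and tandem; val_ood_re is not part of it.
MEAN3_SPLITS = ("val_in_dist", "val_ood_cond", "val_tandem_transfer")


def mean3(scores):
    return sum(scores[split] for split in MEAN3_SPLITS) / 3


run_mean3 = round(mean3(this_run), 1)
base_mean3 = round(mean3(baseline), 1)
delta = round(run_mean3 - base_mean3, 1)
print(run_mean3, base_mean3, delta)  # 24.4 23.2 1.2
```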

Full Surface MAE breakdown (best checkpoint, epoch 61)

Split	Ux	Uy	p	val/loss
val_in_dist	10.0	2.6	17.7	0.6534
val_ood_cond	6.2	1.6	14.7	0.7779
val_ood_re	5.8	1.3	28.2	0.6027
val_tandem_transfer	8.5	2.9	40.9	1.7640

val/loss (4-split avg): 0.9495

What happened

The simplified regime is slightly worse on every split, with tandem taking the largest hit (+3.2 mae_surf_p, ~8.5% relative regression). Overall mean3 regresses 23.2 → 24.4.
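For reference, the relative regressions can be derived from the rounded table values as follows. The helper name is illustrative; the inputs are the numbers reported above.

```python
def relative_regression(new, old):
    """Percentage change of new relative to old (positive means worse for MAE)."""
    return (new - old) / old * 100.0


pairs = {
    "val_in_dist": (17.7, 17.5),
    "val_ood_cond": (14.7, 14.3),
    "val_ood_re": (28.2, 27.7),
    "val_tandem_transfer": (40.9, 37.7),
    "mean3": (24.4, 23.2),
}

rel = {name: relative_regression(new, old) for name, (new, old) in pairs.items()}

for name, pct in rel.items():
    print(f"{name}: {pct:+.1f}%")
```

Tandem comes out at roughly +8.5% and mean3 at roughly +5.2%, while the other three splits stay below +3%.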

Removing hard-mining likely hurt tandem most: hard-mining boosted the pressure gradient signal on difficult non-tandem nodes, sharpening training in ways that aided generalization. Removing noise annealing reduced regularization. A fixed surf_weight=30 falls within the adaptive range, so that change alone is unlikely to explain the regression.

The hypothesis that these components add unnecessary complexity is not supported — each carries meaningful signal.

Suggested follow-ups

Ablate surf_weight alone (fixed 30, keep noise + hard-mining) to test whether the adaptive weight is expendable.
Ablate noise only (remove noise, keep hard-mining + adaptive weight) to isolate the noise contribution.
If hard-mining is the key contributor, simplify it to a fixed top-50% fraction without tandem asymmetry — same concept, less code.
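The third follow-up could look roughly like the sketch below: keep a fixed top-50% of nodes by absolute error and average the loss over those only, with no special handling for tandem samples. This is a generic top-k hard-mining sketch in plain Python, not the vectorized block that this PR removed (which is not shown here); the function name and the keep_fraction parameter are hypothetical.

```python
def hard_mining_loss(per_node_errors, keep_fraction=0.5):
    """Mean absolute error over the hardest keep_fraction of nodes.

    Assumption: "hard" means largest absolute error, and the same fraction
    is applied to every sample (no tandem asymmetry).
    """
    if not per_node_errors:
        raise ValueError("per_node_errors must not be empty")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    # Always keep at least one node so the mean is defined.
    k = max(1, int(len(per_node_errors) * keep_fraction))
    hardest = sorted((abs(e) for e in per_node_errors), reverse=True)[:k]
    return sum(hardest) / k


# Four nodes, top 50% -> the two largest errors (0.9 and 0.4) are averaged.
example = hard_mining_loss([0.1, -0.4, 0.2, 0.9])
```

Setting keep_fraction=1.0 recovers the plain mean over all nodes, which makes the mined and unmined regimes easy to compare in a single ablation.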

github-actions · 2026-03-19T08:46:37Z

Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.

I have read the CLA Document and I hereby sign the CLA

0 out of 2 committers have signed the CLA.
❌ @senpai-advisor
❌ @senpai-thorfinn
senpai-advisor and senpai-thorfinn do not appear to be GitHub users. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

tcapelle · 2026-03-19T10:10:34Z

Review: Closed — Ablation Confirms Components Help

Clean training (no hard-mining, no noise, fixed surf_weight=30) regressed mean3 from 23.2 → 24.4 (+5.2%). The tandem split was hit hardest (+3.2). This confirms the merged components (hard-mining, noise annealing, adaptive surf_weight) each carry meaningful signal. The 'complex landscape' hypothesis is not supported — the complexity is earned.

Good experiment — the negative result is informative.

Experiment placeholder

e797e23

tcapelle added the status:wip, student:thorfinn, and noam labels Mar 19, 2026

Regime G: remove hard-mining, no noise, fixed surf_weight=30

a4f273a

tcapelle marked this pull request as ready for review March 19, 2026 10:00

tcapelle added the status:review label and removed the status:wip label Mar 19, 2026

tcapelle closed this Mar 19, 2026

tcapelle deleted the exp-noam/regime-g branch March 19, 2026 10:10

github-actions Bot locked and limited conversation to collaborators Mar 19, 2026

tcapelle wants to merge 2 commits into noam from exp-noam/regime-g
