Add Combined Int6 + QAT + Sliding Window submission #149

Open
pleasedontddosme wants to merge 1 commit into openai:main from pleasedontddosme:combined-int6-qat-slidingwindow

@pleasedontddosme

Combines the best techniques from WarmdownQuantization (#1) and SlidingWindow (#2):

  • Int6 quantization, FP16 tied embeddings, Late-K passthrough
  • Batched sliding window eval (stride=64), overtone init, phase-transition resid_mix
  • Muon with decoupled weight decay, AdamW for embeddings/scalars
  • Novel: quantization-aware training (QAT) with a straight-through estimator (STE) in the last 30% of training, for a near-zero quantization penalty
  • Cosine warmdown schedule, higher Muon momentum warmup
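
To make the QAT bullet concrete, here is a minimal sketch of int6 fake quantization with an STE that is switched on for the final 30% of steps. It is not the code from this PR: the function names, the symmetric [-31, 31] grid, the single max-abs scale per row, and the toy linear model are all assumptions chosen for illustration, and it uses plain Python rather than the submission's training framework.

```python
INT6_MAX = 31  # assumed symmetric signed range [-31, 31]; the PR may use a different grid


def fake_quant_int6(weights):
    """Round floats to an int6 grid and map them back to floats (one scale per row)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / INT6_MAX
    return [max(-INT6_MAX, min(INT6_MAX, round(w / scale))) * scale for w in weights]


def qat_enabled(step, total_steps, qat_fraction=0.3):
    """Return True during the final `qat_fraction` of training steps."""
    qat_start = total_steps - round(total_steps * qat_fraction)
    return step >= qat_start


def train_step(weights, x, target, lr, use_qat):
    """One SGD step on loss = (w . x - target)**2 for a toy linear model."""
    w_forward = fake_quant_int6(weights) if use_qat else weights
    pred = sum(w * xi for w, xi in zip(w_forward, x))
    err = pred - target
    # Gradient with respect to the weights used in the forward pass.
    grads = [2.0 * err * xi for xi in x]
    # STE: rounding is treated as the identity in the backward pass, so the
    # same gradient updates the full-precision (latent) weights.
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, err * err


# Hypothetical toy run: full precision for 70 steps, fake-quantized for the last 30.
weights = [0.5, -0.25, 0.1]
x, target, total_steps = [1.0, 2.0, -1.0], 0.7, 100
losses = []
for step in range(total_steps):
    weights, loss = train_step(weights, x, target, 0.01, qat_enabled(step, total_steps))
    losses.append(loss)
```

Because the forward pass already sees quantized weights late in training, the latent weights can settle where rounding costs little, which is the intuition behind the "near-zero quant penalty" claim. Whether that holds for the actual model depends on details this sketch does not cover.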

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>