[WIP] Record: Int6 QAT + BigramHash + MLP 1344 (val_bpb 1.1593) #150

Draft
yahya010 wants to merge 1 commit into openai:main from yahya010:submission/v12-next
@yahya010
Summary — Work in Progress

Builds on the int6 QAT baseline by adding a BigramHash bigram embedding:

  • BigramHash: 4096-bucket hash table injecting token-pair context
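  • STE int6 QAT: quantization-aware training with a straight-through estimator; zero quantization gap
  • Full int6 weights in [-31, 31], compressed with zstd level 22; MLP hidden=1344
  • 10 layers, seq_len=2048, fp16 tied embedding, Muon 0.99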
  • Sliding window eval stride=64
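
For readers unfamiliar with the idea, here is a minimal sketch of how a 4096-bucket bigram hash could assign each position a bucket index from the (previous token, current token) pair. The mixing constants, the function names `bigram_bucket` and `bigram_bucket_ids`, and the `bos_token` placeholder for position 0 are all assumptions made for illustration; the hash actually used in this PR may differ.

```python
NUM_BUCKETS = 4096  # bucket count taken from the PR summary


def bigram_bucket(prev_token: int, token: int, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an ordered token pair to a bucket index (hypothetical hash, not the PR's)."""
    # Multiply-and-xor mixing with two arbitrary odd constants, masked to 32 bits.
    h = (prev_token * 0x9E3779B1) ^ (token * 0x85EBCA6B)
    h &= 0xFFFFFFFF
    h ^= h >> 15
    return h % num_buckets


def bigram_bucket_ids(tokens: list[int], bos_token: int = 0) -> list[int]:
    """Bucket index for every position; position 0 pairs with an assumed BOS id."""
    prev = [bos_token] + tokens[:-1]
    return [bigram_bucket(p, t) for p, t in zip(prev, tokens)]


ids = bigram_bucket_ids([5, 17, 5, 17])
# Positions 1 and 3 see the same pair (5, 17), so they share a bucket.
```

In a model, each bucket index would presumably select a row of a learned table with 4096 rows, and that row would be combined with the regular token embedding. Distinct pairs can collide in the same bucket; that is the trade-off for keeping the table small.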

Mean val_bpb: 1.1593 (3 seeds, std: 0.00040)
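
The val_bpb figure is measured with sliding window eval at stride=64. A rough sketch of how such windows could be laid out is below, assuming each window is at most `seq_len` tokens long and only the tokens not yet scored by an earlier window contribute to the loss. The helper name `sliding_windows` and the exact scoring rule are assumptions, not taken from this PR's code.

```python
def sliding_windows(num_tokens: int, seq_len: int = 2048, stride: int = 64):
    """Yield (start, end, score_from); tokens in [score_from, end) are scored.

    Assumes 0 < stride <= seq_len. Every token is scored exactly once, and
    after the first window each scored token has at least seq_len - stride
    tokens of left context inside its window.
    """
    scored_upto = 0
    start = 0
    while scored_upto < num_tokens:
        end = min(start + seq_len, num_tokens)
        yield start, end, scored_upto
        scored_upto = end
        start += stride


# Small example: 10 tokens, windows of 4, stride 2.
windows = list(sliding_windows(10, seq_len=4, stride=2))
```

A smaller stride gives each scored token more context at the cost of more forward passes, roughly `num_tokens / stride` of them.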

Still iterating — will update when finalized.

10L int6 STE QAT + BigramHash bigram embedding + zstd-22, MLP 1344,
Muon 0.99, sliding window stride=64. 3-seed mean 1.1593 BPB.
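
As an illustration of the int6 range mentioned above, here is a minimal sketch of symmetric quantization to integers in [-31, 31] and the matching dequantization. It assumes a single per-tensor scale and plain Python lists; the function names are hypothetical, and the PR may well use per-row scales or a different rounding rule.

```python
def quantize_int6(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric quantization to integers in [-31, 31] with one scale (assumed per-tensor)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 31 if max_abs > 0 else 1.0
    # Round to the nearest level, then clamp to the int6 range used in this PR.
    q = [max(-31, min(31, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int6(q: list[int], scale: float) -> list[float]:
    """Map integer levels back to floats; this is what the forward pass would see."""
    return [v * scale for v in q]


q, scale = quantize_int6([0.0, 0.5, -1.0, 1.0])
restored = dequantize_int6(q, scale)
```

During QAT the forward pass would use the dequantized weights, while a straight-through estimator treats the rounding step as the identity in the backward pass. In PyTorch this is commonly written as `w + (w_q - w).detach()`. Training against the quantized values is what lets the quantization gap shrink toward zero.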

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>