[Feature] Add LK loss (LK^α and LK^λ) for direct acceptance rate opti…#29

Closed
cicirori wants to merge 1 commit into main from feature/lk-loss

Conversation

@cicirori
Collaborator

@cicirori cicirori commented Mar 4, 2026

Implement LK losses from "LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding" (arXiv:2602.23881), which directly optimize the acceptance rate α and improve average acceptance length by 3-8% over Forward KL on EAGLE-3.

  • Add loss_type and lk_eta config fields to TrainingConfig
  • Add compiled_lk_alpha_loss and compiled_lk_lambda_loss (+ _from_hs variants)
  • Dispatch loss in Eagle3Model._calculate_loss based on loss_type
  • Return alpha metrics from forward pass and log in trainer
  • Add comprehensive tests for LK losses
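The paper itself isn't quoted in this PR, so as a rough illustration of what "directly optimizing the acceptance rate" means: in standard speculative decoding, the per-token probability that the target model accepts a draft token is α = Σ_v min(p_draft(v), p_target(v)), so a loss of the form 1 − α pushes the draft distribution toward higher acceptance. The sketch below is a hypothetical, dependency-free version of that idea; the function name `lk_alpha_loss` and the exact form of the LK^α objective (and the role of `lk_eta`) are assumptions, not the PR's actual implementation, which operates on batched logits under torch.compile.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def lk_alpha_loss(draft_logits, target_logits):
    """Hypothetical sketch of an acceptance-rate loss for one token position.

    alpha = sum_v min(p_draft(v), p_target(v)) is the probability the
    target model accepts the draft token under standard speculative
    sampling; minimizing (1 - alpha) maximizes acceptance.
    Returns (loss, alpha).
    """
    p = softmax(draft_logits)
    q = softmax(target_logits)
    alpha = sum(min(pi, qi) for pi, qi in zip(p, q))
    return 1.0 - alpha, alpha
```

When the draft and target distributions coincide, α = 1 and the loss vanishes; when they concentrate on disjoint tokens, α approaches 0 and the loss approaches 1. A trainable version would compute this over batched tensors with gradients flowing through the draft logits only.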

[Feature] Add LK loss (LK^α and LK^λ) for direct acceptance rate optimization

Implement LK losses from "LK Losses: Direct Acceptance Rate Optimization
for Speculative Decoding" (arXiv:2602.23881), which directly optimize
the acceptance rate α and improve average acceptance length by 3-8% over
Forward KL on EAGLE-3.

- Add loss_type and lk_eta config fields to TrainingConfig
- Add compiled_lk_alpha_loss and compiled_lk_lambda_loss (+ _from_hs variants)
- Dispatch loss in Eagle3Model._calculate_loss based on loss_type
- Return alpha metrics from forward pass and log in trainer
- Add comprehensive tests for LK losses
@yubofredwang yubofredwang mentioned this pull request Mar 7, 2026
12 tasks
@cicirori cicirori closed this Mar 20, 2026
@torchspec-bot torchspec-bot deleted the feature/lk-loss branch March 21, 2026 05:12
zhubohao911 pushed a commit to zhubohao911/TorchSpec that referenced this pull request Mar 22, 2026
- Results: Compute sub-breakdown, 200-step stability, optimization tests
- Issues: torchspec-project#27 torch.compile recompilation, torchspec-project#28 GPU Direct RDMA, torchspec-project#29 Mooncake bypass
- Pending work: Updated completed items, active training tasks
- Best config: no_sync + bf16 reduce → 2.7 step/s (+8%), ~3.9hr training

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zhubohao911 pushed a commit to zhubohao911/TorchSpec that referenced this pull request Mar 23, 2026
- Results: Compute sub-breakdown, 200-step stability, optimization tests
- Issues: torchspec-project#27 torch.compile recompilation, torchspec-project#28 GPU Direct RDMA, torchspec-project#29 Mooncake bypass
- Pending work: Updated completed items, active training tasks
- Best config: no_sync + bf16 reduce → 2.7 step/s (+8%), ~3.9hr training