[Feature] Add LK loss (LK^α and LK^λ) for direct acceptance rate opti…#29

Closed
cicirori wants to merge 1 commit into main from feature/lk-loss

Conversation

@cicirori
Collaborator

@cicirori cicirori commented Mar 4, 2026

Implement LK losses from "LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding" (arXiv:2602.23881), which directly optimize the acceptance rate α and improve average acceptance length by 3-8% over Forward KL on EAGLE-3.

  • Add loss_type and lk_eta config fields to TrainingConfig
  • Add compiled_lk_alpha_loss and compiled_lk_lambda_loss (+ _from_hs variants)
  • Dispatch loss in Eagle3Model._calculate_loss based on loss_type
  • Return alpha metrics from forward pass and log in trainer
  • Add comprehensive tests for LK losses
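The paper itself isn't quoted in this PR, so as a rough illustration of what "directly optimizing the acceptance rate" means: in standard speculative decoding, the per-token probability that the target model accepts a draft token is α = Σ_v min(p_draft(v), p_target(v)), so a loss of the form 1 − α pushes the draft distribution toward higher acceptance. The sketch below is a hypothetical, dependency-free version of that idea; the function name `lk_alpha_loss` and the exact form of the LK^α objective (and the role of `lk_eta`) are assumptions, not the PR's actual implementation, which operates on batched logits under torch.compile.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def lk_alpha_loss(draft_logits, target_logits):
    """Hypothetical sketch of an acceptance-rate loss for one token position.

    alpha = sum_v min(p_draft(v), p_target(v)) is the probability the
    target model accepts the draft token under standard speculative
    sampling; minimizing (1 - alpha) maximizes acceptance.
    Returns (loss, alpha).
    """
    p = softmax(draft_logits)
    q = softmax(target_logits)
    alpha = sum(min(pi, qi) for pi, qi in zip(p, q))
    return 1.0 - alpha, alpha
```

When the draft and target distributions coincide, α = 1 and the loss vanishes; when they concentrate on disjoint tokens, α approaches 0 and the loss approaches 1. A trainable version would compute this over batched tensors with gradients flowing through the draft logits only.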

[Feature] Add LK loss (LK^α and LK^λ) for direct acceptance rate optimization

Implement LK losses from "LK Losses: Direct Acceptance Rate Optimization
for Speculative Decoding" (arXiv:2602.23881), which directly optimize
the acceptance rate α and improve average acceptance length by 3-8% over
Forward KL on EAGLE-3.

- Add loss_type and lk_eta config fields to TrainingConfig
- Add compiled_lk_alpha_loss and compiled_lk_lambda_loss (+ _from_hs variants)
- Dispatch loss in Eagle3Model._calculate_loss based on loss_type
- Return alpha metrics from forward pass and log in trainer
- Add comprehensive tests for LK losses
@yubofredwang yubofredwang mentioned this pull request Mar 7, 2026
12 tasks
@cicirori cicirori closed this Mar 20, 2026
@torchspec-bot torchspec-bot deleted the feature/lk-loss branch March 21, 2026 05:12
zhubohao911 pushed a commit to zhubohao911/TorchSpec that referenced this pull request Mar 22, 2026
- Results: Compute sub-breakdown, 200-step stability, optimization tests
- Issues: torchspec-project#27 torch.compile recompilation, torchspec-project#28 GPU Direct RDMA, torchspec-project#29 Mooncake bypass
- Pending work: Updated completed items, active training tasks
- Best config: no_sync + bf16 reduce → 2.7 step/s (+8%), ~3.9hr training

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
zhubohao911 pushed a commit to zhubohao911/TorchSpec that referenced this pull request Mar 23, 2026
- Results: Compute sub-breakdown, 200-step stability, optimization tests
- Issues: torchspec-project#27 torch.compile recompilation, torchspec-project#28 GPU Direct RDMA, torchspec-project#29 Mooncake bypass
- Pending work: Updated completed items, active training tasks
- Best config: no_sync + bf16 reduce → 2.7 step/s (+8%), ~3.9hr training