Add MTP speculative decoding support #1027

Open
DINOZYAVIER wants to merge 9 commits into utilityai:main from DINOZYAVIER:mtp-speculative-decoding
@DINOZYAVIER

DINOZYAVIER commented May 19, 2026

Summary

Adds Rust bindings for llama.cpp MTP speculative decoding.

Changes

  • Updates llama.cpp to ac7680
  • Adapts wrappers for moved common APIs
  • Exposes MTP context params (n_rs_seq, context type)
  • Adds MtpSpeculative safe wrapper
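
For readers new to speculative decoding, here is a minimal sketch of the greedy accept step that such a wrapper sits on top of. It is not the `MtpSpeculative` API from this PR: `accept_draft` and its slice-based signature are hypothetical, and it assumes greedy (argmax) verification rather than sampling-based acceptance.

```rust
/// Hypothetical helper, not part of llama-cpp-2.
///
/// `draft` holds the tokens proposed by the draft (MTP) head.
/// `target[i]` is the target model's greedy token at position `i`,
/// computed with `draft[..i]` already appended to the context, so
/// `target` has exactly one more entry than `draft`.
///
/// Returns the tokens to commit: the longest draft prefix the target
/// agrees with, followed by one token taken from the target itself.
fn accept_draft(draft: &[u32], target: &[u32]) -> Vec<u32> {
    assert_eq!(target.len(), draft.len() + 1);

    // Count leading positions where the draft matches the target.
    let accepted = draft
        .iter()
        .zip(target)
        .take_while(|(d, t)| d == t)
        .count();

    // Keep the agreed prefix, then add the target's own token: the
    // correction at the first mismatch, or the bonus token when the
    // whole draft was accepted.
    let mut out = draft[..accepted].to_vec();
    out.push(target[accepted]);
    out
}
```

With a draft of `[5, 7, 9]` and target choices `[5, 7, 2, 4]`, the first two draft tokens are kept and the target's `2` replaces the rejected `9`, so every verification pass commits at least one token.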

Verification

  • cargo fmt --check
  • git diff --check origin/main..HEAD
  • cargo test -p llama-cpp-2

Newer llama.cpp builds publish libllama-common and libllama-common-base instead of libcommon. Detect the produced common archives before emitting cargo link directives so full release builds can resolve the MTP wrapper dependencies.
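
The detection described in this commit message could look roughly like the sketch below. It is illustrative only, not the code in this PR: `common_link_directives` is a hypothetical helper, and it assumes Unix-style static archive names (`lib<name>.a`) in a single output directory.

```rust
use std::path::Path;

/// Archive names used by newer llama.cpp builds (from the commit message).
const NEW_COMMON_LIBS: [&str; 2] = ["llama-common", "llama-common-base"];
/// Archive name used by older llama.cpp builds.
const OLD_COMMON_LIB: &str = "common";

/// Hypothetical helper: return the cargo link directives for whichever
/// common archives actually exist in `lib_dir`.
fn common_link_directives(lib_dir: &Path) -> Vec<String> {
    // Assumption: Unix-style static archives named `lib<name>.a`.
    let exists = |name: &str| lib_dir.join(format!("lib{name}.a")).exists();

    // Prefer the new layout; fall back to the old single archive.
    let mut libs: Vec<&str> = NEW_COMMON_LIBS
        .iter()
        .copied()
        .filter(|n| exists(*n))
        .collect();
    if libs.is_empty() && exists(OLD_COMMON_LIB) {
        libs.push(OLD_COMMON_LIB);
    }

    libs.iter()
        .map(|n| format!("cargo:rustc-link-lib=static={n}"))
        .collect()
}
```

In a `build.rs`, each returned string would be printed with `println!` after the CMake build finishes, so a directive is only emitted for an archive that was really produced.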

DINOZYAVIER force-pushed the mtp-speculative-decoding branch from 2ade832 to be3eb5b May 19, 2026 23:16
DINOZYAVIER marked this pull request as ready for review May 19, 2026 23:23
@MarcusDunn
Contributor

The crate version is already correct; otherwise the changes are welcome. I'll merge once the version is fixed and CI is clean.

DINOZYAVIER

This comment was marked as off-topic.

@DINOZYAVIER
Author

DINOZYAVIER commented May 28, 2026

It looks like the workflow may have been cancelled due to a timeout. Before pushing, I built both targets that did not finish in CI: linux/amd64 completed fairly quickly, while arm64 took more than an hour.
