[Feature] Add MiniMax support and Kimi parser updates (#45)
Conversation
Enable Minimax chat formatting, parsing, draft-model config, and parser-focused tests so M2.5 data and training flows work end to end. Include the related Kimi parser/template coverage updates and align the Kimi Eagle3 draft config with the intended KV-head setting, while keeping checkpoint export dtype control plus runtime env passthrough for FP8-compatible serving.
…ning Add dataset.shuffle_dataset (default True) so users can disable automatic dataset shuffling when the training data is intentionally ordered (e.g. curriculum learning, staged difficulty). The flag is threaded through both the offline preprocessing path and the online training-controller epoch reload.
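The described behavior can be sketched as a small helper that only shuffles when the flag is set. This is a hypothetical illustration of the threading described above, not the actual torchspec code; the function name and signature are assumptions:

```python
import random

def prepare_dataset(samples, shuffle_dataset=True, seed=None):
    """Optionally shuffle training samples.

    Hypothetical sketch mirroring the shuffle_dataset flag: when the flag is
    False, the intentional ordering (e.g. curriculum) is preserved verbatim.
    """
    out = list(samples)
    if shuffle_dataset:
        random.Random(seed).shuffle(out)  # deterministic when a seed is given
    return out
```

Both the offline preprocessing path and the per-epoch reload in the training controller would call through a helper like this, so one config flag controls both.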
Test-only concern: the HF model paths were only used by test_loss_mask_cross_validation to load tokenizers for validation. Keep ChatTemplate focused on chat format metadata.
Pull request overview
Adds end-to-end support for MiniMax-M2.5 chat formatting/parsing alongside Kimi-K2.5 parser/template updates, enabling Minimax/Kimi data flows to work through preprocessing, training, and conversion/serving utilities.
Changes:
- Introduce the `minimax-m2` chat template + `MiniMaxParser`, with comprehensive unit tests and multimodal/tool-call handling.
- Extend Kimi-K2.5 formatting to support `expand_media_tokens=False` passthrough behavior and add focused tests.
- Add dataset shuffle control (`shuffle_dataset`) and `--dtype` output casting support to the HF conversion tool (plus env passthrough for SGLang VLM cache sizing).
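If the flag is exposed via `DatasetConfig` as described, disabling shuffling might look like this in YAML (field names assumed from the PR summary, not verified against `train_config.py`):

```yaml
dataset:
  shuffle_dataset: false   # preserve intentional sample ordering (e.g. curriculum)
```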
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| torchspec/utils/env.py | Forward SGLANG_VLM_CACHE_SIZE_MB to Ray actors for serving/runtime configuration. |
| torchspec/data/template.py | Add reference_model metadata to templates and register the new minimax-m2 template. |
| torchspec/data/preprocessing.py | Make dataset shuffling conditional on shuffle_seed being set. |
| torchspec/data/parse.py | Add MiniMaxParser; update thinking detection; add media-token passthrough option to Kimi parser. |
| torchspec/controller/training_controller.py | Add optional deterministic shuffling toggle via shuffle_dataset and rename dataset prep helper. |
| torchspec/config/train_config.py | Add shuffle_dataset to DatasetConfig so it can be configured via YAML. |
| tools/convert_to_hf.py | Add --dtype to control output weight dtype during HF conversion. |
| tests/test_minimax_parser.py | New unit tests covering MiniMax formatting/parsing, tools, thinking, multimodal, truncation, passthrough. |
| tests/test_kimi_k25_parser.py | Add tests for expand_media_tokens=False; remove real-tokenizer integration tests. |
| configs/draft_models/minimax_m25_eagle3.json | Add draft-model config for MiniMax M2.5 Eagle3. |
| configs/draft_models/kimi_k25_eagle3.json | Update Kimi K2.5 Eagle3 KV-head setting. |
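The env passthrough row above (`torchspec/utils/env.py`) describes forwarding `SGLANG_VLM_CACHE_SIZE_MB` to Ray actors. A minimal sketch of that pattern, assuming a helper that builds a Ray `runtime_env` dict (the helper name and the set of forwarded variables are assumptions; only the env var name comes from the file table):

```python
import os

# Env var name taken from the PR file table; the forwarding helper is a sketch.
FORWARDED_ENV_VARS = ("SGLANG_VLM_CACHE_SIZE_MB",)

def ray_runtime_env():
    """Build a Ray runtime_env dict that forwards selected host env vars to actors."""
    env_vars = {k: os.environ[k] for k in FORWARDED_ENV_VARS if k in os.environ}
    return {"env_vars": env_vars}
```

The resulting dict would be passed as `runtime_env=...` when creating actors, so serving-side cache sizing set on the driver host reaches the workers.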
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1c63185234
json.loads on string arguments crashed on malformed or plain-text payloads, aborting formatting for the whole job. Catch JSONDecodeError and preserve the raw string as a fallback.
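The suggested fix can be sketched as follows; the function name is hypothetical, but the pattern (catch `json.JSONDecodeError`, fall back to the raw string) matches the comment above:

```python
import json

def parse_tool_arguments(raw):
    """Parse tool-call arguments, preserving plain-text payloads on malformed JSON.

    Hypothetical sketch of the fallback described in the review comment, so a
    single bad payload no longer aborts formatting for the whole job.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw  # keep the raw string rather than crashing
```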
…-prune-vocab The prune-vocab path wrote raw_config from disk without updating torch_dtype, causing exported weights and config metadata to diverge when --dtype was specified. Update the torch_dtype field before writing the config so the metadata matches the exported weights.
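A minimal sketch of keeping the config in sync, assuming the conversion tool serializes a plain dict to `config.json` (the helper name and signature are assumptions; the actual `convert_to_hf.py` code may differ):

```python
import json
import os

def write_hf_config(raw_config, out_dir, dtype_name=None):
    """Write config.json, syncing torch_dtype with any --dtype override.

    Hypothetical sketch: when --dtype is given, overwrite the torch_dtype field
    read from disk so config metadata matches the exported weight dtype.
    """
    cfg = dict(raw_config)
    if dtype_name is not None:
        cfg["torch_dtype"] = dtype_name  # keep metadata consistent with the cast weights
    path = os.path.join(out_dir, "config.json")
    with open(path, "w") as f:
        json.dump(cfg, f, indent=2)
    return path
```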