Init kimi into utterance_metrics by Stanwang1210 · Pull Request #43 · wavlab-speech/versa

Stanwang1210 · 2025-06-21T09:03:45Z

This PR integrates the Kimi-Audio model into the evaluation pipeline.

The overall implementation structure follows the pattern established in qwen2_audio.py, ensuring consistency with existing audio model integrations. Key components include:
• Preprocessing and tokenization of audio inputs using Kimi-Audio’s API
• Compatibility with the existing evaluation framework
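As a rough illustration of the setup-plus-metric pattern described above, here is a minimal sketch. Every name in it (`StubAudioModel`, `audio_model_setup`, `audio_base_metric`, the `kimi_speaker` key) is hypothetical and chosen for illustration only; it does not reproduce the Kimi-Audio API or the actual contents of qwen2_audio.py, and a stub stands in for the real model so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StubAudioModel:
    """Hypothetical stand-in for a loaded audio-language model (not the Kimi-Audio API)."""

    sample_rate: int = 16000

    def generate(self, audio: List[float], prompt: str) -> str:
        # A real model would preprocess and tokenize the audio, then decode text.
        # The stub just echoes the prompt with the clip duration.
        duration = len(audio) / self.sample_rate
        return f"[{prompt}] {duration:.2f}s"


def audio_model_setup(sample_rate: int = 16000) -> StubAudioModel:
    """Load the model once so it can be reused across utterances (assumed pattern)."""
    return StubAudioModel(sample_rate=sample_rate)


def audio_base_metric(
    model: StubAudioModel,
    audio: List[float],
    fs: int,
    prompt: str,
    key: str,
) -> Dict[str, str]:
    """Run one prompt on one utterance and return a single-entry result dict."""
    if fs != model.sample_rate:
        # A real integration would likely resample instead of failing.
        raise ValueError(f"expected {model.sample_rate} Hz audio, got {fs} Hz")
    return {key: model.generate(audio, prompt)}


model = audio_model_setup()
one_second_of_silence = [0.0] * 16000
result = audio_base_metric(
    model, one_second_of_silence, 16000, "Describe the speaker.", "kimi_speaker"
)
print(result)
```

Keeping model loading separate from the per-utterance call means the model is loaded once and shared across all prompts, and returning a plain dict lets the results merge with those of other metrics.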

Let me know if any adjustments are needed, especially regarding tokenizer behavior or input formatting. I’m happy to iterate!

ftshijt · 2025-06-21T15:58:28Z

Many thanks! One minor suggestion: use the existing prompts, which should simplify the scripts considerably. I will do a more detailed review soon. Much appreciated!

Stanwang1210 · 2025-11-25T23:51:59Z

I’ve incorporated existing prompts so the scripts are now simpler. Please take a look at the current version.

Stanwang1210 force-pushed the main branch from e8cb4a1 to f96594d on June 24, 2025 09:18

init kimi

9201976

Stanwang1210 force-pushed the main branch from f96594d to 9201976 on November 25, 2025 22:32

fix kimi bug

c0bd62c

Stanwang1210 wants to merge 2 commits into wavlab-speech:main from Stanwang1210:main
