Init kimi into utterance_metrics #43

Open
Stanwang1210 wants to merge 2 commits into wavlab-speech:main from Stanwang1210:main

@Stanwang1210

This PR integrates the Kimi-Audio model into the evaluation pipeline.

The overall implementation structure follows the pattern established in qwen2_audio.py, ensuring consistency with existing audio model integrations. Key components include:
• Preprocessing and tokenization of audio inputs using Kimi-Audio’s API
• Compatibility with the existing evaluation framework

Let me know if any adjustments are needed, especially regarding tokenizer behavior or input formatting. I’m happy to iterate!

@ftshijt
Contributor

ftshijt commented Jun 21, 2025

Many thanks! One minor suggestion: use the existing prompts, which would simplify the scripts considerably. I will do a more detailed check soon. Much appreciated!

@Stanwang1210
Author

I’ve incorporated existing prompts so the scripts are now simpler. Please take a look at the current version.
