@majiayu000

Summary

  • Fixed incorrect Japanese language tag in example.py comment
  • Changed <|jp|> to <|ja|> to match the actual tokenizer implementation

Details

The comment at line 21 in example.py incorrectly documents the Japanese language tag as <|jp|>, but the actual LANGUAGES dict in cosyvoice/tokenizer/tokenizer.py defines it as "ja": "japanese" (line 19).

This mismatch can cause confusion and issues like #621 when users follow the documented tag.
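The mismatch can be illustrated with a minimal sketch. It assumes a small stand-in for the `LANGUAGES` dict (only the `"ja": "japanese"` entry is confirmed above; the other entries are illustrative), and `language_tag` is a hypothetical helper, not a function from the repository.

```python
# Stand-in for the LANGUAGES dict in cosyvoice/tokenizer/tokenizer.py.
# Only the "ja" entry is confirmed by this PR; the others are illustrative.
LANGUAGES = {
    "en": "english",
    "zh": "chinese",
    "ja": "japanese",
}


def language_tag(code: str) -> str:
    """Hypothetical helper: build a <|xx|> tag, rejecting unknown codes."""
    if code not in LANGUAGES:
        raise KeyError(f"unknown language code {code!r}")
    return f"<|{code}|>"


print(language_tag("ja"))  # the tag the tokenizer knows about

try:
    language_tag("jp")  # the tag the old comment documented
except KeyError as err:
    print("rejected:", err)
```

A user who copies `<|jp|>` from the comment ends up with a tag that has no matching key, which is the confusion this PR removes.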

Test Plan

  • Verified LANGUAGES dict in tokenizer.py uses ja for Japanese
  • No code changes, only comment fix
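The manual check above could also be automated. The following is a rough sketch, not part of this PR: `unknown_language_tags` and `KNOWN_CODES` are hypothetical names, the code set is an assumed subset of the real `LANGUAGES` keys, and the pattern only matches two-letter tags, so longer codes and non-language special tokens are ignored.

```python
import re

# Assumed subset of the keys in LANGUAGES; in practice this would be
# taken from cosyvoice/tokenizer/tokenizer.py rather than hard-coded.
KNOWN_CODES = {"zh", "en", "ja"}

# Matches two-letter tags of the form <|xx|>.
TAG_PATTERN = re.compile(r"<\|([a-z]{2})\|>")


def unknown_language_tags(text: str, known=KNOWN_CODES) -> list:
    """Return the sorted two-letter tag codes in text that are not known."""
    return sorted({code for code in TAG_PATTERN.findall(text) if code not in known})


# Made-up comment line for illustration, not the actual text of example.py.
sample = "# supported tags: <|zh|><|en|><|jp|>"
print(unknown_language_tags(sample))
```

Running a check like this over the example scripts would flag a documented tag that the tokenizer does not define.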

Fixes #1683

Changed <|jp|> to <|ja|> to match the actual tokenizer implementation.

The LANGUAGES dict in cosyvoice/tokenizer/tokenizer.py defines 'ja' for Japanese,
not 'jp'. This fixes the misleading comment that could cause issues like FunAudioLLM#621.

Fixes FunAudioLLM#1683

Development

Successfully merging this pull request may close these issues.

The Japanese language tag is <|ja|>, not <|jp|>, which does not match the comment in the example
