Checks
Environment Details
ubuntu 22.04, Python 3.10, torch 2.4.0+cu124
Steps to Reproduce
What values are recommended for max_chars / few_chars / min_chars? Text chunking is performed via the chunk_text method.
In my Chinese text tests, fixing all three parameters at 60 occasionally drops characters at the beginning of chunks.
If I instead compute them with the default logic in socket_server.py, a few garbled (seemingly random) characters appear at the end of chunks:
ref_text_byte_len = len(self.ref_text.encode("utf-8"))
self.max_chars = int(ref_text_byte_len / (ref_audio_duration) * (25 - ref_audio_duration))
self.few_chars = int(ref_text_byte_len / (ref_audio_duration) * (25 - ref_audio_duration) / 2)
self.min_chars = int(ref_text_byte_len / (ref_audio_duration) * (25 - ref_audio_duration) / 4)
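For context, a worked example of the formula above (the reference text and duration here are hypothetical, not from the repo). Note that Chinese characters encode to 3 bytes each in UTF-8, so the byte length is roughly triple the character count, which inflates all three thresholds for CJK text:

```python
# Hypothetical worked example of the default parameter logic quoted above.
# Chinese characters are 3 bytes each in UTF-8, so byte length ≈ 3x char count.
ref_text = "这是一段用于克隆音色的参考文本"  # 15 characters
ref_audio_duration = 5.0  # seconds (assumed value)

ref_text_byte_len = len(ref_text.encode("utf-8"))  # 15 chars * 3 bytes = 45
max_chars = int(ref_text_byte_len / ref_audio_duration * (25 - ref_audio_duration))
few_chars = int(ref_text_byte_len / ref_audio_duration * (25 - ref_audio_duration) / 2)
min_chars = int(ref_text_byte_len / ref_audio_duration * (25 - ref_audio_duration) / 4)
print(max_chars, few_chars, min_chars)  # 180 90 45
```

With these values the chunker would allow chunks up to 180 "chars", but since the limit was derived from bytes while chunks are measured in characters, Chinese chunks end up about three times longer than intended.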
Additionally: when few_chars and min_chars in socket_server.py are applied consecutively to the first sentence, could that cause the earlier chunks to be split into overly small pieces?
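As a point of comparison, here is a minimal sketch of a splitter that cuts only at sentence punctuation and counts characters rather than bytes — slicing a UTF-8 byte string mid-character is one common source of garbled trailing characters like those described above. This is an illustrative assumption, not the repo's actual chunk_text:

```python
import re

def chunk_by_punctuation(text: str, max_chars: int) -> list[str]:
    """Split text at CJK/ASCII sentence punctuation, merging pieces until
    adding another would exceed max_chars. Lengths are counted in characters,
    not bytes, so a multibyte UTF-8 character is never cut in half."""
    # Zero-width lookbehind keeps the punctuation attached to its sentence.
    pieces = re.split(r"(?<=[。！？；.!?;])", text)
    chunks, current = [], ""
    for piece in pieces:
        if current and len(current) + len(piece) > max_chars:
            chunks.append(current)
            current = piece
        else:
            current += piece
    if current:
        chunks.append(current)
    return chunks

print(chunk_by_punctuation("你好。今天天气很好！我们去公园吧。", 8))
# → ['你好。', '今天天气很好！', '我们去公园吧。']
```

Because it never emits a piece shorter than a full sentence unless the limit forces it, this style of splitter also avoids the degenerate tiny first chunks asked about above.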
Thank you!
✔️ Expected Behavior
No response
❌ Actual Behavior
No response