Streaming TTS with Piper

The official Piper repo currently splits the input text into sentences using phonemize().
It generates the audio for a whole sentence at once and then yields the audio for each sentence, one after the other.

So, if the first sentence in the input text is long, it will take some time before any audio is generated.
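To make the concern concrete, here is a rough sketch of the sentence-by-sentence pattern described above. It is not Piper's actual code: `split_sentences`, `stream_by_sentence`, and `fake_synthesize` are hypothetical names, and the regex splitter is only a stand-in for whatever the phonemizer does. The point is just that the first chunk cannot be yielded until the whole first sentence has been synthesized.

```python
import re
from typing import Callable, Iterator, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter for illustration only (assumption, not Piper's logic):
    # break after ., ! or ? when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def stream_by_sentence(text: str, synthesize: Callable[[str], bytes]) -> Iterator[bytes]:
    # One audio chunk per sentence: nothing is yielded for a sentence
    # until synthesize() has finished the whole sentence.
    for sentence in split_sentences(text):
        yield synthesize(sentence)


def fake_synthesize(sentence: str) -> bytes:
    # Hypothetical stand-in for a real TTS call:
    # one byte of silent "audio" per input character.
    return b"\x00" * len(sentence)


text = "This first sentence is deliberately long so that it dominates the wait. Short one."
chunks = list(stream_by_sentence(text, fake_synthesize))
print(len(chunks), [len(c) for c in chunks])
```

With a real synthesizer, the time to first audio would scale with the length of the first sentence, which is the latency I am asking about.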

Does this repo behave the same way if I select piperEngine? Or does it somehow generate audio word by word instead of sentence by sentence? If it produces audio word by word, latency will be much lower.

General question: Is there an open-source TTS system that does both of the following?
1. Generates speech word by word, or at a sub-sentence level, so that latency is lower
2. Has a script for fine-tuning the pre-trained model with our custom data
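As a possible workaround for point 1, the text could be pre-split into short fragments before it reaches any sentence-level engine. The sketch below is only an assumption about how that might look: `split_into_fragments` and `max_words` are hypothetical names, not part of Piper or this repo, and cutting inside a sentence will probably hurt prosody since each fragment is synthesized without its surrounding context.

```python
import re
from typing import List


def split_into_fragments(text: str, max_words: int = 8) -> List[str]:
    fragments: List[str] = []
    # First break at sentence and clause punctuation followed by whitespace.
    for clause in re.split(r"(?<=[.!?,;:])\s+", text.strip()):
        words = clause.split()
        # Then cap each fragment at max_words words.
        for i in range(0, len(words), max_words):
            fragments.append(" ".join(words[i:i + max_words]))
    return fragments


fragments = split_into_fragments(
    "If the first sentence is long, the listener waits. Short fragments start sooner.",
    max_words=4,
)
for fragment in fragments:
    print(fragment)
```

Each fragment could then be passed to the engine as if it were its own sentence, so the first audio would be ready after at most `max_words` words instead of a full sentence.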

streaming tts with piper #339
