Question: Best practices for text padding to improve audio generation endings

Hi! First off, thanks for creating KittenTTS - it's been great for adding audio feedback to my projects.

I've noticed that audio generation sometimes cuts off abruptly at the end of messages, making the speech sound unnatural or incomplete. I've been experimenting with adding padding to the text to get smoother endings.

What I've tried

Currently, I'm appending "....." (five dots) to the end of messages. This seems to help the audio render fully and gives a more natural trailing-off, but it sometimes introduces artifacts. I'm also using the same pattern within messages for better speech cadence.

Questions

1. Is there a recommended approach for padding text to ensure clean audio endings?
2. Are there specific characters or patterns that work better than others for this purpose?

Example

// Current approach
speak({ text: "Task completed successfully....." })

// vs without padding
speak({ text: "Task completed successfully" })  // Sometimes cuts off abruptly

The padded version seems to give the TTS engine something to "decay" into, but I'm wondering if there's a more elegant solution or if I'm approaching this the wrong way.
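
For reference, here is a minimal sketch of the padding step as a reusable helper, so the workaround lives in one place. The name `padForTts` and its default pad are purely illustrative and not part of KittenTTS; the five-dot default simply mirrors what I'm doing by hand, and `speak` is the same wrapper as in the example.

```javascript
// Hypothetical helper (not a KittenTTS API): append a trailing pad so the
// engine has something to "decay" into at the end of the utterance.
function padForTts(text, pad = ".....") {
  // Strip trailing whitespace and any existing run of dots so that
  // already-padded or period-terminated text is not padded twice.
  const trimmed = text.replace(/[\s.]+$/, "");
  // Nothing to speak: return an empty string rather than a bare pad.
  if (trimmed === "") {
    return "";
  }
  return trimmed + pad;
}

// Intended usage with the existing wrapper:
// speak({ text: padForTts("Task completed successfully") })
```

Because the helper normalizes the ending first, calling it on text that is already padded gives the same result as calling it on the unpadded text.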

Any guidance would be appreciated! Happy to contribute to docs if there's a standardized approach you'd recommend.
