    Repositories list

    • voicebox

      Public
      The open-source voice synthesis studio
      TypeScript
      MIT License
      2.9k000 Updated Apr 18, 2026
    • VoiceStudio: A unified toolkit for text-style prompted speech synthesis, voice adaptation, and editing
      Jupyter Notebook
      Apache License 2.0
      1200 Updated Mar 29, 2026
    • Inference and training library for high-quality TTS models.
      Python
      Apache License 2.0
      587000 Updated Mar 27, 2026
    • Qwen3-TTS

      Public
      Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, fr…
      Python
      Apache License 2.0
      1.5k000 Updated Feb 12, 2026
    • CosyVoice

      Public
      Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
      Python
      Apache License 2.0
      2.4k000 Updated Jan 24, 2026
    • Text-audio foundation model from Boson AI
      Python
      Apache License 2.0
      619000 Updated Jan 24, 2026
    • Object-oriented handling of audio data, with GPU-powered augmentations, and more.
      Python
      MIT License
      77100 Updated Jan 24, 2026
    • Chroma

      Public
      World's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.
      Jupyter Notebook
      Apache License 2.0
      60000 Updated Jan 23, 2026
    • Spark-TTS

      Public
      Spark-TTS Inference Code
      Python
      Apache License 2.0
      1.2k000 Updated Jan 23, 2026
    • PersonaPlex code.
      Python
      MIT License
      1.4k000 Updated Jan 20, 2026
    • dia

      Public
      A TTS model capable of generating ultra-realistic dialogue in one pass.
      Python
      Apache License 2.0
      1.7k000 Updated Jan 11, 2026
    • F5-TTS

      Public
      Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
      Python
      MIT License
      2.1k000 Updated Jan 11, 2026
    • A PyTorch-based Speech Toolkit
      Python
      Apache License 2.0
      1.7k000 Updated Jan 11, 2026
    • vocos

      Public
      Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
      Python
      MIT License
      129000 Updated Jan 8, 2026
    • [NeurIPS '25] Benchmark for evaluating TTS models on complex prosodic, expressiveness, and linguistic challenges.
      Python
      Apache License 2.0
      15000 Updated Dec 9, 2025
    • CausVid

      Public
      (CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
      Python
      Other
      80000 Updated Aug 7, 2025
    • Homepage of LatentForge
      CSS
      0000 Updated Jul 21, 2025
    • Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along wit…
      Jupyter Notebook
      MIT License
      2.6k000 Updated Mar 13, 2025
    • VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling
      Python
      MIT License
      7000 Updated Nov 9, 2024
    • PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions
      Python
      Apache License 2.0
      7000 Updated Oct 11, 2024
    • CLAP

      Public
      Learning audio concepts from natural language supervision
      Python
      MIT License
      46000 Updated Sep 18, 2024
    • ImageBind

      Public
      ImageBind: One Embedding Space to Bind Them All
      Python
      Other
      846000 Updated Jul 31, 2024
    • LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning
      3000 Updated Jun 13, 2024