Prompt-to-Audio-Generative-AI-Sound-Creator

Generate realistic sound effects from simple text prompts using Stable Audio, Hugging Face Diffusers, and Gradio.

Prompt2Audio is a Generative AI project that converts natural-language descriptions into audio clips. Using a diffusion-based audio model, it creates sound effects such as rain, hammer strikes, and other environmental sounds from a single text input.

🚀 Project Overview

Prompt2Audio demonstrates how Generative AI can turn text into audio using the Stable Audio Open 1.0 diffusion model.

Users simply enter a description like:

"Sound of rain falling on a metal roof during a storm."

The model generates a realistic audio waveform based on the prompt.

The project also includes an interactive Gradio interface, allowing users to easily experiment with different sound prompts and durations.

🛠 Tech Stack

Python
PyTorch
Hugging Face Diffusers
Stable Audio Open 1.0
Gradio
Google Colab (GPU)
SoundFile

⚙️ How It Works

Mount Google Drive in Google Colab
Authenticate with Hugging Face
Install required dependencies
Load the StableAudioPipeline model
Provide a text prompt describing a sound
Generate an audio waveform with the diffusion model
Save the generated output as a .wav file
Play the audio using the Gradio interface
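
The load, generate, and save steps above could be sketched roughly as follows. This is a minimal sketch, not the project's actual script: the helper names (`build_call_kwargs`, `load_pipeline`, `generate_wav`), the 200-step default, and the output path are assumptions made for illustration. The heavy imports are kept inside the functions so the argument helper can be used without a GPU.

```python
MODEL_ID = "stabilityai/stable-audio-open-1.0"  # Hugging Face Hub id for Stable Audio Open 1.0


def build_call_kwargs(prompt, negative_prompt="Low quality", seconds=10.0, steps=200):
    """Collect the keyword arguments passed to the pipeline call (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "audio_end_in_s": float(seconds),   # clip length in seconds
        "num_inference_steps": int(steps),  # assumed value; more steps is slower
    }


def load_pipeline(hf_token=None):
    """Authenticate (optional) and load StableAudioPipeline on the GPU."""
    import torch
    from diffusers import StableAudioPipeline
    from huggingface_hub import login

    if hf_token:
        login(token=hf_token)  # the model is gated, so a token is usually required
    pipe = StableAudioPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    return pipe.to("cuda")


def generate_wav(pipe, out_path, **call_kwargs):
    """Run the diffusion model and write the first waveform to a .wav file."""
    import soundfile as sf

    audio = pipe(**call_kwargs).audios           # tensor shaped (batch, channels, samples)
    waveform = audio[0].T.float().cpu().numpy()  # soundfile expects (samples, channels)
    sf.write(out_path, waveform, pipe.vae.sampling_rate)
    return out_path


# Example use on a GPU runtime (not executed here):
# pipe = load_pipeline(hf_token="...")
# kwargs = build_call_kwargs("Sound of rain falling on a metal roof during a storm.")
# generate_wav(pipe, "rain.wav", **kwargs)
```

In Colab, mounting Google Drive and installing the dependencies would come before these calls, and `out_path` could point to a folder on the mounted drive.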

🎧 Example Prompt

Prompt:
"The sound of a hammer hitting a wooden surface."

Negative Prompt:
"Low quality"

Output → A 10-second realistic hammer sound effect.

▶️ Run the Application

Start the Gradio interface:

python app.py

This launches a local web interface where you can generate sounds from text prompts.

🎚 Features

Generate realistic audio from text
Adjustable audio duration (1–20 seconds)
Negative prompts for better output control
Interactive web interface using Gradio
GPU-accelerated inference
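
As an illustration of the duration limit and GPU usage listed above, the following sketch shows one way to keep a requested length within 1 to 20 seconds and to choose a device. The function names and the CPU fallback are assumptions for this example; the project itself may handle both differently.

```python
MIN_SECONDS = 1.0
MAX_SECONDS = 20.0


def clamp_duration(seconds):
    """Keep the requested clip length inside the 1-20 second range."""
    return max(MIN_SECONDS, min(MAX_SECONDS, float(seconds)))


def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # assumed fallback when PyTorch is not installed
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The clamped value would then be passed as the clip length to the pipeline call, and the chosen device to the pipeline's `.to(...)` method.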

📸 Demo Interface

Users can:

Enter a sound description
Add a negative prompt
Adjust audio duration
Generate and listen to the sound instantly
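
An interface with these controls could be wired up in Gradio roughly as shown below. This is a sketch under assumptions: `make_handler`, `build_demo`, the labels, and the default values are illustrative, and `generate_fn` stands for any function that takes a prompt, a negative prompt, and a duration and returns the path of a generated .wav file.

```python
def make_handler(generate_fn):
    """Wrap a generator function so the UI rejects empty prompts."""
    def handler(prompt, negative_prompt, seconds):
        if not prompt or not prompt.strip():
            raise ValueError("Please enter a sound description.")
        # generate_fn is expected to return the path of a .wav file
        return generate_fn(prompt.strip(), negative_prompt or "", float(seconds))
    return handler


def build_demo(generate_fn):
    """Build a Gradio interface with the three inputs described above."""
    import gradio as gr  # imported lazily so make_handler works without Gradio

    return gr.Interface(
        fn=make_handler(generate_fn),
        inputs=[
            gr.Textbox(label="Sound description"),
            gr.Textbox(label="Negative prompt", value="Low quality"),
            gr.Slider(minimum=1, maximum=20, value=10, step=1, label="Duration (seconds)"),
        ],
        outputs=gr.Audio(label="Generated sound", type="filepath"),
        title="Prompt2Audio",
    )
```

Calling `build_demo(my_generate_fn).launch()` starts the web interface; in Colab, `launch(share=True)` provides a public link instead of a local address.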

📚 Key Learnings

Diffusion models for audio generation
Using Hugging Face pipelines
Building AI interfaces with Gradio
Running generative models on GPU

🤝 Contributing

Contributions are welcome!

If you'd like to improve this project:

Fork the repository
Create a new branch
Submit a pull request

👨‍💻 Author

AKSHITHA HIRAKARI

AI / Machine Learning enthusiast, passionate about building Generative AI applications.
