Niharika07-B/AWS-Text-2-Audio-Converter-Project


🎙️ Speak AI: A Text-to-Speech Web App using AWS ☁️

✨ Turn your words into voice. In real time. In any accent. Straight from the cloud.

Welcome to Speak AI, a serverless web app that literally talks back. Type a message, pick a voice, and within seconds hear it spoken aloud by Amazon Polly. No downloads. No installs. Just your browser, the cloud, and a bit of magic.


🌍 Live Demo

🎧 Try the app in action:
🔗 Click here to launch Speak AI

No installs. Just your browser, some text, and a voice from the cloud!


🧠 What is Speak AI?

Speak AI is more than just a text-to-speech app: it's a cloud-native, fully serverless voice generation pipeline built on AWS.

Behind the scenes, it uses a suite of AWS services working in harmony:

  • 🔁 Asynchronous processing with SNS

  • 🧠 Stateless logic with Lambda

  • 📦 Durable storage via S3 and DynamoDB

  • 🎤 Realistic voices with Amazon Polly

  • 🌐 Public APIs through API Gateway


⚡ Quick Features

 ✅ Enter text in your browser

 ✅ Choose from multiple Amazon Polly voices

 ✅ Get an audio file generated in seconds

 ✅ Scalable to thousands of requests

 ✅ No backend servers, no EC2s, no containers

 ✅ 100% Serverless. 100% AWS.

🧬 Architecture at a Glance

[Figure: AWS architecture diagram (arc_diagram.png)]


📁 Project Structure

AWS-Text-2-Audio-Converter-Project/
│
├── serverless-web/
│   ├── index.html           # 🌐 Main Web UI
│   ├── styles.css           # 🎨 Frontend styles
│   ├── scripts.js           # ⚙️ Frontend logic & API calls
│   └── error.html           # 🚫 Error fallback page
│
├── add_new_posts.py         # 📤 Lambda: Text submit & SNS trigger
├── convert_text_to_audio.py # 🔊 Lambda: Polly text-to-speech
├── read_table_items.py      # 📥 Lambda: Read from DynamoDB
│
├── arc_diagram.png          # 🧭 Architecture visual
├── iam-policy.txt           # 🔐 IAM Roles & Permissions
├── s3-bucket-policy         # 📂 S3 policy for public access
└── README.md                # 📘 Project overview

🛠️ AWS Services Used

| AWS Service | Purpose |
|---|---|
| Amazon Polly | Converts text into realistic MP3 speech |
| AWS Lambda | Serverless compute (3 functions for different stages) |
| Amazon S3 | Stores MP3 files and optionally hosts frontend |
| DynamoDB | NoSQL storage for request data and audio URLs |
| SNS | Publishes events and triggers processing asynchronously |
| API Gateway | Connects frontend to Lambda securely via HTTP endpoints |

🚀 Getting Started

🔧 Prerequisites

  • AWS account with programmatic access

  • IAM roles set up with proper permissions

  • Bucket with public-read policy (see s3-bucket-policy)


🧱 Project Overview: AWS Text-to-Audio Converter

🎨 Frontend (Static Web)

Files: index.html, styles.css, scripts.js

  • 🖋️ Text Input: User submits text and selects a voice.
  • 🗂️ Voice Selector: Dropdown to choose a Polly voice.
  • 🎧 Result Display: Shows playback links for generated audio.
  • 📡 Communicates with backend via REST (API Gateway).

🔧 Backend (Serverless + AWS)

🛠️ API Gateway

  • Exposes secure RESTful endpoints for frontend interactions.
  • Triggers Lambda functions upon requests.

🧠 Lambda Functions

  • add_new_posts.py 📤
    ➤ Receives new text input
    ➤ Stores in DynamoDB
    ➤ Publishes event to SNS

  • convert_text_to_audio.py 🔊
    ➤ Triggered by SNS
    ➤ Reads text/voice from DynamoDB
    ➤ Converts text to audio using Amazon Polly
    ➤ Uploads .mp3 to S3
    ➤ Updates DynamoDB with the audio URL & status

  • read_table_items.py 📥
    ➤ Reads all entries from DynamoDB
    ➤ Returns data to frontend for listing
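
The submit and read steps listed above could look roughly like this. It is a minimal sketch, not the code in add_new_posts.py or read_table_items.py: the `id` key, the `PROCESSING` status value, and the function names are assumptions for illustration. The table and SNS client are passed in so the logic can be exercised without AWS; inside Lambda they would typically come from `boto3.resource("dynamodb").Table(...)` and `boto3.client("sns")`.

```python
import uuid


def add_new_post(text, voice, table, sns, topic_arn):
    """Store a new request and announce it on SNS; returns the record id."""
    record_id = str(uuid.uuid4())
    # Field names follow the README (text, voice, status); "id" is an assumed key.
    table.put_item(Item={
        "id": record_id,
        "text": text,
        "voice": voice,
        "status": "PROCESSING",  # assumed status value
    })
    # Only the id travels over SNS; the converter looks the record up itself.
    sns.publish(TopicArn=topic_arn, Message=record_id)
    return record_id


def read_all_items(table):
    """Scan the whole table, following pagination until no pages remain."""
    items = []
    kwargs = {}
    while True:
        page = table.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # DynamoDB returns results in pages; continue from where the last one ended.
        kwargs["ExclusiveStartKey"] = last_key
```

Publishing only the record id keeps the SNS message small and leaves DynamoDB as the single source of truth for the text and voice.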


🗄️ Data & Event Infrastructure

  • 🧾 DynamoDB

    • Stores: text, voice, status, audio_url
    • Used for both queueing and retrieving data
  • 📬 SNS (Simple Notification Service)

    • Decouples text submission from audio processing
    • Enables async invocation of audio conversion
  • 🗣️ Amazon Polly

    • Converts submitted text into lifelike speech
    • Supports multiple languages and voices
  • 🗃️ Amazon S3

    • Stores generated .mp3 files
    • Publicly accessible URLs for playback in frontend
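
To show how these four services meet in the conversion step, here is a hedged sketch of what convert_text_to_audio.py might do. It is not the repository's code: the `id` key, the `UPDATED` status value, the object key layout, and the function names are assumptions. The clients are injected; in Lambda they would usually be created with boto3.

```python
def record_id_from_sns(event):
    """Pull the message body out of an SNS-triggered Lambda event."""
    return event["Records"][0]["Sns"]["Message"]


def convert_post(record_id, table, polly, s3, bucket):
    """Synthesize the stored text with Polly and record where the MP3 landed."""
    item = table.get_item(Key={"id": record_id})["Item"]

    response = polly.synthesize_speech(
        Text=item["text"],
        VoiceId=item["voice"],
        OutputFormat="mp3",
    )
    audio_bytes = response["AudioStream"].read()

    key = f"{record_id}.mp3"  # assumed naming scheme
    s3.put_object(
        Bucket=bucket, Key=key, Body=audio_bytes, ContentType="audio/mpeg"
    )

    url = f"https://{bucket}.s3.amazonaws.com/{key}"
    # "status" is a DynamoDB reserved word, so it is referenced through an alias.
    table.update_item(
        Key={"id": record_id},
        UpdateExpression="SET #s = :status, audio_url = :url",
        ExpressionAttributeNames={"#s": "status"},
        ExpressionAttributeValues={":status": "UPDATED", ":url": url},
    )
    return url
```

A Lambda handler would then be little more than `convert_post(record_id_from_sns(event), table, polly, s3, bucket_name)`.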

📸 Project Screenshots

[Two screenshots; images not included here]


📝 Notes to remember

Region:
🔹 Make sure all AWS services (Lambda, DynamoDB, S3, SNS, Polly) are deployed in the us-east-1 region for seamless integration.

Purpose:
🔹 This application is intended for educational and demonstration purposes. It is not production-hardened.


⚠️ Before Public Deployment, Consider These Improvements:

1️⃣ Input Validation

  • 🔸 Sanitize text input to prevent injection attacks or invalid data.
  • 🔸 Set character limits and allow only supported languages.
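
One way to apply both points before anything reaches DynamoDB or Polly is sketched below. The 3000-character cap and the voice allowlist are assumptions chosen for illustration (check the current Polly limits and the voices your frontend actually offers), and `validate_request` is a hypothetical helper, not part of the repository.

```python
MAX_CHARS = 3000  # assumed cap; tune to the Polly limits you rely on
ALLOWED_VOICES = {"Joanna", "Matthew", "Amy", "Brian"}  # example allowlist


def validate_request(text, voice):
    """Return the cleaned text, or raise ValueError describing the problem."""
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    # Drop control characters (newlines and tabs are kept), then trim whitespace.
    cleaned = "".join(
        ch for ch in text if ch.isprintable() or ch in "\n\t"
    ).strip()
    if not cleaned:
        raise ValueError("text is empty")
    if len(cleaned) > MAX_CHARS:
        raise ValueError(f"text exceeds {MAX_CHARS} characters")
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"unsupported voice: {voice!r}")
    return cleaned
```

Restricting voices to a known set also limits the languages the app will accept, since each Polly voice is tied to a language.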

2️⃣ Error Handling

  • 🔸 Add try/except blocks in all Lambda functions.
  • 🔸 Show clear frontend error messages (e.g., "Audio failed to generate").
  • 🔸 Implement retry logic for transient AWS service errors.
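
The retry point can be illustrated with a small helper that retries an operation with exponential backoff and jitter. This is a generic sketch under assumptions: `call_with_retries` and the `is_transient` predicate are hypothetical names, and deciding which exceptions count as transient is left to the caller. boto3 also has built-in retry behaviour that can be configured, so a helper like this is only needed for cases it does not cover.

```python
import random
import time


def call_with_retries(operation, is_transient, attempts=3, base_delay=0.5,
                      sleep=time.sleep):
    """Run operation(); retry transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            last_try = attempt == attempts - 1
            # Give up on the final attempt or on errors that will not go away.
            if last_try or not is_transient(exc):
                raise
            # Delay doubles each attempt, plus random jitter to spread retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

For example, a Polly call could be wrapped as `call_with_retries(lambda: polly.synthesize_speech(...), is_transient=my_check)`, where `my_check` is whatever test you choose for throttling or timeout errors.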

3️⃣ Security Best Practices

  • 🔐 IAM Roles

    • Use least privilege access policies.
    • Avoid AdministratorAccess on deployed functions.
    • Define resource-specific ARNs in permissions.
  • 🔐 S3 & DynamoDB Access

    • Do not expose DynamoDB or sensitive S3 data publicly.
    • Use signed URLs if public access is needed.
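
As a sketch of the signed-URL suggestion: with a presigned URL the bucket can stay private, and each playback link expires on its own. `presigned_audio_url` is a hypothetical helper and the one-hour expiry is an assumption; `generate_presigned_url` is the boto3 S3 client method it relies on, and the client would normally be `boto3.client("s3")`.

```python
def presigned_audio_url(s3_client, bucket, key, expires_in=3600):
    """Return a time-limited GET link for one stored MP3."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds until the link stops working
    )
```

If links are generated when the frontend lists items rather than stored in DynamoDB, they are always fresh when the user clicks play.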

4️⃣ Monitoring & Alerts

  • 📊 Enable CloudWatch Logs for all Lambda functions.
  • ⏰ Create CloudWatch alarms for function errors or high latency.
  • 💰 Set up Billing Alerts to track AWS usage and avoid surprise costs.
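
An error alarm for one of the Lambda functions might be defined along these lines. The helper only builds the arguments; the alarm name, the 5-minute period, and the threshold of one error are assumptions to adjust. `lambda_error_alarm` is a hypothetical name.

```python
def lambda_error_alarm(function_name, sns_topic_arn=None):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    alarm = {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 300,  # seconds per evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # No invocations means no data points; do not treat that as a failure.
        "TreatMissingData": "notBreaching",
    }
    if sns_topic_arn:
        # Optional: notify a topic when the alarm fires.
        alarm["AlarmActions"] = [sns_topic_arn]
    return alarm
```

With boto3 available, the alarm could be created with `boto3.client("cloudwatch").put_metric_alarm(**lambda_error_alarm("convert_text_to_audio"))`, assuming the function is deployed under that name.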

Show your support

Give a ⭐ if you like this!

