Skip to content

logesh4v/voice-assistant

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

1 Commit
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐ŸŽค Alexa-like Voice Assistant

A sophisticated conversational AI voice assistant built with AWS Strands, Nova Premiere, and multi-agent guardrails. Experience natural voice conversations with enterprise-grade safety validation.

โœจ Features

  • ๐ŸŽค Voice & Text Input: Seamless speech recognition and text input
  • ๐Ÿ”Š Text-to-Speech: Natural voice responses with speech synthesis
  • ๐Ÿ›ก๏ธ Multi-Layer Safety: Advanced guardrails for content validation
  • ๐Ÿ’ฌ Conversational AI: Context-aware responses using Nova Premiere
  • ๐ŸŒ Modern Web Interface: Responsive, accessible chat interface
  • โšก Real-time Processing: Fast multi-agent orchestration
  • ๐Ÿ“Š Health Monitoring: System status and performance metrics
  • ๐Ÿ”„ Session Management: Persistent conversation context

๐Ÿ—๏ธ Architecture

Multi-Agent System

  • ChatAgent: Generates natural conversational responses using Nova Premiere model
  • SafetyAgent: Validates responses using Strands guardrails (toxicity, relevance, grounding)
  • Orchestrator: Coordinates multi-agent workflow and session management

Technology Stack

  • Backend: FastAPI with async processing
  • Frontend: Vanilla JavaScript with Web Speech APIs
  • AI Model: AWS Nova Premiere via Strands SDK
  • Safety: Built-in Strands guardrails system
  • Deployment: Docker containerization ready

๐Ÿš€ Quick Start

Prerequisites

  • Python 3.8+
  • AWS account with Bedrock access
  • Modern web browser (Chrome, Firefox, Safari, Edge)

Option 1: Automated Setup (Recommended)

Windows:

scripts\start.bat

Linux/macOS:

chmod +x scripts/start.sh
./scripts/start.sh

Option 2: Manual Setup

  1. Clone and setup environment:
git clone <repository>
cd alexa-voice-assistant
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt
  1. Configure AWS credentials:
cp .env.example .env

Edit .env file:

AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your_access_key_here
AWS_SECRET_ACCESS_KEY=your_secret_key_here
DEBUG=True
LOG_LEVEL=INFO
  1. Test the system:
python test_system.py
  1. Start the application:
python main.py
  1. Open your browser: Navigate to http://localhost:8000

๐Ÿณ Docker Deployment

Using Docker Compose (Recommended)

# Set environment variables
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret

# Start the application
docker-compose up -d

Using Docker directly

# Build the image
docker build -t alexa-voice-assistant .

# Run the container
docker run -d \
  -p 8000:8000 \
  -e AWS_ACCESS_KEY_ID=your_key \
  -e AWS_SECRET_ACCESS_KEY=your_secret \
  -e AWS_REGION=us-east-1 \
  alexa-voice-assistant

๐Ÿ’ฌ Usage Guide

Voice Interaction

  1. Click the microphone button ๐ŸŽค to start voice recognition
  2. Speak clearly into your microphone
  3. Wait for processing - the system will convert speech to text
  4. Listen to the response - the assistant will speak back to you

Text Interaction

  1. Type your message in the text input field
  2. Press Enter or click the Send button
  3. Read and listen to the assistant's response

Interface Controls

  • ๐ŸŽค Speak Button: Activate voice recognition
  • Send Button: Submit text message
  • Clear Button: Reset conversation history
  • System Status: View health and performance metrics

Browser Permissions

  • Microphone Access: Required for voice input
  • Audio Playback: Required for voice responses
  • JavaScript: Required for full functionality

๐Ÿ›ก๏ธ Safety Features

Multi-Layer Validation

  • Toxicity Detection: Blocks harmful or offensive content
  • Relevance Filtering: Ensures responses are on-topic
  • Grounding Checks: Prevents hallucinations and false information
  • Content Safety: Validates appropriateness of responses

Fallback Responses

When safety violations are detected, the system provides appropriate fallback messages:

  • "I'm sorry, I can't provide that type of information. Let's try a different topic."
  • "I'm not sure about that. Could you ask me something else?"
  • "I don't have reliable information about that right now."

๐Ÿ“Š Monitoring & Health

System Health Endpoint

curl http://localhost:8000/api/health

Health Indicators

  • Green: System operating normally
  • Yellow: Minor issues detected
  • Red: System errors or failures

Metrics Available

  • Active session count
  • Safety incident statistics
  • Response time performance
  • Agent status monitoring

๐Ÿ”ง Configuration

Environment Variables

Variable Description Default
AWS_REGION AWS region for Bedrock us-east-1
AWS_ACCESS_KEY_ID AWS access key Required
AWS_SECRET_ACCESS_KEY AWS secret key Required
DEBUG Enable debug mode False
LOG_LEVEL Logging level INFO
SESSION_TIMEOUT Session timeout (seconds) 3600

Model Configuration

  • Model: amazon.nova-premiere-v1:0
  • Temperature: 0.7 (balanced creativity)
  • Max Tokens: 2048 (sufficient for conversations)
  • Top-p: 0.9 (focused responses)

๐Ÿงช Testing

Run System Tests

python test_system.py

Test Coverage

  • โœ… Agent initialization and configuration
  • โœ… Multi-agent orchestration workflow
  • โœ… Safety validation and guardrails
  • โœ… Session management and cleanup
  • โœ… Error handling and recovery
  • โœ… API endpoint functionality

๐Ÿ” Troubleshooting

Common Issues

"Speech recognition not supported"

  • Use a modern browser (Chrome, Firefox, Safari, Edge)
  • Ensure HTTPS or localhost for microphone access

"AWS credentials not found"

  • Check your .env file configuration
  • Verify AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set
  • Ensure your AWS account has Bedrock access

"Model unavailable"

  • Verify your AWS region supports Nova Premiere
  • Check your AWS account has Bedrock model access
  • Try switching to us-east-1 region

"Microphone permission denied"

  • Allow microphone access in browser settings
  • Refresh the page after granting permissions
  • Check browser security settings

Debug Mode

Enable debug mode for detailed logging:

export DEBUG=True
export LOG_LEVEL=DEBUG
python main.py

Logs Location

  • Development: Console output
  • Docker: /app/logs/ directory
  • Production: Configure external log aggregation

๐Ÿค Contributing

Development Setup

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: python test_system.py
  5. Submit a pull request

Code Style

  • Follow PEP 8 guidelines
  • Use type hints where appropriate
  • Add docstrings for functions and classes
  • Include error handling and logging

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ†˜ Support

For issues and questions:

  1. Check the troubleshooting section above
  2. Review the system health endpoint
  3. Check application logs for error details
  4. Ensure all prerequisites are met

๐Ÿ”ฎ Future Enhancements

  • Multi-language support
  • Custom voice selection
  • Conversation export/import
  • Advanced analytics dashboard
  • Integration with external APIs
  • Mobile app companion

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors