yigitcankzl/ReadAloud

ReadAloud

Access Any Content. Just Listen.

React FastAPI Gemini Tailwind Vite Python

An AI-powered web accessibility tool that converts any webpage or PDF into natural-sounding audio.

Built for ImpactHacks by HackathonForAll


Demo

ReadAloud Demo

Screenshot 1 Screenshot 2

The Problem

1.3 billion people worldwide live with visual impairments. 700 million have dyslexia. Yet 96% of web pages fail basic accessibility standards.

Most content on the internet is designed to be read, not heard. This creates a massive barrier for people who rely on audio to access information.

Our Solution

ReadAloud bridges this gap. Paste any URL or upload a PDF — our AI extracts the content, optimizes it for listening, and generates natural speech. No accounts, no fees, no barriers.

Features

  • URL & PDF Input — Paste any link or drag & drop a PDF file
  • Full Read / Summary — Listen to the complete content or just the key points
  • 50+ Natural Voices — Kokoro-82M (free, local) with ElevenLabs fallback
  • Waveform Visualizer — Real-time audio visualization via Web Audio API
  • Smart AI Processing — Adapts to news, blogs, docs, forums, papers
  • Playback Controls — Speed, skip, volume, seek, MP3 download
  • 100% Free — Gemini AI + Kokoro TTS, zero API costs
  • Accessible by Design — Full screen reader support, ARIA labels, keyboard shortcuts, clipboard auto-paste

Architecture

┌─────────────┐     ┌──────────────┐     ┌─────────────┐     ┌───────────────┐
│  URL / PDF  │────>│  Extraction  │────>│  Gemini AI  │────>│  TTS Engine   │
│  Input      │     │  & Cleaning  │     │  Optimizer  │     │               │
└─────────────┘     └──────────────┘     └─────────────┘     │ Kokoro (free) │
                     readability-lxml      Full / Summary    │ ElevenLabs    │
                     BeautifulSoup         Content-aware     │ (fallback)    │
                     PyMuPDF (PDF)         optimization      └───────────────┘
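
The diagram above can be read as a four-stage pipeline with a fallback at the last stage. The following is a minimal sketch of that control flow only, not the project's actual code: `run_pipeline`, `synthesize_with_fallback`, and the callables passed in are hypothetical stand-ins for the real extraction, Gemini, and TTS calls.

```python
from typing import Callable, Sequence

# Hypothetical stage signatures: each stage is modelled as a plain callable.
Extractor = Callable[[str], str]       # URL or PDF path -> cleaned text
Optimizer = Callable[[str, str], str]  # (text, mode) -> listening-friendly text
TtsEngine = Callable[[str], bytes]     # text -> audio bytes


def synthesize_with_fallback(text: str, engines: Sequence[TtsEngine]) -> bytes:
    """Try each TTS engine in order and return the first successful result."""
    errors = []
    for engine in engines:
        try:
            return engine(text)
        except Exception as exc:  # a real service would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(engines)} TTS engines failed: {errors}")


def run_pipeline(
    source: str,
    mode: str,
    extract: Extractor,
    optimize: Optimizer,
    engines: Sequence[TtsEngine],
) -> bytes:
    """Input -> extraction -> optimization -> TTS, mirroring the diagram."""
    if mode not in ("full", "summary"):
        raise ValueError(f"unsupported mode: {mode!r}")
    text = extract(source)
    script = optimize(text, mode)
    return synthesize_with_fallback(script, engines)
```

Passing the engines as an ordered list (Kokoro first, ElevenLabs second) keeps the fallback preference in one place instead of scattering it across the request handlers.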

Tech Stack

Layer       Technology
Frontend    React, Tailwind, shadcn/ui, Vite
Backend     Python, FastAPI
AI & TTS    Gemini, Kokoro, ElevenLabs
Extraction  BeautifulSoup, PyMuPDF, readability

Quick Start

Prerequisites

Install the system dependencies (the command below is for Debian/Ubuntu):

sudo apt install espeak-ng ffmpeg

Backend

cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env  # Add your API keys
uvicorn main:app --reload

Frontend

cd frontend
npm install
npm run dev

Environment Variables

Variable            Required  Description
GEMINI_API_KEY      Yes       Google Gemini API key (free tier)
ELEVENLABS_API_KEY  No        ElevenLabs key (optional, Kokoro is the free default)
DEFAULT_VOICE_ID    No        Default ElevenLabs voice ID
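
As an illustration of how the required and optional variables differ, here is a small sketch of a settings loader. It assumes the variables are already present in the process environment (for example after the `.env` file has been loaded); the `Settings` class and `load_settings` function are hypothetical names, not part of the project.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str                       # required
    elevenlabs_api_key: Optional[str] = None  # optional fallback TTS
    default_voice_id: Optional[str] = None    # optional ElevenLabs voice


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three variables, failing fast if the required one is missing."""
    gemini_key = env.get("GEMINI_API_KEY", "").strip()
    if not gemini_key:
        raise RuntimeError("GEMINI_API_KEY is required but not set")
    return Settings(
        gemini_api_key=gemini_key,
        # Treat empty strings the same as unset variables.
        elevenlabs_api_key=env.get("ELEVENLABS_API_KEY") or None,
        default_voice_id=env.get("DEFAULT_VOICE_ID") or None,
    )
```

Failing at startup when `GEMINI_API_KEY` is missing surfaces a misconfigured `.env` immediately, instead of on the first conversion request.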

API

Method  Endpoint             Description
POST    /api/convert         Convert URL to audio
POST    /api/convert-pdf     Convert PDF to audio
GET     /api/audio/{job_id}  Download generated MP3
GET     /api/voices          List all available voices

POST /api/convert

{
  "url": "https://example.com/article",
  "mode": "full",
  "voice_id": "af_heart"
}

Mode options: full (complete content) or summary (key points only)
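
For reference, the request body above could be built from Python with only the standard library, roughly as follows. This is a sketch under assumptions: the base URL uses uvicorn's default port 8000, `voice_id` is treated as optional here (as it is for the PDF endpoint), and `build_convert_request` is a hypothetical helper name. The response format is not documented in this README, so the sketch stops at constructing the request.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8000"  # assumption: uvicorn's default port


def build_convert_request(
    url: str,
    mode: str = "full",
    voice_id: Optional[str] = None,
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/convert request."""
    if mode not in ("full", "summary"):
        raise ValueError(f"mode must be 'full' or 'summary', got {mode!r}")
    payload = {"url": url, "mode": mode}
    if voice_id is not None:
        payload["voice_id"] = voice_id
    return urllib.request.Request(
        f"{base_url}/api/convert",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once the backend is running.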

POST /api/convert-pdf

Multipart form data:

  • file — PDF file (max 20MB)
  • mode — full or summary
  • voice_id — Voice identifier (optional)
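
The three form fields above map onto a multipart/form-data body. The sketch below encodes one by hand with the standard library so the structure is visible. It is illustrative only: `encode_pdf_form` is a hypothetical helper, and the 20MB limit is interpreted as 20 MiB, which is an assumption.

```python
import uuid
from typing import Optional, Tuple

MAX_PDF_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB


def encode_pdf_form(
    pdf_bytes: bytes,
    filename: str,
    mode: str = "full",
    voice_id: Optional[str] = None,
) -> Tuple[bytes, str]:
    """Return (body, content_type) for a POST /api/convert-pdf upload."""
    if len(pdf_bytes) > MAX_PDF_BYTES:
        raise ValueError("PDF exceeds the 20MB upload limit")
    if mode not in ("full", "summary"):
        raise ValueError(f"mode must be 'full' or 'summary', got {mode!r}")

    boundary = uuid.uuid4().hex
    fields = {"mode": mode}
    if voice_id:
        fields["voice_id"] = voice_id  # optional field, omitted when unset

    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The file part carries its own filename and content type.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n".encode("utf-8")
        + pdf_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

In practice an HTTP client library would build this body for you; checking the size on the client simply avoids uploading a file the server is going to reject.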

Accessibility

ReadAloud is built with visually impaired users in mind:

  • Screen reader support — Every element has proper ARIA labels, roles, and live regions
  • Keyboard shortcuts — Ctrl+Enter to convert, Space to play/pause, Escape to dismiss errors
  • Clipboard auto-paste — One-click paste button next to the URL input
  • Live announcements — Screen readers announce conversion progress, completion, and errors
  • Full keyboard navigation — Tab through all controls, arrow keys on the audio seek bar
  • Bookmarklet — One-click "Read this page" from any website

Bookmarklet

Visit /bookmarklet.html in the app and drag the "ReadAloud This Page" button to your bookmarks bar. Then click it on any webpage — ReadAloud opens and automatically converts the page to audio.

The app also supports the ?url= query parameter directly, e.g. http://localhost:5173/?url=https://example.com.
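
When the target page's address contains its own query string, it is safer to percent-encode it before placing it in the ?url= parameter. A minimal sketch, assuming the frontend decodes the parameter with standard URL parsing (`readaloud_link` is a hypothetical helper, and the app address is the dev server from the example above):

```python
from urllib.parse import urlencode

APP_URL = "http://localhost:5173/"  # dev server address from the example above


def readaloud_link(page_url: str, app_url: str = APP_URL) -> str:
    """Build a ReadAloud link whose ?url= value is percent-encoded."""
    return f"{app_url}?{urlencode({'url': page_url})}"


print(readaloud_link("https://example.com"))
# prints: http://localhost:5173/?url=https%3A%2F%2Fexample.com
```

Encoding matters for addresses such as https://example.com/a?b=1&c=2, where an unencoded & would otherwise be parsed as a second parameter of the ReadAloud URL itself.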

License

MIT


Built by yigitcankzl for ImpactHacks by HackathonForAll
