A professional audio recording and transcription application for Windows. Record microphone and system audio simultaneously, automatically split into 10-minute blocks, transcribe with Gladia AI, and generate smart notes with Gemini.
- Multi-Source Recording - Record microphone and system audio simultaneously
- Smart Block Management - Automatic 10-minute blocks for cost optimization
- Playback Preview - Listen to each block before transcribing
- Flexible Selection - Choose which blocks to transcribe
- Gladia Transcription - High-accuracy Turkish transcription
- Gemini AI Notes - Automatic note generation and summarization
- Markdown Export - Save and share notes easily
- Modern UI - Professional interface built with CustomTkinter
- Block Management - Delete unwanted blocks
- Batch Operations - Select all/none with one click
- Progress Tracking - Real-time recording and playback progress
Coming soon - UI screenshots will be added
- Python 3.10 or higher
- Windows OS (for system audio recording)
- Gladia API Key
- Gemini API Key
- Clone the repository: `git clone https://github.com/yourusername/audio_transcriber.git`, then `cd audio_transcriber`
- Create a virtual environment: `python -m venv venv`, then activate it with `venv\Scripts\activate` (Windows) or `source venv/bin/activate` (Linux/Mac)
- Install dependencies: `pip install -r requirements.txt`
- Configure API keys: copy `.env.example` to `.env` with `cp .env.example .env` (or `copy .env.example .env` in the Windows Command Prompt), then edit `.env` and add your API keys, one per line: `GLADIA_API_KEY=your-gladia-key-here` and `GEMINI_API_KEY=your-gemini-key-here`
- Run the application: `python main.py`
- Select Audio Sources
  - Choose a microphone from the dropdown
  - Choose a system audio device (Stereo Mix/MOTIV Mix) if needed
- Record Audio
  - Click the ⏺ "Start Recording" button
  - The recording is split automatically into 10-minute blocks
  - Click ⏹ "Stop" when done
- Preview Blocks
  - Click ▶ on any block card to listen
  - The progress bar shows playback status
  - Click ⏸ to pause
- Select Blocks
  - Use the checkboxes to select blocks for transcription
  - The "All" button selects all blocks
  - The "None" button deselects all blocks
- Transcribe
  - Click "Transcribe Selected"
  - Watch the progress for each block
  - View the transcript in the right panel
- Generate Notes
  - Click "Generate Notes with Gemini"
  - The AI analyzes the transcript and creates structured notes
  - Notes appear in the bottom panel
- Export
  - Click "Save as Markdown"
  - Choose a location and filename
  - Share your notes!
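The select, transcribe, and export steps above can be pictured roughly as follows. This is only a sketch under stated assumptions: `Block`, `transcribe_selected`, and `export_markdown` are hypothetical names, not the application's actual code, and the `transcribe` callable stands in for whatever call the app makes to the transcription service.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class Block:
    """One recorded block (hypothetical structure, for illustration only)."""
    index: int
    audio_path: str
    selected: bool = True
    transcript: Optional[str] = None


def transcribe_selected(blocks, transcribe: Callable[[str], str]) -> str:
    """Transcribe only the selected blocks, in index order, and join the text."""
    parts = []
    for block in sorted(blocks, key=lambda b: b.index):
        if not block.selected:
            continue  # unselected blocks are skipped, so they cost nothing
        block.transcript = transcribe(block.audio_path)
        parts.append(block.transcript)
    return "\n\n".join(parts)


def export_markdown(path, transcript: str, notes: str) -> Path:
    """Write notes followed by the transcript to a Markdown file."""
    out = Path(path)
    out.write_text(f"# Notes\n\n{notes}\n\n# Transcript\n\n{transcript}\n",
                   encoding="utf-8")
    return out


if __name__ == "__main__":
    blocks = [Block(1, "block_1.wav"), Block(2, "block_2.wav", selected=False)]
    # Placeholder transcriber; a real one would call the transcription API.
    text = transcribe_selected(blocks, lambda p: f"(transcript of {p})")
    target = Path(tempfile.mkdtemp()) / "notes.md"
    print(export_markdown(target, text, notes="(notes would go here)"))
```

Passing the transcriber in as a callable keeps the selection logic testable without network access or API keys.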
To record system audio, enable "Stereo Mix":
- Right-click the speaker icon → Sound Settings
- Click Sound Control Panel → Recording tab
- Right-click an empty space → Show Disabled Devices
- Right-click Stereo Mix → Enable
- Set as default or select in the app
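Once Stereo Mix is enabled, it should show up as an input device. The sketch below shows one way to pick such devices out of a device list by name. `find_input_devices` and its keyword list are assumptions made for illustration; the dictionary keys `name` and `max_input_channels` follow the entries returned by `sounddevice.query_devices()`.

```python
def find_input_devices(devices,
                       keywords=("stereo mix", "stereo karışımı", "motiv mix")):
    """Return (index, name) pairs for input-capable devices matching a keyword.

    `devices` is any sequence of dicts with "name" and "max_input_channels",
    such as the list returned by sounddevice.query_devices().
    """
    matches = []
    for index, device in enumerate(devices):
        name = device["name"]
        has_input = device["max_input_channels"] > 0
        if has_input and any(k in name.lower() for k in keywords):
            matches.append((index, name))
    return matches


if __name__ == "__main__":
    # Hand-written sample data; real entries come from the audio backend.
    sample = [
        {"name": "Microphone (USB Audio)", "max_input_channels": 2},
        {"name": "Stereo Mix (Realtek Audio)", "max_input_channels": 2},
        {"name": "Speakers (Realtek Audio)", "max_input_channels": 0},
    ]
    for index, name in find_input_devices(sample):
        print(index, name)
```

With sounddevice installed, calling `find_input_devices(sounddevice.query_devices())` would list the real candidates on your machine.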
| Variable | Description | Required |
|---|---|---|
| `GLADIA_API_KEY` | API key from Gladia.io | Yes |
| `GEMINI_API_KEY` | API key from Google AI Studio | Yes |
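As a rough illustration of how these two variables might be checked at startup, the sketch below parses a `.env` file with the standard library and reports which required keys are missing. The helpers `load_env_file` and `missing_keys` are hypothetical; the project itself may load its configuration differently, for example through a dotenv library.

```python
from pathlib import Path

# Variable names taken from the table above.
REQUIRED_KEYS = ("GLADIA_API_KEY", "GEMINI_API_KEY")


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    missing = missing_keys(load_env_file(".env"))
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```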
| Setting | Default | Description |
|---|---|---|
| `SAMPLE_RATE` | 44100 | Audio sample rate in Hz |
| `BLOCK_DURATION_MINUTES` | 10 | Recording block duration in minutes |
| `RECORDINGS_DIR` | `"recordings"` | Directory for audio files |
| `GEMINI_MODEL` | `"gemini-2.5-flash"` | Gemini model version |
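To make the relationship between the sample rate and the block duration concrete, here is a minimal sketch that mirrors the defaults above and works out how many audio frames fit in one block. `RecorderConfig` and `block_ranges` are illustrative names and do not reproduce the project's `config.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecorderConfig:
    """Illustrative stand-in for the settings listed above."""
    sample_rate: int = 44100                 # SAMPLE_RATE, in Hz
    block_duration_minutes: int = 10         # BLOCK_DURATION_MINUTES
    recordings_dir: str = "recordings"       # RECORDINGS_DIR
    gemini_model: str = "gemini-2.5-flash"   # GEMINI_MODEL

    @property
    def frames_per_block(self) -> int:
        """Number of audio frames in one full block."""
        return self.sample_rate * self.block_duration_minutes * 60


def block_ranges(total_frames, frames_per_block):
    """Split a recording into (start, end) frame ranges; the last may be shorter."""
    return [(start, min(start + frames_per_block, total_frames))
            for start in range(0, total_frames, frames_per_block)]


if __name__ == "__main__":
    cfg = RecorderConfig()
    print(cfg.frames_per_block)  # 26460000 frames per 10-minute block
    # A 25-minute recording yields two full blocks and one 5-minute block.
    print(len(block_ranges(25 * 60 * cfg.sample_rate, cfg.frames_per_block)))  # 3
```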
| Service | Unit Price | 10 min | 1 hour |
|---|---|---|---|
| Gladia | ~$0.0002/sec | ~$0.12 | ~$0.70 |
| Gemini Flash | Free* | $0 | $0 |
*Gemini 2.5 Flash is free within daily limits.
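The arithmetic behind the table can be sketched as below, assuming the approximate rate of $0.0002 per second of audio; actual pricing may differ, so check the Gladia dashboard. The function name `estimate_gladia_cost` is hypothetical. At this rounded rate a full hour works out to about $0.72, close to the ~$0.70 shown in the table.

```python
# Approximate rate from the table above; an assumption, not official pricing.
GLADIA_USD_PER_SECOND = 0.0002


def estimate_gladia_cost(block_seconds, rate=GLADIA_USD_PER_SECOND):
    """Estimate the transcription cost in USD for the given block lengths."""
    return sum(block_seconds) * rate


if __name__ == "__main__":
    full_hour = [600] * 6        # six 10-minute blocks
    selected = [600, 600, 240]   # only the blocks chosen for transcription
    print(f"{estimate_gladia_cost(full_hour):.2f}")  # 0.72
    print(f"{estimate_gladia_cost(selected):.2f}")   # 0.29
```

Transcribing only the selected blocks is where the saving comes from: the estimate scales with the seconds of audio actually sent.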
audio_transcriber/
├── src/                    # Source code
│   ├── __init__.py
│   ├── audio_recorder.py   # Audio recording module
│   ├── gladia_service.py   # Gladia API integration
│   ├── gemini_service.py   # Gemini AI integration
│   └── config.py           # Configuration
├── main.py                 # Main application entry point
├── recordings/             # Audio files (auto-created)
├── requirements.txt        # Dependencies
├── .env.example            # Environment variables template
├── .gitignore              # Git ignore rules
├── LICENSE                 # MIT License
├── README.md               # English documentation
└── README_TR.md            # Turkish documentation
- Check default microphone in Windows Sound Settings
- Verify microphone permissions for the application
- Enable "Stereo Mix" or "Stereo Karışımı" in Windows:
  - Sound Settings → Recording → Right-click → Show Disabled Devices
  - Enable Stereo Mix/Stereo Karışımı
- Verify API key is correct
- Check internet connection
- Verify credit balance in Gladia dashboard
- Verify API key is correct
- Check if daily limit exceeded
- Ensure model name is correct
- Real-time transcription
- Speaker diarization (identify different speakers)
- Multiple note templates
- Automatic language detection
- Audio quality indicator
- Keyboard shortcuts/hotkeys
- Multi-language support
- Export to PDF and DOCX
- Cloud storage integration
- Collaboration features
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Gladia - Transcription API
- Google Gemini - AI note generation
- CustomTkinter - Modern UI framework
- sounddevice - Audio I/O library
- soundfile - Audio file operations
Yunus Emre Alpak - @yunusemrealpak
Project Link: https://github.com/yourusername/audio_transcriber