🎙️ Audio Transcriber

A professional audio recording and transcription application for Windows. Record microphone and system audio simultaneously, automatically split into 10-minute blocks, transcribe with Gladia AI, and generate smart notes with Gemini.

✨ Features

- 🎤 Multi-Source Recording - Record microphone and system audio simultaneously
- 📦 Smart Block Management - Automatic 10-minute blocks for cost optimization
- 🎮 Playback Preview - Listen to each block before transcribing
- ✅ Flexible Selection - Choose which blocks to transcribe
- 📝 Gladia Transcription - High-accuracy Turkish transcription
- 🤖 Gemini AI Notes - Automatic note generation and summarization
- 💾 Markdown Export - Save and share notes easily
- 🎨 Modern UI - Professional interface built with CustomTkinter
- 🗑️ Block Management - Delete unwanted blocks
- 📊 Batch Operations - Select all/none with one click
- ⏱️ Progress Tracking - Real-time recording and playback progress

📸 Screenshots

Coming soon - UI screenshots will be added

🚀 Quick Start

Prerequisites

- Python 3.10 or higher
- Windows OS (for system audio recording)
- Gladia API Key
- Gemini API Key

Installation

Clone the repository

git clone https://github.com/yourusername/audio_transcriber.git
cd audio_transcriber

Create virtual environment

python -m venv venv
venv\Scripts\activate  # Windows
source venv/bin/activate  # Linux/Mac

Install dependencies
```
pip install -r requirements.txt
```

Configure API keys

Copy .env.example to .env (in Windows Command Prompt, use copy instead of cp):

cp .env.example .env

Edit .env and add your API keys:

GLADIA_API_KEY=your-gladia-key-here
GEMINI_API_KEY=your-gemini-key-here
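To check that the keys were filled in before launching the app, a small script along these lines may help. It is only a sketch using the standard library: `parse_env` and `missing_keys` are hypothetical helper names, and the application itself may load `.env` differently.

```python
# Hypothetical helpers; not part of the project's source code.
REQUIRED_KEYS = ("GLADIA_API_KEY", "GEMINI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent, empty, or still the template placeholder."""
    return [
        key
        for key in REQUIRED_KEYS
        if not values.get(key) or values[key].endswith("-key-here")
    ]
```

For example, `missing_keys(parse_env(open(".env", encoding="utf-8").read()))` returns an empty list once both keys are set.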

Run the application
```
python main.py
```

📖 Usage

Basic Workflow

1. Select Audio Sources
   - Choose microphone from dropdown
   - Choose system audio (Stereo Mix/MOTIV Mix) if needed
2. Record Audio
   - Click ⏺️ "Start Recording" button
   - Recording automatically splits into 10-minute blocks
   - Click ⏹️ "Stop" when done
3. Preview Blocks
   - Click ▶️ on any block card to listen
   - Progress bar shows playback status
   - Click ⏸️ to pause
4. Select Blocks
   - Use checkboxes to select blocks for transcription
   - "All" button selects all blocks
   - "None" button deselects all
5. Transcribe
   - Click "Transcribe Selected →"
   - Watch progress for each block
   - View transcript in the right panel
6. Generate Notes
   - Click 🤖 "Generate Notes with Gemini"
   - AI analyzes transcript and creates structured notes
   - Notes appear in the bottom panel
7. Export
   - Click 💾 "Save as Markdown"
   - Choose location and filename
   - Share your notes!
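The selection step above can be pictured as a flag on each block, with "All" and "None" flipping every flag at once. The following is a minimal sketch of that idea, not the app's actual code: the `Block` class, the helper names, and the file naming scheme are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One recorded block (hypothetical model, not the app's real class)."""
    index: int
    path: str
    selected: bool = False


def select_all(blocks: list[Block], selected: bool = True) -> None:
    """Mimic the "All" (selected=True) and "None" (selected=False) buttons."""
    for block in blocks:
        block.selected = selected


def selected_paths(blocks: list[Block]) -> list[str]:
    """Return the files that would be sent for transcription."""
    return [block.path for block in blocks if block.selected]
```

Only the paths returned by `selected_paths` would be transcribed, which is how skipping unwanted blocks keeps transcription costs down.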

Enabling System Audio (Windows)

To record system audio, enable "Stereo Mix":

1. Right-click speaker icon → Sound Settings
2. Click Sound Control Panel → Recording tab
3. Right-click empty space → Show Disabled Devices
4. Right-click Stereo Mix → Enable
5. Set as default or select in the app

⚙️ Configuration

Environment Variables

| Variable | Description | Required |
|----------|-------------|----------|
| `GLADIA_API_KEY` | API key from Gladia.io | Yes |
| `GEMINI_API_KEY` | API key from Google AI Studio | Yes |

Settings (config.py)

| Setting | Default | Description |
|---------|---------|-------------|
| `SAMPLE_RATE` | 44100 | Audio sample rate in Hz |
| `BLOCK_DURATION_MINUTES` | 10 | Recording block duration in minutes |
| `RECORDINGS_DIR` | "recordings" | Directory for audio files |
| `GEMINI_MODEL` | "gemini-2.5-flash" | Gemini model version |
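To see how `SAMPLE_RATE` and `BLOCK_DURATION_MINUTES` interact, here is a rough sketch of how block boundaries could be derived from them. The `block_ranges` function is hypothetical and only illustrates the arithmetic; the recorder may split audio differently.

```python
SAMPLE_RATE = 44100            # Hz, default from config.py
BLOCK_DURATION_MINUTES = 10    # default from config.py


def block_ranges(
    total_frames: int,
    sample_rate: int = SAMPLE_RATE,
    minutes: int = BLOCK_DURATION_MINUTES,
) -> list[tuple[int, int]]:
    """Return (start, end) frame ranges, one per block; the last may be shorter."""
    frames_per_block = sample_rate * 60 * minutes
    return [
        (start, min(start + frames_per_block, total_frames))
        for start in range(0, total_frames, frames_per_block)
    ]
```

With the defaults, each full block holds 44100 × 600 = 26,460,000 frames, so a 25-minute recording yields two full blocks and one 5-minute block.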

💰 Cost Estimation

| Service | Unit Price | 10 min | 1 hour |
|---------|------------|--------|--------|
| Gladia | ~$0.0002/sec | ~$0.12 | ~$0.70 |
| Gemini Flash | Free* | $0 | $0 |

*Gemini 2.5 Flash is free within daily limits.
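The Gladia figures follow from multiplying audio length by the per-second rate. This is a small sketch of that arithmetic, assuming the approximate rate listed above; `estimate_cost` is a hypothetical helper and actual billing may differ.

```python
GLADIA_RATE_PER_SECOND = 0.0002  # approximate rate from the table above


def estimate_cost(seconds: float, rate: float = GLADIA_RATE_PER_SECOND) -> float:
    """Estimate the transcription cost in USD, rounded to cents."""
    return round(seconds * rate, 2)


ten_minutes = estimate_cost(10 * 60)   # 0.12
one_hour = estimate_cost(60 * 60)      # 0.72
```

At the rounded rate, one hour works out to about $0.72, which the table lists as roughly $0.70.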

📁 Project Structure

audio_transcriber/
├── src/                 # Source code
│   ├── __init__.py
│   ├── audio_recorder.py    # Audio recording module
│   ├── gladia_service.py    # Gladia API integration
│   ├── gemini_service.py    # Gemini AI integration
│   └── config.py            # Configuration
├── main.py              # Main application entry point
├── recordings/          # Audio files (auto-created)
├── requirements.txt     # Dependencies
├── .env.example         # Environment variables template
├── .gitignore           # Git ignore rules
├── LICENSE              # MIT License
├── README.md            # English documentation
└── README_TR.md         # Turkish documentation

🔧 Troubleshooting

"Microphone not found" error

- Check default microphone in Windows Sound Settings
- Verify microphone permissions for the application

"Loopback not found" error

Enable "Stereo Mix" or "Stereo Karışımı" in Windows:
- Sound Settings → Recording → Right-click → Show Disabled Devices
- Enable Stereo Mix/Karışımı

Gladia API errors

- Verify API key is correct
- Check internet connection
- Verify credit balance in Gladia dashboard

Gemini API errors

- Verify API key is correct
- Check if daily limit exceeded
- Ensure model name is correct

🎯 Roadmap

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the project
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

- Gladia - Transcription API
- Google Gemini - AI note generation
- CustomTkinter - Modern UI framework
- sounddevice - Audio I/O library
- soundfile - Audio file operations

📧 Contact

Yunus Emre Alpak - @yunusemrealpak

Project Link: https://github.com/yourusername/audio_transcriber

Made with ❤️ by Yunus Emre Alpak
