Simple Transcriber for Podcasts & Videos

A small desktop app that turns a podcast or video URL into a speaker-labeled, audio-synced HTML transcript you can read, edit, and quote from. It aims to match the functionality, ease of use, speed, and accuracy of Scroll.ai's transcription.

Paste a YouTube link or a podcast URL, get back a transcript with:

Per-paragraph play buttons synced to the audio
Word-level highlighting that follows playback
Editable speaker labels and title
Inline text editing for fixing mis-transcribed names
Paragraph bookmarks collected at the top as "Saved quotes"
A browsable library of every transcript you've made

Built as a personal replacement for Scroll, which shut down in 2026 and had offered free access for journalists and freelancers. Uses Groq's whisper-large-v3 for transcription and AssemblyAI for speaker diarization.

If you're looking for a more robust but still easy-to-install version and/or a Docker setup, try Easy Transcriber from ReadTedium.

Install (Windows)

Download SimpleTranscriber-Setup.exe from the Releases page.
Run it. The installer handles WebView2 (if not already on your system), creates a Start menu and desktop shortcut, and launches the app.
On first launch, paste your two API keys (see below). They're saved locally to config.json next to the app — nothing leaves your machine except the audio you submit for transcription.

That's it. After setup, paste a URL and click Transcribe.

Getting API keys

Both accounts are free to create, take under a minute to set up, and have generous free tiers.

Service	Sign up	Notes
Groq	https://console.groq.com → API Keys	Used for transcription. ~$0.11/hr of audio.
AssemblyAI	https://www.assemblyai.com → Dashboard	Used for speaker diarization. $50 free credit covers months of casual use.

A typical 45-minute interview costs well under $0.50 total.
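
As a rough check on the transcription half of that figure, here is a minimal sketch that applies the approximate Groq rate from the table. The AssemblyAI share is left out because the text gives a credit amount rather than a per-hour price. The function name is illustrative and not part of the app.

```python
# Approximate Groq rate quoted in the table above (USD per hour of audio).
GROQ_USD_PER_AUDIO_HOUR = 0.11


def groq_cost_usd(minutes: float, rate_per_hour: float = GROQ_USD_PER_AUDIO_HOUR) -> float:
    """Estimated Groq transcription cost for a recording of the given length."""
    return minutes / 60 * rate_per_hour


if __name__ == "__main__":
    # 45 minutes is 0.75 hours, so the Groq share is 0.75 * 0.11 USD.
    print(f"45-minute interview, Groq only: ${groq_cost_usd(45):.4f}")
```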

Features

Per-paragraph playback — click ▶ next to any paragraph to jump there
Word-level highlighting — the active word lights up as audio plays
Auto-scroll with "Jump to now" pill — follows playback; pauses scrolling when you scroll up to re-read, with a one-click way back to the live position
Editable speakers — click a label to rename ("Speaker A" → "Jane Smith"); choose to rename just one occurrence or all matching labels
Editable text and title — click any paragraph or the title to fix typos
Bookmarks — star paragraphs worth quoting; they appear in a "Saved quotes" section at the top with jump links
Hints field — paste proper nouns ("John Doe, ACME Co., NASA") before transcribing to help Whisper spell them correctly
URL queue — paste multiple URLs (one per line) to process in sequence. Press Shift+Enter to start a new line.
Library page — every transcript, searchable by title, date, or transcript content, sorted by when you transcribed it
Copy and export — copy a single quote (pre-formatted with attribution and timestamp), copy the full transcript, or export as .txt / .md
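
Word-level highlighting comes down to answering "which word is being spoken at time t?". The sketch below shows one common way to do that lookup with a binary search over word start times. It is an illustration only: the app's real highlighting presumably runs in the transcript page itself, and the data shape (a sorted list of start times in seconds) is an assumption.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def active_word_index(word_starts: Sequence[float], playback_time: float) -> Optional[int]:
    """Return the index of the last word that has started by playback_time.

    word_starts must be sorted ascending, in seconds.
    Returns None if playback has not reached the first word yet.
    """
    i = bisect_right(word_starts, playback_time) - 1
    return i if i >= 0 else None


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real transcript.
    words = ["Thanks", "for", "joining", "us"]
    starts = [0.00, 0.42, 0.61, 1.10]
    idx = active_word_index(starts, 0.70)
    print(words[idx] if idx is not None else "(nothing yet)")
```

The same lookup works for per-paragraph play buttons if the list holds paragraph start times instead of word start times.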

Keyboard shortcuts (in a transcript)

Key	Action
`Space`	Play / pause
`←`	Seek back 10 seconds
`→`	Seek forward 10 seconds

Shortcuts are disabled while editing text so they don't interfere with typing.

Where files live

Transcripts: Desktop\Transcripts\<title>\ — each gets its own folder with the HTML transcript, MP3 audio, and a small .meta.json sidecar
Library index: Desktop\Transcripts\index.html — regenerated after every run
Config: config.json next to the installed .exe — your API keys and window geometry

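The app always reads from and writes to %USERPROFILE%\Desktop\Transcripts. You can move the Transcripts folder elsewhere if you want, but new transcripts and the library index will still be written to that location.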
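
For scripting against this layout (backups, batch exports), here is a minimal sketch of how the paths fit together. The function names are hypothetical, and the exact sidecar file name inside each folder is not documented here, so the lookup simply matches anything ending in .meta.json.

```python
from pathlib import Path
from typing import List, Optional


def transcripts_root(home: Optional[Path] = None) -> Path:
    """Desktop\\Transcripts under the user's home folder (%USERPROFILE% on Windows)."""
    return (home or Path.home()) / "Desktop" / "Transcripts"


def library_index(home: Optional[Path] = None) -> Path:
    """Path of the library page that is regenerated after every run."""
    return transcripts_root(home) / "index.html"


def list_transcript_folders(root: Path) -> List[Path]:
    """Subfolders of root that contain a .meta.json sidecar, sorted by name."""
    return sorted(
        p for p in root.iterdir()
        if p.is_dir() and any(p.glob("*.meta.json"))
    )


if __name__ == "__main__":
    root = transcripts_root()
    if root.is_dir():
        for folder in list_transcript_folders(root):
            print(folder.name)
```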

Privacy

Audio you transcribe is uploaded to Groq (transcription) and AssemblyAI (diarization). Both have published privacy policies; neither is used for training by default. Everything else — your API keys, the transcripts themselves, the library — stays on your machine.

Limits

Groq free tier caps uploads at 25MB. The app encodes audio as mono MP3 at 32kbps (~14MB/hour), so typical interviews up to ~90 min fit comfortably. Longer files are auto-chunked and stitched back together.
Transcription quality is high (Whisper large-v3, roughly 95–98% accuracy on clear audio). Speaker labels are good but not perfect: expect occasional misattributions at speaker transitions, especially with more than two speakers. The result is suitable as a readable working transcript, not a substitute for human review before publication.
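
The size figures above follow directly from the bitrate. The sketch below reproduces the arithmetic, assuming constant-bitrate audio and 1 MB = 1,000,000 bytes. The chunk count is a simple size-based estimate and not a description of how the app actually splits files.

```python
import math

BITRATE_KBPS = 32    # mono MP3 bitrate stated above
UPLOAD_CAP_MB = 25   # Groq free-tier upload cap stated above


def mp3_size_mb(minutes: float, bitrate_kbps: float = BITRATE_KBPS) -> float:
    """Approximate file size in MB for constant-bitrate audio."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


def max_minutes_under_cap(cap_mb: float = UPLOAD_CAP_MB,
                          bitrate_kbps: float = BITRATE_KBPS) -> float:
    """Longest recording, in minutes, that fits under the upload cap."""
    seconds = cap_mb * 1_000_000 * 8 / (bitrate_kbps * 1000)
    return seconds / 60


def chunks_needed(minutes: float, cap_mb: float = UPLOAD_CAP_MB) -> int:
    """Minimum number of pieces if the file is split purely by size."""
    return max(1, math.ceil(mp3_size_mb(minutes) / cap_mb))


if __name__ == "__main__":
    print(f"1 hour: {mp3_size_mb(60):.1f} MB")
    print(f"Fits under cap: {max_minutes_under_cap():.0f} min")
    print(f"3-hour file: {chunks_needed(180)} chunks")
```

One hour works out to 14.4 MB, and the cap is reached at a little over 104 minutes, which is why interviews up to about 90 minutes fit with room to spare.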

Install from source (instead of the installer)

If you'd rather run the Python directly:

git clone https://github.com/jcddc83/simple-transcriber.git
cd simple-transcriber
pip install -r requirements.txt
python transcribe.py

Requires Python 3.11+ and FFmpeg on your PATH (or ffmpeg.exe next to the script). On Windows, the easiest FFmpeg install is via gyan.dev — download "release essentials", unzip, and drop ffmpeg.exe next to transcribe.py.
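
To confirm that FFmpeg is discoverable before launching, a quick check along these lines can help. This is a sketch, not the app's own lookup code: it assumes a copy next to the script takes priority over PATH, which the text does not specify.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_ffmpeg(script_dir: Path) -> Optional[str]:
    """Return a usable ffmpeg path, or None if none is found.

    Checks for a copy next to the script first, then falls back to PATH.
    """
    for name in ("ffmpeg.exe", "ffmpeg"):
        local = script_dir / name
        if local.is_file():
            return str(local)
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    # Run this from the folder that contains transcribe.py.
    found = find_ffmpeg(Path.cwd())
    print(found or "ffmpeg not found: add it to PATH or place it next to transcribe.py")
```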

Building the installer

build.bat
iscc installer.iss

build.bat downloads FFmpeg and the WebView2 bootstrapper automatically if they aren't already present, then packages everything into dist\SimpleTranscriber.exe via PyInstaller. iscc installer.iss (from Inno Setup) wraps it into Output\SimpleTranscriber-Setup.exe.

macOS

A build.sh script exists for producing a Mac .app bundle, but it hasn't been tested end-to-end yet. The Python code itself is cross-platform; pywebview uses WKWebView on macOS (built in, no extra dependency).

Troubleshooting

"Windows protected your PC" on first launch — because the app isn't code-signed, Windows SmartScreen shows an "unrecognized app" warning the first time you run the installer. Click More info → Run anyway. This is expected for independent apps without a commercial code-signing certificate.
Antivirus false positive — PyInstaller-packaged apps occasionally trigger Windows Defender or other AV scanners (the unpack-then-run pattern looks suspicious to heuristic scanners). If your AV quarantines the installer, add an exception for SimpleTranscriber-Setup.exe, or run from source instead.
"ffmpeg not found" (running from source) — make sure ffmpeg.exe is next to transcribe.py in the simple-transcriber/ folder, or on your system PATH.
YouTube rate-limited — yt-dlp occasionally gets throttled by YouTube. Wait an hour and try again, or run with browser cookies (advanced).
Audio >25MB after encoding — automatic chunking handles it, but very long files (3+ hours) may need manual splitting.
Window opens but stays blank — make sure WebView2 Runtime is installed. The installer handles this; if you're running from source, download it from Microsoft.
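
For the YouTube rate-limit case, the "browser cookies" route refers to yt-dlp's --cookies-from-browser option. The sketch below builds a standalone yt-dlp command line for fetching the audio yourself. It is a manual workaround under the assumption that yt-dlp is installed on its own; it does not describe how the app invokes yt-dlp internally, and the URL shown is a placeholder.

```python
import subprocess
from typing import List, Optional


def yt_dlp_command(url: str, browser: Optional[str] = None) -> List[str]:
    """Build a yt-dlp command that extracts audio as MP3.

    If browser is given (for example "firefox" or "chrome"), yt-dlp reads
    that browser's cookies so the request looks like a signed-in session.
    """
    cmd = ["yt-dlp", "--extract-audio", "--audio-format", "mp3"]
    if browser:
        cmd += ["--cookies-from-browser", browser]
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # Placeholder URL; replace with the video you are trying to fetch.
    command = yt_dlp_command("https://www.youtube.com/watch?v=VIDEO_ID", browser="firefox")
    print(" ".join(command))
    # Uncomment to actually run it (requires yt-dlp and FFmpeg on PATH):
    # subprocess.run(command, check=True)
```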

Credits

Transcription: Groq (whisper-large-v3)
Diarization: AssemblyAI
Audio download: yt-dlp
Desktop window: pywebview
App icon: Transcription icons created by Freepik — Flaticon

License

Personal use, MIT-licensed. See LICENSE if present.
