Norðlenski hreimurinn

Local speech recognition for Icelandic and English — 100% private, runs on your machine.

Norðlenski hreimurinn uses fine-tuned Whisper models to transcribe audio locally. No audio ever leaves your computer. Supports microphone recording and file upload with export to TXT and SRT.

Quick Start

# Install system dependencies
brew install portaudio ffmpeg        # macOS
# sudo apt install portaudio19-dev ffmpeg   # Ubuntu/Debian

# Clone and install
git clone https://github.com/Magnussmari/whisperSSTis.git
cd whisperSSTis
uv sync --all-extras

# Run
uv run streamlit run app.py

The app opens at http://localhost:8501. The first launch downloads the model (about 4 GB).

Features

Record or upload — microphone recording with waveform visualization, or upload WAV/MP3/M4A/FLAC
Two language models — Icelandic (fine-tuned) and English (Whisper Large v3), switchable in the sidebar
100% local — all transcription runs on your machine. GPU accelerated when available, CPU fallback.
Timestamped export — download transcripts as TXT or SRT subtitle files
AI assistant (optional) — GPT post-processing for summarization, translation, cleanup. Requires OPENAI_API_KEY. Only sends text, never audio.
Bilingual UI — Icelandic and English labels throughout
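As an illustration of the timestamped export, here is a minimal sketch of turning transcript segments into SRT text. It assumes each segment is a `(start_seconds, end_seconds, text)` tuple; that shape and the function names `format_srt_timestamp` and `segments_to_srt` are hypothetical and not taken from the project's code.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render (start_seconds, end_seconds, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(segments_to_srt([(0.0, 2.5, "Góðan daginn"), (2.5, 4.0, "Hello")]))
```

SRT numbers its blocks from 1 and uses a comma before the milliseconds, which is why the timestamp is built by hand rather than with a generic time formatter.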

Prerequisites

Python 3.10+
uv package manager
PortAudio — for microphone capture
FFmpeg — for non-WAV audio conversion
~4 GB disk space per model
CUDA GPU recommended (not required)

Configuration

Copy env.example to .env for optional GPT features:

OPENAI_API_KEY=sk-your-key    # Required for AI assistant only
GPT_MINI_MODEL=gpt-4o-mini    # Optional: override model
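A rough sketch of how these two variables might be read at runtime follows. It assumes the `.env` file has already been loaded into the process environment by some other step, and that `gpt-4o-mini` is the fallback when `GPT_MINI_MODEL` is unset. The function name `load_gpt_settings` is hypothetical and not part of the project's API.

```python
import os

# Assumed default, taken from the example value shown above.
DEFAULT_GPT_MODEL = "gpt-4o-mini"


def load_gpt_settings(env=None):
    """Return (api_key, model); api_key is None when no key is configured."""
    env = os.environ if env is None else env
    # Treat an empty string the same as a missing variable.
    api_key = env.get("OPENAI_API_KEY") or None
    model = env.get("GPT_MINI_MODEL") or DEFAULT_GPT_MODEL
    return api_key, model


if __name__ == "__main__":
    key, model = load_gpt_settings()
    print("AI assistant enabled" if key else "AI assistant disabled", "| model:", model)
```

Returning `None` for a missing key lets the caller hide the optional assistant instead of failing, which matches the opt-in behaviour described above.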

Project Structure

app.py                     Main Streamlit web application
launcher.py                Tkinter desktop launcher
whisperSSTis/              Core Python package
  audio.py                   Audio capture, file loading, ffmpeg conversion
  transcribe.py              Whisper model loading and inference
  gpt.py                     Optional GPT post-processing
tests/                     pytest test suite (26 tests)
scripts/                   Build and distribution scripts
archive/                   Superseded documentation
docs/missions/             Development mission reports
.github/workflows/ci.yml   GitHub Actions CI
pyproject.toml             Package config, dependencies, tool settings
uv.lock                    Reproducible dependency lockfile
architecture.jsonld        Machine-readable system architecture graph

Development

# Install with all extras
uv sync --all-extras

# Run tests
uv run pytest

# Run the app
uv run streamlit run app.py

# Run via desktop launcher
uv run python launcher.py

Testing

26 tests covering all three core modules (audio, transcribe, gpt). All tests mock hardware — no GPU or microphone needed in CI.

uv run pytest -v              # Full verbose suite
uv run pytest tests/test_gpt.py   # Single module
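To show what mocking hardware can look like, here is a small pytest-style sketch using `unittest.mock` from the standard library. The helper `record_seconds` and its stream argument are hypothetical stand-ins and are not copied from the project's test suite.

```python
from unittest import mock


def record_seconds(stream, seconds, sample_rate=16_000):
    """Hypothetical helper: read `seconds` of audio from a stream-like object."""
    return stream.read(seconds * sample_rate)


def test_record_seconds_without_microphone():
    # A Mock stands in for the audio input stream, so no device is opened.
    fake_stream = mock.Mock()
    fake_stream.read.return_value = [0.0] * 32_000

    samples = record_seconds(fake_stream, 2)

    fake_stream.read.assert_called_once_with(32_000)
    assert len(samples) == 32_000
```

Because the stream is injected as an argument, the test never touches PortAudio and can run on a CI machine with no sound card.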

CI

GitHub Actions runs on every push to main and on PRs:

Python 3.10, 3.11, 3.12
Full test suite
pip-audit vulnerability scanning

Technical Details

Component	Details
Frontend	Streamlit with custom CSS design system
Models	Icelandic fine-tuned Whisper, Whisper Large v3
ML stack	PyTorch, Hugging Face Transformers 5.x
Audio	sounddevice, soundfile, scipy, subprocess ffmpeg
Package manager	uv with lockfile
Sample rate	16 kHz (Whisper requirement)
Chunk processing	Configurable 10–60 second segments
Max upload	1 GB
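The sample rate and chunk settings above imply a simple piece of arithmetic, sketched here under stated assumptions. The function name `chunk_bounds` and the 30-second default are illustrative choices, not values confirmed by the project. Only the 16 kHz rate and the 10 to 60 second range come from the table.

```python
SAMPLE_RATE = 16_000  # Hz, the rate Whisper expects


def chunk_bounds(num_samples, chunk_seconds=30, sample_rate=SAMPLE_RATE):
    """Return (start, end) sample index pairs that cover the whole recording."""
    if not 10 <= chunk_seconds <= 60:
        raise ValueError("chunk_seconds must be between 10 and 60")
    chunk_size = chunk_seconds * sample_rate
    # The final chunk is shorter when the length is not a multiple of chunk_size.
    return [
        (start, min(start + chunk_size, num_samples))
        for start in range(0, num_samples, chunk_size)
    ]


if __name__ == "__main__":
    # A 70-second recording split into 30-second chunks.
    for start, end in chunk_bounds(70 * SAMPLE_RATE, chunk_seconds=30):
        print(f"{start / SAMPLE_RATE:.0f}s to {end / SAMPLE_RATE:.0f}s")
```

Each pair can be used to slice the audio array before inference, and dividing the start index by the sample rate gives the time offset for that chunk's timestamps.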

Privacy & Security

Audio is never transmitted over the network
Models are downloaded once from Hugging Face, then cached locally
GPT features are opt-in and only send text (not audio) to OpenAI
Temporary files are cleaned up in finally blocks
.env is excluded from version control via .gitignore
unsafe_allow_html is used only for static CSS, never user input
CI includes automated vulnerability scanning with pip-audit
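The temporary-file guarantee can be illustrated with a short sketch of the try/finally pattern. The helper name `with_temp_wav` and its callback argument are hypothetical. The project's own cleanup code may be organised differently.

```python
import os
import tempfile


def with_temp_wav(data: bytes, process):
    """Write data to a temporary .wav file, run process(path), then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return process(path)
    finally:
        # Runs on success and on error, so the file does not linger on disk.
        if os.path.exists(path):
            os.remove(path)


if __name__ == "__main__":
    size = with_temp_wav(b"\x00" * 1024, os.path.getsize)
    print("processed", size, "bytes; temporary file removed")
```

Putting the removal in `finally` means a failed conversion or transcription still leaves no audio behind in the temp directory.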

Credits

Developer: Magnus Smari Smarason

Models:

OpenAI Whisper
Icelandic fine-tuned model by Carlos Daniel Hernandez Mena

Built with: Streamlit · PyTorch · Hugging Face

License

MIT — see LICENSE.

Contributing

Issues and pull requests welcome at github.com/Magnussmari/whisperSSTis.
