Your meetings, your machine, no exceptions.
A macOS menu bar app that records, transcribes, and summarizes your meetings — entirely on your device. Robota uses Apple's on-device SpeechAnalyzer for transcription and Apple Intelligence for summaries, so your words never leave your Mac. It sits quietly in your menu bar, detects when a call starts, captures both your mic and the room, and hands you structured notes the moment you hang up. No subscriptions, no data brokers, no surprises.
- macOS 26.0+ (Tahoe) — Apple Silicon or Intel
- Apple Intelligence — for summaries, chat, and translation. Zero setup on eligible Macs — the on-device model is managed by the system.
- Ollama (optional) — power-user alternative for longer transcripts. Install Ollama, then:

  ```sh
  ollama pull llama3.2
  ```

- No external model downloads needed for transcription — the system manages speech models automatically.
Coming soon — check Releases for the latest build.
```sh
git clone https://github.com/worldtiki/robota.git
cd robota
swift build -c release
```

To create a bundled .app:

```sh
./bundle.sh
```

- Launch Robota — a lime 🍋🟩 appears in your menu bar
- Grant Screen & System Audio Recording when prompted
- Grant Microphone access
- Grant Speech Recognition access
- Click the icon → Record — or wait for automatic meeting detection to prompt you
- Your conversations stay yours. Cloud-based meeting tools process your audio on remote servers — often to train models, serve ads, or comply with data requests. Robota runs everything locally. There is no server to breach, no retention policy to read, and no account to delete.
- No subscription tax. Pay nothing per month, per seat, or per hour of audio. The only "cost" is the disk space for a few audio files that get deleted automatically after transcription.
- Works without internet. Airplane mode? Corporate firewall? VPN that blocks third-party APIs? Robota does not care. Transcription and summarization are fully offline.
- Separate streams, better notes. Rather than mixing your voice with the call audio into one recording, Robota captures them as independent files. This means cleaner speaker separation — `[You]` and `[They]` segments — without sending audio to a diarization API.
- Plays well with your tools. Export polished Markdown notes directly to your Obsidian vault. Chat with your transcript using Apple Intelligence or any Ollama-compatible model. Configure everything with a plain JSON file. It fits into your workflow instead of replacing it.
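Capturing mic and system audio as separate files turns speaker attribution into a simple timestamp merge rather than a diarization problem. A minimal sketch of that idea, with illustrative types and names (not Robota's internals):

```swift
import Foundation

// Each transcribed segment already knows its speaker, because it came
// from a speaker-specific audio file (mic vs. system).
struct Segment {
    let speaker: String     // "[You]" (mic) or "[They]" (system audio)
    let start: TimeInterval // seconds from session start
    let text: String
}

/// Interleave the two per-source transcripts into one chronological list.
func mergeTranscripts(mic: [Segment], system: [Segment]) -> [Segment] {
    (mic + system).sorted { $0.start < $1.start }
}

let mic = [Segment(speaker: "[You]", start: 0.0, text: "Hi all."),
           Segment(speaker: "[You]", start: 9.2, text: "Works for me.")]
let system = [Segment(speaker: "[They]", start: 3.5, text: "Can we ship Friday?")]
let merged = mergeTranscripts(mic: mic, system: system)
// merged reads [You], [They], [You] in call order, with no speaker model involved.
```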
No dock icon. No onboarding wizard. A small lime in your menu bar that wakes up when you need it and disappears when you don't. The floating control pill anchors to the right edge of your screen — never far, never in the way.
Robota watches your audio devices in the background. When a headset or external mic appears — the classic signal that a call is starting — a quiet alert slides in with a one-click prompt to start recording. If you ignore it, it dismisses itself after ten seconds.
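The detection heuristic can be pictured as a diff between snapshots of the input-device list. A hedged sketch, in which the function name is invented for illustration; in a real app the snapshots would come from an audio-device change listener rather than hard-coded sets:

```swift
import Foundation

/// A device present now but not in the previous snapshot is treated as a
/// "call may be starting" signal (e.g. AirPods or a USB headset connecting).
func newlyAppearedDevices(previous: Set<String>, current: Set<String>) -> Set<String> {
    current.subtracting(previous)
}

let before: Set = ["MacBook Pro Microphone"]
let after: Set = ["MacBook Pro Microphone", "AirPods Pro"]
let appeared = newlyAppearedDevices(previous: before, current: after)
// appeared == ["AirPods Pro"] → show the one-click "start recording" prompt
```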
Your voice and the call audio are captured separately via ScreenCaptureKit, producing two clean files: mic.caf for what you said and system.caf for what everyone else said. This gives you accurate speaker attribution without the guesswork. Works transparently with Zoom, Teams, Google Meet, and any other conferencing app — ScreenCaptureKit captures mic audio passively without interfering with active calls.
Pause recording mid-call without stopping the session. The timer pauses, audio buffers are skipped, and the menu bar shows ⏸. Resume when you're ready — the elapsed time accounts for the gap.
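"The elapsed time accounts for the gap" means paused intervals are excluded from the displayed timer. One way to model that, as an illustrative sketch rather than the app's actual timer code:

```swift
import Foundation

/// Tracks recording time as a list of (resume, pause) intervals, so paused
/// gaps never count toward the elapsed total.
struct SessionClock {
    private var segments: [(start: Date, end: Date?)] = []

    mutating func resume(at date: Date) { segments.append((start: date, end: nil)) }
    mutating func pause(at date: Date) {
        if let last = segments.indices.last, segments[last].end == nil {
            segments[last].end = date
        }
    }
    /// Total recorded time, excluding paused gaps.
    func elapsed(now: Date) -> TimeInterval {
        segments.reduce(0) { $0 + ($1.end ?? now).timeIntervalSince($1.start) }
    }
}

let t0 = Date(timeIntervalSince1970: 0)
var clock = SessionClock()
clock.resume(at: t0)                          // record for 60 s
clock.pause(at: t0.addingTimeInterval(60))    // pause for 30 s
clock.resume(at: t0.addingTimeInterval(90))   // record for 10 s more
let total = clock.elapsed(now: t0.addingTimeInterval(100))
// total == 70: the 30 s pause is excluded
```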
Hardware-level acoustic echo cancellation via AVAudioEngine — the same technology FaceTime uses. It monitors system audio output as a reference signal and subtracts it from the mic input, so remote participants' voices don't appear as false [You] segments. Enabled by default, toggleable from the control pill.
Mark key moments during a call with a tap of the star button. Bookmarks appear as yellow markers in the review transcript and are included in Obsidian exports. Add optional labels to remember why a moment mattered.
Watch your conversation transcribe in real time in the floating widget. [You] segments appear in teal, [They] segments in amber. Non-final words appear dimmed until the recognizer confirms them. All powered by Apple SpeechAnalyzer — no network hop, no latency, no monthly API bill, no model to download.
Working across languages? Enable the globe toggle and each finalized transcript segment gets translated to your target language via Ollama. The original text stays visible; the translation appears below. Toggle off anytime without interrupting the recording.
Hit "Summarize" and get a structured breakdown — Key Decisions, Action Items, Discussion Points, and Next Steps — streamed live in the widget.
Apple Intelligence (default) uses the on-device ~3B model via FoundationModels. Zero setup on eligible hardware. Best for typical-length meetings.
Ollama is the power-user alternative with 16K token context for longer transcripts. Use any model you already have pulled — llama3.2 by default, but yours to configure.
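Talking to a local Ollama instance is a plain HTTP POST to localhost. This sketch follows Ollama's public `/api/generate` endpoint; the struct and function names are illustrative, not Robota's code:

```swift
import Foundation

// Request body for Ollama's /api/generate endpoint.
struct OllamaRequest: Encodable {
    let model: String
    let prompt: String
    let stream: Bool
}

/// Build a summarization request against the local Ollama server.
/// Nothing leaves the machine: the request targets localhost:11434.
func makeSummarizeRequest(transcript: String, model: String = "llama3.2") throws -> URLRequest {
    var request = URLRequest(url: URL(string: "http://localhost:11434/api/generate")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        OllamaRequest(model: model,
                      prompt: "Summarize this meeting transcript:\n\(transcript)",
                      stream: true)) // stream tokens for live display in the widget
    return request
}
```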
Different meetings need different formats. Choose from 6 built-in recipes — General, Sales Call, 1:1, Sprint Planning, Standup, and SWE Interview — or create your own custom recipes in Settings with a name, icon, and system prompt. The recipe picker appears in the review toolbar next to the Summarize button.
Type a question — "What did we decide about the timeline?" — and get an answer with context from the full transcript. Conversation history carries over, so follow-up questions work naturally. Available during live recording and after transcription.
Search across all your exported meeting notes from within the floating widget. Full-text search returns results with dates and context snippets, sorted by most recent. Click a result to read the full note with rendered markdown.
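Context snippets can be produced with straightforward string windowing. A pure-logic sketch of the idea (not Robota's search index):

```swift
import Foundation

/// Case-insensitive search returning a short window of text around the
/// first match, or nil when the query does not appear in the note.
func snippet(in note: String, matching query: String, radius: Int = 20) -> String? {
    guard let range = note.range(of: query, options: .caseInsensitive) else { return nil }
    // Clamp the window to the note's bounds.
    let lower = note.index(range.lowerBound, offsetBy: -radius, limitedBy: note.startIndex) ?? note.startIndex
    let upper = note.index(range.upperBound, offsetBy: radius, limitedBy: note.endIndex) ?? note.endIndex
    return String(note[lower..<upper])
}

let note = "We agreed the launch timeline moves to March, pending QA sign-off."
let hit = snippet(in: note, matching: "TIMELINE") // matches despite the case difference
```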
One click sends your transcript and summary to your Obsidian vault as a formatted Markdown note with YAML frontmatter — date, duration, language, tags, bookmarks, all included. Auto-save after transcription is available for zero friction.
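The exported note shape can be sketched as a template: YAML frontmatter followed by the Markdown body. Field names below mirror the description above, but the exact frontmatter Robota writes may differ:

```swift
import Foundation

/// Assemble an Obsidian-ready Markdown note with YAML frontmatter.
func obsidianNote(title: String, date: String, duration: String,
                  language: String, tags: [String], body: String) -> String {
    """
    ---
    date: \(date)
    duration: \(duration)
    language: \(language)
    tags: [\(tags.joined(separator: ", "))]
    ---

    # \(title)

    \(body)
    """
}

let note = obsidianNote(title: "Sprint Planning", date: "2025-01-15",
                        duration: "42m", language: "en",
                        tags: ["meeting", "robota"],
                        body: "## Summary\n- Shipped scope agreed.")
```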
Robota never connects to the internet. Audio is written to ~/Documents/Robota/ during recording and deleted after transcription. Transcripts are kept in memory only — they are not saved to disk. LLM processing happens on-device via Apple Intelligence or locally via Ollama on localhost:11434. There are no analytics, no telemetry, and no accounts.
Meeting audio is some of the most sensitive data you generate — business strategy, personnel discussions, client details, candid off-the-record moments. Robota starts from the position that this data belongs on your machine and nowhere else, not as a premium tier or a compliance checkbox, but as the only option.
All settings live in ~/Library/Application Support/com.worldtiki.robota/settings.json. The file is optional — all values have sensible defaults. Settings auto-save on every change.
```json
{
  "summarization_provider": "apple",
  "ollama_model": "llama3.2",
  "language": null,
  "live_transcript": false,
  "meeting_detection_enabled": true,
  "echo_suppression_enabled": true,
  "start_muted": false,
  "notifications_enabled": true,
  "obsidian_vault_path": "/path/to/vault",
  "obsidian_folder": "Meetings/Robota",
  "obsidian_auto_save": false,
  "obsidian_open_after_save": true,
  "translate_to_language": null,
  "custom_recipes": [],
  "last_recipe_id": "general"
}
```

| Key | Default | Description |
|---|---|---|
| `summarization_provider` | `"apple"` | LLM provider: `"apple"` (Apple Intelligence, on-device) or `"ollama"` (local Ollama) |
| `ollama_model` | `"llama3.2"` | Ollama model for summarization and chat (only when provider is `"ollama"`) |
| `language` | system locale | Transcription language as a locale identifier (e.g. `"en"`, `"pt-BR"`) |
| `live_transcript` | `false` | Auto-expand transcript panel when recording starts |
| `meeting_detection_enabled` | `true` | Auto-detect meetings via audio device monitoring |
| `echo_suppression_enabled` | `true` | Hardware echo cancellation for cleaner transcription |
| `start_muted` | `false` | Start recording with microphone muted |
| `notifications_enabled` | `true` | macOS notifications for transcription events |
| `obsidian_vault_path` | `null` | Path to Obsidian vault for note export |
| `obsidian_folder` | `"Meetings/Robota"` | Subfolder within vault for meeting notes |
| `obsidian_auto_save` | `false` | Auto-export to Obsidian after transcription |
| `obsidian_open_after_save` | `true` | Open the note in Obsidian after saving |
| `translate_to_language` | `null` | Target language for live translation (e.g. `"English"`, `"Portuguese"`) |
| `custom_recipes` | `[]` | User-defined summary recipes |
| `last_recipe_id` | `"general"` | Last-used summary recipe, persisted across sessions |
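"All values have sensible defaults" implies every key in the file is optional. One way to implement that with Codable, shown for three of the keys above; the struct name and approach are a sketch, not necessarily how the app does it:

```swift
import Foundation

// Each key falls back to its documented default when missing from the file.
struct RobotaSettings: Decodable {
    var summarizationProvider = "apple"
    var ollamaModel = "llama3.2"
    var obsidianAutoSave = false

    enum CodingKeys: String, CodingKey {
        case summarizationProvider = "summarization_provider"
        case ollamaModel = "ollama_model"
        case obsidianAutoSave = "obsidian_auto_save"
    }

    init(from decoder: Decoder) throws {
        let c = try decoder.container(keyedBy: CodingKeys.self)
        summarizationProvider = try c.decodeIfPresent(String.self, forKey: .summarizationProvider) ?? "apple"
        ollamaModel = try c.decodeIfPresent(String.self, forKey: .ollamaModel) ?? "llama3.2"
        obsidianAutoSave = try c.decodeIfPresent(Bool.self, forKey: .obsidianAutoSave) ?? false
    }
}

// A partial settings file still decodes; unspecified keys take their defaults.
let json = Data(#"{"summarization_provider": "ollama"}"#.utf8)
let settings = try! JSONDecoder().decode(RobotaSettings.self, from: json)
// settings.summarizationProvider == "ollama"; ollamaModel falls back to "llama3.2"
```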
```
ScreenCaptureKit (single SCStream)
 ├── .audio      → system.caf → SpeechAnalyzer (live + batch)
 └── .microphone → mic.caf    → SpeechAnalyzer (live + batch)
        ↕ AVAudioEngine AEC (echo cancellation)

After recording stops:

  SpeechAnalyzer batch → transcript (in-memory)
  Audio files          → deleted
  Transcript           → Apple Intelligence → structured summary (default)
                       → Ollama             → structured summary (alternative)
                       → ObsidianExporter   → vault
                       → chat (ask questions)
```
Both audio sources flow through one SCStream — system audio as .audio output, mic as .microphone output (captureMicrophone = true). ScreenCaptureKit captures mic audio passively at the system mixer level, so device switches during calls are handled transparently by the OS without interfering with Zoom, Teams, or other apps.
Contributions are welcome! See CONTRIBUTING.md for development setup, code conventions, and how to submit changes.



