Aquarius 🌊

A native macOS AI agent that lives on your screen.

Aquarius is an embodied AI assistant that manifests as a floating, animated "ghost": a living Metal shader that breathes, pulses, and reacts to your voice. It can see your screen, control your interface, and handle complex tasks while you focus on what matters.

Aquarius Logo

✨ Features

🗣️ Natural Voice Interaction

  • Push-to-Talk: Hold Control + G to speak; the ghost awakens and listens
  • Real-time voice conversations powered by Gemini Live
  • Bidirectional audio streaming with low latency

🧠 Dual-Brain Architecture

| Brain | Model | Purpose |
| --- | --- | --- |
| Fast Brain | Gemini 2.5 Flash (Native Audio) | Instant conversational responses |
| Deep Brain | Gemini 3 Pro | Complex reasoning & analysis |

👁️ Vision & Awareness

  • Screen Sharing: Say "share my screen" and the agent can see what you see
  • Visual Halo: A subtle glow around your screen indicates when sharing is active
  • Real-time screen analysis for context-aware assistance

🛠️ Extensive Tool System

Aquarius uses a discoverable tool architecture. Just describe what you need in natural language:

  • "Run npm install" → finds and uses shell tool
  • "Open Safari" → finds and uses open_application tool
  • "Edit the config file" → finds and uses text_editor tool
  • "Click on the submit button" → finds and uses click_mouse tool
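
The matching logic behind discovery is not documented in this README. As a rough mental model only, the Python sketch below (Python for brevity; the app itself is Swift) picks a tool by word overlap. The tool names come from the examples above; the keyword sets, `TOOLS`, and `find_tool` are hypothetical, and the real agent may well rely on the model itself rather than keyword counts.

```python
import re
from typing import Optional

# Hypothetical keyword index. Tool names come from this README;
# the keywords and the scoring rule are illustrative assumptions.
TOOLS = {
    "shell": {"run", "npm", "install", "command", "terminal"},
    "open_application": {"open", "launch", "safari", "app"},
    "text_editor": {"edit", "file", "config", "create"},
    "click_mouse": {"click", "button", "press"},
}

def find_tool(request: str) -> Optional[str]:
    """Return the tool whose keywords overlap the request most, or None."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    best, best_score = None, 0
    for name, keywords in TOOLS.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = name, score
    return best
```

With this toy index, `find_tool("Run npm install")` resolves to `shell`, and a request that matches no keyword returns `None`.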

Available Tools:

Developer Tools

  • shell: Execute terminal commands
  • text_editor: View, create, and edit files with precise replacements
  • analyze_code: Extract symbols, functions, classes from codebases
  • search_files: Grep through directories
  • list_directory / find_files: Navigate filesystems

Automation Tools

  • click_mouse / type_text: Control mouse and keyboard
  • run_applescript: Native macOS automation
  • open_application / close_application / focus_application: App management
  • find_element_coordinates: UI element discovery

Memory System

  • save_memory: Persist information long-term
  • search_memory: Recall stored knowledge
  • update_core_profile: Store user preferences & facts

Background Tasks & Task Queue

  • Long-running research and analysis tasks
  • Async execution with completion notifications
  • Configurable tool access levels
  • Idle-triggered execution: queued tasks run when you're inactive
  • Scheduled tasks: "remind me in 10 minutes", "check this at 3pm"
  • Repeatable tasks: daily, weekly, or custom intervals
  • Beautiful task overlay: ask "what's in your task list?" for an ethereal UI

⏰ Task Queue & Scheduling

  • "Remind me to check the build in 10 minutes" → schedules a reminder
  • "Research competitor pricing when I'm not busy" → queues for idle time
  • "Check my email every morning at 9am" → creates repeatable task
  • "What's in your task list?" → shows ethereal overlay with all tasks

Task Priorities:

| Priority | Behavior |
| --- | --- |
| Immediate | Runs now in background |
| When Idle | Waits for user inactivity |
| Scheduled | Runs at specific time |
| Reminder | Notifies user at specific time |
| Repeatable | Runs at intervals (daily/weekly/custom) |
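
One way to read the priority list is as a single "is this task due?" check. The Python sketch below is a hypothetical model of that idea, not the code in TaskManager.swift; `Priority`, `Task`, `is_due`, and `reschedule` are illustrative names, and the app's real scheduling rules may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

class Priority(Enum):
    IMMEDIATE = "immediate"
    WHEN_IDLE = "when_idle"
    SCHEDULED = "scheduled"
    REMINDER = "reminder"
    REPEATABLE = "repeatable"

@dataclass
class Task:
    name: str
    priority: Priority
    run_at: Optional[datetime] = None     # scheduled, reminder, repeatable
    interval: Optional[timedelta] = None  # repeatable only

def is_due(task: Task, now: datetime, user_idle: bool) -> bool:
    """Decide whether a task may start right now."""
    if task.priority is Priority.IMMEDIATE:
        return True
    if task.priority is Priority.WHEN_IDLE:
        return user_idle
    # Scheduled, reminder, and repeatable tasks all wait for their time.
    return task.run_at is not None and now >= task.run_at

def reschedule(task: Task) -> None:
    """Advance a repeatable task to its next run time."""
    if task.priority is Priority.REPEATABLE and task.run_at and task.interval:
        task.run_at += task.interval
```

In this model a reminder and a scheduled task differ only in what happens when they fire (notify versus run), which is why they share the same time check.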

Task Management:

  • "Cancel that reminder" → uses cancel_scheduled tool
  • "Stop the research task" → uses cancel_task tool
  • "What tasks are pending?" → uses get_tasks to list all tasks with IDs
  • "Show me my tasks" → displays ethereal overlay via show_task_list

🧠 Self-Learning Agent

Aquarius learns from your conversations, behavior, and connected services to become more helpful over time:

  • Pattern Detection: Extracts topics from conversations using NLP
  • Time-Based Learning: Notices when you ask about certain things
  • Recurring Interests: Tracks topics you frequently discuss
  • Proactive Suggestions: Offers help based on learned patterns
  • Automation Suggestions: Proposes tasks that could be automated
  • MCP Exploration: Proactively explores connected services (email, calendar, files) to discover patterns
  • Semantic Deduplication: Uses AI to merge similar patterns, keeping your learning store clean

Learning Sources:

| Source | What's Learned |
| --- | --- |
| Conversations | Topics, preferences, workflows from your chats |
| MCP Tools | Patterns from email, calendar, files (e.g., meeting habits, frequent contacts) |
| Ambient Audio | Optionally learns from meetings via on-device transcription |

How Learnings Are Used:

| Method | What's Included | When |
| --- | --- | --- |
| Auto-injected in context | High-confidence patterns only (≥75%) | Every conversation |
| get_patterns tool | All patterns (any confidence) | On request |
| Proactive suggestions | Contextually relevant patterns | During idle time |
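
The split between auto-injected and on-request patterns comes down to a confidence cutoff. A minimal Python sketch of that rule follows, assuming confidence is stored as a value between 0 and 1; the `Pattern` type and both function names are hypothetical, and only the 75% threshold comes from this README.

```python
from dataclasses import dataclass

AUTO_INJECT_THRESHOLD = 0.75  # the 75% cutoff stated in this README

@dataclass
class Pattern:
    description: str
    confidence: float  # assumed to be stored as 0.0 to 1.0

def patterns_for_context(patterns):
    """Only high-confidence patterns are injected into every conversation."""
    return [p for p in patterns if p.confidence >= AUTO_INJECT_THRESHOLD]

def get_patterns(patterns):
    """The on-request path returns everything, regardless of confidence."""
    return list(patterns)
```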

Learning Examples:

  • "What have you learned about me?" → shows all learned patterns via get_patterns
  • "Forget everything you've learned" → clears all patterns
  • User frequently asks about weather at 8am → Aquarius may proactively mention weather
  • MCP exploration finds unread emails pile up on weekends → suggests automation
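
The weather-at-8am example suggests some form of counting by topic and time of day. How Aquarius actually detects this is not described, so the Python sketch below is only one plausible shape: `recurring_slots` and the `MIN_OCCURRENCES` cutoff are hypothetical.

```python
from collections import Counter

MIN_OCCURRENCES = 3  # hypothetical cutoff, not a documented setting

def recurring_slots(events):
    """events: (topic, hour_of_day) pairs, one per question asked.

    Returns the (topic, hour) slots seen at least MIN_OCCURRENCES times.
    """
    counts = Counter(events)
    return {slot: n for slot, n in counts.items() if n >= MIN_OCCURRENCES}
```

Four weather questions at 8am and a single email question at 9am would yield only the `("weather", 8)` slot.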

Privacy-First:

  • All learning happens locally on your Mac
  • No data leaves your device
  • Toggle learning on/off in Settings
  • Toggle MCP exploration separately in Settings
  • View and delete learned patterns anytime
  • Configure idle threshold for learning in Settings
  • Block patterns you don't want re-learned

💡 Proactive Suggestions

Aquarius can proactively offer helpful suggestions based on learned patterns:

Suggestion Types:

| Type | Example |
| --- | --- |
| Automation | "You manually check email every morning. Want me to summarize it for you?" |
| Prediction | "You have a meeting in 30 minutes. Want me to prep?" |
| Reminder | "You usually review PRs around this time" |
| Contextual | "I see you're in Xcode. Need help with Swift?" |

Smart Delivery:

  • Only suggests when you're not actively engaged
  • Rate-limited (configurable, default 2/hour max)
  • Never repeats similar suggestions within 24 hours
  • Tracks delivered messages to prevent duplicates across restarts
  • Minimum confidence threshold (configurable)
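
Taken together, these rules act as a gate in front of every suggestion. The Python sketch below is a hypothetical model of such a gate, not the app's implementation: `SuggestionGate` is an invented name, "similar" is simplified to an exact key match, and the 0.5 default confidence is an assumption (the README only says the threshold is configurable).

```python
from datetime import datetime, timedelta

class SuggestionGate:
    def __init__(self, max_per_hour=2, min_confidence=0.5,
                 repeat_window=timedelta(hours=24)):
        self.max_per_hour = max_per_hour      # default 2/hour per the README
        self.min_confidence = min_confidence  # assumed default
        self.repeat_window = repeat_window
        self.delivered = {}  # suggestion key -> last delivery time

    def allow(self, key, confidence, now, user_engaged):
        """Return True and record the delivery if the suggestion may be shown."""
        if user_engaged or confidence < self.min_confidence:
            return False
        last = self.delivered.get(key)
        if last is not None and now - last < self.repeat_window:
            return False  # same suggestion within the repeat window
        recent = [t for t in self.delivered.values()
                  if now - t < timedelta(hours=1)]
        if len(recent) >= self.max_per_hour:
            return False  # hourly rate limit reached
        self.delivered[key] = now
        return True
```

Persisting `delivered` to disk would cover the "across restarts" rule; that part is left out here.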

Natural Integration:

  • Suggestions are delivered as natural conversation, not intrusive alerts
  • Aquarius stays silent if a suggestion isn't relevant or was recently mentioned
  • You can configure suggestion types in Settings

Settings:

  • Toggle proactive suggestions on/off
  • Adjust suggestions per hour (1-5)
  • Set confidence threshold (50%-90%)
  • Enable/disable automation vs prediction suggestions

🧹 Intelligent Data Management

Aquarius automatically manages its data storage to prevent disk bloat:

Memory Cleanup:

  • Memories with time-sensitive content ("tomorrow", "next week") auto-expire
  • Unused memories gradually decay in relevance
  • Old, low-relevance memories are summarized using AI
  • Duplicate memories are automatically merged
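
The README does not say how relevance decays. Purely as an illustration, the Python sketch below models it as exponential decay with a made-up 30-day half-life; `HALF_LIFE_DAYS` and `decayed_relevance` are hypothetical, and the formula used by MemoryService.swift may be entirely different.

```python
from datetime import datetime, timedelta

HALF_LIFE_DAYS = 30.0  # hypothetical; no decay rate is documented

def decayed_relevance(relevance: float, last_used: datetime, now: datetime) -> float:
    """Halve the relevance for every HALF_LIFE_DAYS the memory goes unused."""
    days_unused = max((now - last_used).total_seconds() / 86400.0, 0.0)
    return relevance * 0.5 ** (days_unused / HALF_LIFE_DAYS)
```

Under these assumptions a memory with relevance 0.8 drops to 0.2 after 60 unused days, at which point a cleanup pass could summarize or merge it.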

Conversation Compaction:

  • Recent conversations (7 days) kept in full
  • Older conversations summarized into insights
  • Summaries preserved as learning patterns

Log Rotation:

  • Daily log files with 7-day retention
  • 5MB size limit per file with automatic rotation

🔌 MCP (Model Context Protocol) Support

Extend Aquarius with external tools via MCP servers:

  • Standardized Integration: Connect any MCP-compatible server
  • Tool Discovery: MCP tools are automatically discoverable
  • Visual Output: Servers with UI can show WebView popups
  • Proactive Exploration: Aquarius explores connected MCPs to learn patterns

Example MCP Servers:

  • Gmail/Email servers for inbox insights
  • Google Calendar for scheduling patterns
  • Filesystem server for file operations
  • Slack/Communication tools for collaboration patterns
  • Database connectors
  • Custom tool servers

MCP Exploration (Proactive Learning):

When idle, Aquarius can automatically explore your connected MCP servers to discover patterns and automation opportunities:

| Phase | What Happens |
| --- | --- |
| Phase 1: Individual Exploration | Each MCP gets its own agent with only its tools (parallel) |
| Phase 2: Cross-MCP Analysis | All reports analyzed together for patterns |

This enables insights like:

  • "You have 47 unread emails, mostly from automated systems"
  • "Meeting invites tend to arrive on Tuesdays"
  • "Files in ~/Projects haven't been accessed in 30 days"

Toggle MCP exploration in Settings → Learning → MCP Exploration.
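
The two exploration phases map naturally onto "fan out, then combine". The Python sketch below shows that shape with `asyncio`; it is an assumption-laden illustration rather than the logic in MCPExplorationService.swift. The function names, the report format, and the example tool names are all hypothetical, and the per-server "agent" is a stub.

```python
import asyncio

async def explore_server(name, tools):
    """Phase 1: one agent per MCP server, restricted to that server's tools."""
    # Stand-in for a real exploration run; it only reports the tools it saw.
    await asyncio.sleep(0)
    return {"server": name, "tools_seen": sorted(tools)}

def analyze_together(reports):
    """Phase 2: look across all per-server reports at once."""
    return {
        "servers": [r["server"] for r in reports],
        "total_tools": sum(len(r["tools_seen"]) for r in reports),
    }

async def explore_all(servers):
    # gather() runs the per-server explorations concurrently and
    # returns their reports in the order the servers were given.
    reports = await asyncio.gather(
        *(explore_server(name, tools) for name, tools in servers.items())
    )
    return analyze_together(reports)
```

A call such as `asyncio.run(explore_all({"gmail": ["search"], "calendar": ["list_events"]}))` returns one combined summary after every server has reported.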

Transport Types:

  • stdio: Local process (npx, python, node)
  • http: Remote HTTP server with SSE streaming
  • sse: Server-Sent Events

Configuration: Add servers via Settings → MCP tab, or edit ~/Library/Application Support/Aquarius/mcp_servers.json:

{
  "servers": {
    "filesystem": {
      "transport": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "~/Documents"],
      "enabled": true
    },
    "remote-api": {
      "transport": "http",
      "url": "https://mcp.example.com/api",
      "headers": {"Authorization": "Bearer token"},
      "enabled": true
    }
  }
}
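
Before hand-editing this file, it can help to sanity-check it. The Python sketch below is a hypothetical checker, not something Aquarius ships: it assumes stdio servers need a `command`, http and sse servers need a `url`, and a missing `enabled` key means disabled. None of those rules are stated in this README.

```python
import json

VALID_TRANSPORTS = {"stdio", "http", "sse"}  # the three transports listed above

def enabled_servers(config_text: str) -> dict:
    """Parse mcp_servers.json text and return only the enabled servers."""
    config = json.loads(config_text)
    result = {}
    for name, server in config.get("servers", {}).items():
        transport = server.get("transport")
        if transport not in VALID_TRANSPORTS:
            raise ValueError(f"{name}: unknown transport {transport!r}")
        # Assumed requirement: stdio needs a command, http/sse need a url.
        required = "command" if transport == "stdio" else "url"
        if required not in server:
            raise ValueError(f"{name}: missing {required!r}")
        if server.get("enabled", False):  # assumed default: disabled
            result[name] = server
    return result
```

Reading the file with `pathlib.Path(...).expanduser().read_text()` and passing the result to `enabled_servers` would surface a typo in a transport name before the app does.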

🎨 The Ghost

A beautiful, breathing visualization powered by Metal shaders:

  • Red: Listening to your voice
  • Purple: Speaking a response
  • Orange: Working on a task
  • Idle: Small dot, always present

📋 Requirements

🚀 Getting Started

Option A: Download Pre-built Release

  1. Download the latest Aquarius-unsigned.zip from Releases
  2. Unzip the archive
  3. Remove quarantine (required for unsigned apps):
    xattr -cr ~/Downloads/Aquarius.app
  4. Open the app normally

⚠️ The app is unsigned, so macOS Gatekeeper will block it without the xattr command above.

Option B: Build from Source

1. Clone the Repository

git clone git@github.com:Kvadratni/Aquarius.git
cd Aquarius

2. Generate the Xcode Project (Optional)

If you have xcodegen installed:

brew install xcodegen
xcodegen generate

3. Build & Run

Option A: Xcode GUI

open Aquarius.xcodeproj

Then build and run with Cmd + R in Xcode.

Option B: Command Line

# Clean build (removes previous build artifacts)
rm -rf build && xcodebuild -scheme Aquarius -destination 'platform=macOS' -derivedDataPath build build

# Open the built app
open build/Build/Products/Debug/Aquarius.app

4. First Launch Setup

  1. Enter your Gemini API Key in the setup window
  2. Grant Permissions when prompted:
    • Microphone: Required for voice input
    • Screen Recording: Required for screen sharing feature
    • Accessibility: Only needed for UI automation tools (mouse/keyboard control)

⌨️ Keyboard Shortcuts

| Shortcut | Action |
| --- | --- |
| Control + G | Toggle listening mode |

📋 Logging

Logs are written to a rotating log system:

~/Library/Application Support/Aquarius/Logs/aquarius-YYYY-MM-DD.log

Features:

  • Daily rotation with date-based filenames
  • Auto-rotates when file exceeds 5MB
  • Automatically deletes logs older than 7 days
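
The retention rules are simple enough to express in a few lines. The Python sketch below mirrors them as described, under stated assumptions: "5MB" is read as 5 MiB, the helper names are hypothetical, and LogManager.swift may name or prune files differently (in particular, how size-rotated files are named is not documented, so that part is omitted).

```python
import re
from datetime import date, timedelta

RETENTION_DAYS = 7
MAX_BYTES = 5 * 1024 * 1024  # "5MB", read here as 5 MiB (an assumption)

LOG_NAME = re.compile(r"^aquarius-(\d{4})-(\d{2})-(\d{2})\.log$")

def log_name_for(day: date) -> str:
    """Build the date-based filename shown above."""
    return f"aquarius-{day:%Y-%m-%d}.log"

def expired_logs(names, today: date):
    """Return the log files dated more than RETENTION_DAYS before today."""
    cutoff = today - timedelta(days=RETENTION_DAYS)
    expired = []
    for name in names:
        match = LOG_NAME.match(name)
        if match and date(*map(int, match.groups())) < cutoff:
            expired.append(name)
    return expired

def needs_rotation(size_bytes: int) -> bool:
    """True once the current file exceeds the size limit."""
    return size_bytes > MAX_BYTES
```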

To tail today's logs in real-time:

tail -f ~/Library/Application\ Support/Aquarius/Logs/aquarius-$(date +%Y-%m-%d).log

🔐 Permissions

Aquarius requires these permissions to function:

| Permission | Why It's Needed |
| --- | --- |
| Microphone | Voice input for conversations |
| Screen Recording | Screen sharing with the AI |
| Accessibility | UI automation (mouse/keyboard control); only needed if using automation tools |

💡 The global hotkey (Control + G) works without any permissions using macOS's Carbon API.

⚠️ The app runs outside the sandbox to enable global hotkey detection and system-wide automation.

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Aquarius App                            │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐      │
│  │   Ghost     │    │   Audio     │    │   Vision    │      │
│  │  (Metal)    │    │  Manager    │    │  Service    │      │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘      │
│         │                  │                  │             │
│         └──────────────────┼──────────────────┘             │
│                            │                                │
│                   ┌────────▼────────┐                       │
│                   │   AgentState    │                       │
│                   │  (ViewModel)    │                       │
│                   └────────┬────────┘                       │
│                            │                                │
│         ┌──────────────────┼──────────────────┐             │
│         │                  │                  │             │
│  ┌──────▼──────┐    ┌──────▼──────┐    ┌──────▼──────┐      │
│  │   Gemini    │    │  Reasoning  │    │    Tool     │      │
│  │    Live     │    │   Service   │    │  Registry   │      │
│  │ (Fast Brain)│    │(Deep Brain) │    │             │      │
│  └─────────────┘    └─────────────┘    └─────────────┘      │
└─────────────────────────────────────────────────────────────┘

📁 Project Structure

Aquarius/
├── AquariusApp.swift       # App entry point & window management
├── AgentState.swift        # Central state management & idle detection
├── ContentView.swift       # Ghost visualization
├── FloatingPanel.swift     # Transparent floating window
├── ScreenHaloView.swift    # Screen sharing indicator
├── TaskListOverlayView.swift # Ethereal task list UI
├── TaskListPanel.swift     # Floating panel for task overlay
├── MCPUIPanel.swift        # WebView panel for MCP UI output
├── MCPSettingsTab.swift    # MCP server management UI
├── Services/
│   ├── GeminiLiveService.swift    # WebSocket voice connection
│   ├── ReasoningService.swift     # Deep thinking integration
│   ├── ToolRegistry.swift         # Discoverable tools system
│   ├── TaskManager.swift          # Task queue, scheduling & execution
│   ├── LearningStore.swift        # Persistent pattern storage
│   ├── ObservationService.swift   # Behavior learning & analysis
│   ├── MemoryCleanupService.swift # Intelligent memory cleanup & summarization
│   ├── ScreenContextService.swift # Screen OCR & window tracking
│   ├── ConceptAnalysisService.swift # LLM concept extraction
│   ├── MCPClient.swift            # JSON-RPC client for MCP servers
│   ├── MCPManager.swift           # MCP server coordination
│   ├── MCPTypes.swift             # MCP data models
│   ├── MCPExplorationService.swift # Proactive MCP exploration for learning
│   ├── DeveloperService.swift     # Code & file operations
│   ├── ActionService.swift        # Mouse/keyboard automation
│   ├── AccessibilityService.swift # UI element discovery
│   ├── VisionService.swift        # Screen capture
│   ├── MemoryService.swift        # Long-term memory with decay
│   ├── LogManager.swift           # Rotating log system
│   ├── AudioManager.swift         # Mic & speaker handling
│   └── ...
└── Resources/
    ├── Shaders.metal       # Ghost visual effects
    └── logo.png

🀝 Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

πŸ“„ License

MIT License β€” see LICENSE for details.


Built with πŸ’œ for macOS
