🦉 psibot: Always-On AI Agent for Telegram

Your own personal AI assistant that runs 24/7 on your Mac. Powered by Claude Agent SDK + your Max subscription. $0 API costs.

A persistent, multimodal AI assistant that runs on your own hardware as a macOS daemon. Chat through Telegram with voice, images, and text. Manage scheduled tasks through a web dashboard. Let it work autonomously while you sleep.

Built on the Claude Agent SDK — authenticates via OAuth with your existing Claude Max subscription. No API keys, no per-token billing, no surprise costs. If you're already paying for Max, psibot is effectively free to run.

How is this different from OpenClaw? OpenClaw is a general-purpose AI assistant framework. psibot is purpose-built for a single user running Claude on their own Mac: optimized for Telegram, designed as an always-on daemon, and requiring zero API spend beyond your existing Max plan.

✨ Key Features

$0 API Costs — Uses your Claude Max subscription via OAuth. No API keys, no per-token billing. Run Opus, Sonnet, and Haiku across all agents at no additional cost.

Always-On Daemon — Runs as a macOS LaunchAgent. Survives reboots, handles wake/sleep cycles, stays connected to Telegram 24/7, and runs autonomous tasks on a schedule.

Telegram-Native — First-class Telegram bot with voice messages, photo understanding, text-to-speech replies, and inline command menus. Not a web wrapper — a real bot experience.

Multimodal AI — Generates and edits images (Gemini), speaks with a neural voice (Edge TTS), transcribes voice messages (parakeet STT), and analyzes YouTube videos with semantic search.

Persistent Memory — Maintains knowledge files, daily logs, and structured memory across sessions. Learns about you over time and remembers context between conversations.

Autonomous Subagents — Spawns specialized agents: a coder (isolated git worktrees), a researcher (browser automation), an image generator, and an audio processor — each on the optimal model.

Intelligent Inbox — Captures content from Chrome bookmarks, Reddit saves, GitHub stars, and Telegram messages. A multi-phase heartbeat pipeline triages items with value-extraction, enriches them with web research, detects thematic clusters, and surfaces a prioritized digest with action buttons.

On-Demand Research — /research for quick scans (GLM + web search) or /research deep for full Claude-powered analysis. Research notes are saved to NotePlan with bidirectional wikilinks connecting related topics.

Progressive Autonomy — Learns from your feedback on digest items. Starts manual, earns trust through consistent agreement, and gradually auto-archives or auto-researches items matching learned patterns. Resets to manual on any override.

Scheduled Tasks — Cron-based job scheduling with budget controls. Periodic maintenance, reminders, or any recurring prompt with configurable quiet hours.

MCP Tool Ecosystem — Extensible via Model Context Protocol servers. Built-in tools for memory, browser automation, Telegram media, git worktrees, YouTube analysis, and more. Supports a secondary GLM backend (Z.AI) with web search, page reading, and GitHub repo analysis MCP servers.

Web Dashboard — HTMX + SSE streaming interface for real-time chat, job management, memory browsing, and log viewing.

🎨 Image Editing	🌐 Browser Automation	🔊 Agentic Audio

Generate • Edit • Send	Navigate • Read • Research	Transcribe • Speak • Listen

💡 Why Claude Agent SDK + Max?

Most AI agent frameworks (OpenClaw, nanobot, etc.) require API keys and charge per token. If you're already paying for Claude Max, that means paying twice for the same models. psibot takes a different approach:

	API-based agents	psibot (Max subscription)
Authentication	API key management	OAuth via `claude` CLI
Cost model	Pay per token	Fixed monthly subscription
Bill risk	Uncapped, usage-dependent	Zero additional cost
Model access	Depends on tier/budget	Opus, Sonnet, Haiku — all included
Setup	Generate keys, set budgets, monitor spend	`claude login` and go

🏗️ Architecture

🚀 Quick Start

Prerequisites

macOS (Apple Silicon or Intel)
Xcode Command Line Tools (xcode-select --install)
Claude CLI (npm install -g @anthropic-ai/claude-code) — authenticated with claude login
A Telegram bot token

Automated Setup

The setup script handles everything from a fresh clone: Homebrew, Bun, dependencies, .env configuration, CLI linking, and daemon installation.

git clone https://github.com/DmacMcgreg/psibot.git
cd psibot
bash scripts/setup.sh

The script will:

Install Homebrew (if missing), then bun, sqlite, and yt-dlp
Install uv (Python tool runner)
Run bun install for node packages
Create .env from template and prompt for your Telegram bot token and user IDs
Link the psibot CLI and install the macOS LaunchAgent daemon
Optionally install edge-tts, mlx-audio, and tailscale

After setup, start the daemon:

psibot start

Uninstall

To fully remove psibot (daemon, CLI, and optionally data/dependencies):

bash scripts/uninstall.sh

Manual Setup

If you prefer to set things up manually, follow these steps:

1. Clone and install

git clone https://github.com/DmacMcgreg/psibot.git
cd psibot
bun install
bun link          # Makes the 'psibot' command available globally

2. Configure

cp .env.example .env

Edit .env with your settings:

TELEGRAM_BOT_TOKEN=123456:ABC-DEF...
ALLOWED_TELEGRAM_USER_IDS=123456789
PORT=3141
DEFAULT_MODEL=claude-opus-4-6
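
To illustrate how these settings might be read at startup, here is a minimal sketch. The Project Structure section lists src/config.ts as Zod-validated; this sketch deliberately uses plain TypeScript so it runs without dependencies. The `loadConfig` function and the `AppConfig` shape are assumptions made for illustration, not the project's actual code.

```typescript
// Hypothetical config loader; names and shape are illustrative only.
interface AppConfig {
  telegramBotToken: string;
  allowedUserIds: number[];
  port: number;
  defaultModel: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): AppConfig {
  const token = env.TELEGRAM_BOT_TOKEN?.trim();
  if (!token) throw new Error("TELEGRAM_BOT_TOKEN is required");

  // Comma-separated list of numeric Telegram user IDs.
  const allowedUserIds = (env.ALLOWED_TELEGRAM_USER_IDS ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((s) => {
      const id = Number(s);
      if (!Number.isSafeInteger(id) || id <= 0) {
        throw new Error(`Invalid Telegram user ID: ${s}`);
      }
      return id;
    });
  if (allowedUserIds.length === 0) {
    throw new Error("ALLOWED_TELEGRAM_USER_IDS is required");
  }

  // Fall back to the documented default port when PORT is unset.
  const port = Number(env.PORT ?? "3141");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return {
    telegramBotToken: token,
    allowedUserIds,
    port,
    defaultModel: env.DEFAULT_MODEL ?? "claude-opus-4-6",
  };
}
```

Under Bun, which loads .env files automatically, a call such as `loadConfig(process.env)` would pick up the values above. Failing fast on a missing token or an empty allowlist keeps a misconfigured daemon from starting silently.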

3. Run

# Development (with hot reload)
bun run dev

# Production
bun run start

4. Deploy as daemon (macOS)

psibot install   # Install LaunchAgent
psibot start     # Start the daemon
psibot status    # Check status
psibot logs      # Tail logs

Optional Dependencies

Tool	Purpose	Install
uv	Python tool runner for audio tools	`curl -LsSf https://astral.sh/uv/install.sh | sh`
mlx-audio	STT (parakeet) on Apple Silicon	`uv tool install mlx-audio`
edge-tts	Text-to-speech via Microsoft Edge neural voices	`pip install edge-tts`
Gemini API key	Image generation via Gemini	Set `GEMINI_API_KEY` in `.env`
Tailscale	Remote access to web dashboard + Funnel for webhooks + Wake-on-LAN packets	Install from tailscale.com/download

⚙️ Configuration

Environment Variables

Variable	Default	Description
`TELEGRAM_BOT_TOKEN`	(required)	Telegram bot API token
`ALLOWED_TELEGRAM_USER_IDS`	(required)	Comma-separated authorized user IDs
`PORT`	`3141`	Web dashboard port
`DEFAULT_MODEL`	`claude-opus-4-6`	Model for the main agent
`DEFAULT_MAX_BUDGET_USD`	`1.00`	Max cost per agent run
`HEARTBEAT_ENABLED`	`true`	Enable periodic autonomous heartbeat
`HEARTBEAT_INTERVAL_MINUTES`	`30`	Minutes between heartbeats
`HEARTBEAT_QUIET_START`	`23`	Quiet hours start (hour, 24h)
`HEARTBEAT_QUIET_END`	`8`	Quiet hours end (hour, 24h)
`HEARTBEAT_MAX_BUDGET_USD`	`0.50`	Max cost per heartbeat run
`PSIBOT_DIR`	`~/.psibot`	Worktree and repo storage
`GLM_AUTH_TOKEN`	(optional)	Z.AI API token for GLM backend (triage, quick scan, themes)
`GLM_BASE_URL`	`https://api.z.ai/api/anthropic`	GLM API base URL
`REDDIT_USERNAME`	(optional)	Reddit username for API User-Agent
`YOUTUBE_CLIENT_ID`	(optional)	Google OAuth client ID for YouTube
`YOUTUBE_CLIENT_SECRET`	(optional)	Google OAuth client secret
`YOUTUBE_SOURCE_PLAYLIST_ID`	(optional)	Playlist to process videos from
`YOUTUBE_DESTINATION_PLAYLIST_ID`	(optional)	Playlist to move processed videos to
`GEMINI_API_KEY`	(optional)	Gemini API key for image gen + video embeddings
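
With the defaults above, the quiet window runs from 23 to 8 and therefore wraps past midnight. A minimal sketch of how such a check could work follows. It assumes the end hour is exclusive and that equal start and end values mean no quiet window; `isQuietHour` is a hypothetical helper, not taken from the psibot source.

```typescript
// Returns true when `hour` (0-23) falls inside the quiet window.
// The window may wrap past midnight, e.g. start=23, end=8.
function isQuietHour(hour: number, start: number, end: number): boolean {
  if (start === end) return false; // assumption: equal bounds disable quiet hours
  if (start < end) {
    // Same-day window, e.g. 1 -> 5.
    return hour >= start && hour < end;
  }
  // Wrapping window, e.g. 23 -> 8: late evening OR early morning.
  return hour >= start || hour < end;
}
```

A scheduler could call `isQuietHour(new Date().getHours(), 23, 8)` before each heartbeat and skip the run when it returns true.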

Webhook Mode (Optional)

For reliable message delivery through network changes and sleep/wake cycles, enable webhook mode via Tailscale Funnel:

TELEGRAM_WEBHOOK_ENABLED=true
TELEGRAM_WEBHOOK_HOST=your-machine.tailnet-name.ts.net
TELEGRAM_WEBHOOK_PORT=8443
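
The Telegram Bot API only accepts HTTPS webhooks on ports 443, 80, 88, and 8443, so it can help to validate the port before registering. The sketch below shows one way to assemble the public URL from the two settings above. The `buildWebhookUrl` helper and the `/telegram/webhook` path in the usage note are hypothetical; the route psibot actually registers is not documented here.

```typescript
// Ports the Telegram Bot API accepts for webhooks.
const TELEGRAM_WEBHOOK_PORTS = new Set<number>([443, 80, 88, 8443]);

// Builds the public HTTPS URL Telegram would call.
// `path` is a hypothetical route chosen by the caller.
function buildWebhookUrl(host: string, port: number, path: string): string {
  if (!TELEGRAM_WEBHOOK_PORTS.has(port)) {
    throw new Error(`Port ${port} is not supported for Telegram webhooks`);
  }
  const url = new URL(`https://${host}`);
  url.port = String(port); // the default HTTPS port (443) is omitted automatically
  url.pathname = path;
  return url.toString();
}
```

For example, `buildWebhookUrl("your-machine.tailnet-name.ts.net", 8443, "/telegram/webhook")` yields `https://your-machine.tailnet-name.ts.net:8443/telegram/webhook`.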

Heartbeat Pipeline

The heartbeat orchestrator runs every 30 minutes (configurable) and executes a multi-phase pipeline:

Intake — Triages pending items using GLM value-extraction (technique, tool, actionable, or drop)
Quick Scan — Enriches top items with external context via web search (general) or zread (GitHub repos)
Signal Scoring — Scores items against your dependencies, workflow gaps, momentum, and decay signals
Inbox Watcher — Checks NotePlan inbox for user-tagged notes (research, watch) and dispatches actions
Theme Clustering — Groups related items into auto-named themes via GLM batch analysis
Surfacing — Sends a Telegram digest with per-item action buttons (Research, Watch, Archive, Drop)

Items scoring above threshold (dependency match + workflow gap) are auto-queued for deep research. Each user action feeds back into the progressive autonomy system.
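
The feedback loop can be pictured as a small per-pattern state machine: consistent agreement builds trust, and a single override drops back to manual. The sketch below is an illustration under stated assumptions. The `PatternTrust` shape, the `recordFeedback` function, and the threshold of 5 are all hypothetical; the actual logic in src/heartbeat/autonomy.ts may differ.

```typescript
type Mode = "manual" | "auto";

interface PatternTrust {
  mode: Mode;
  agreements: number; // consecutive times the user agreed with the suggestion
}

// Hypothetical number of consecutive agreements needed before automating.
const TRUST_THRESHOLD = 5;

// Returns the next trust state after one piece of user feedback.
function recordFeedback(state: PatternTrust, userAgreed: boolean): PatternTrust {
  if (!userAgreed) {
    // Any override resets the pattern to manual handling.
    return { mode: "manual", agreements: 0 };
  }
  const agreements = state.agreements + 1;
  return {
    mode: agreements >= TRUST_THRESHOLD ? "auto" : "manual",
    agreements,
  };
}
```

Returning a new state object instead of mutating the old one makes each transition easy to log and to persist.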

Requires GLM_AUTH_TOKEN for triage, quick scan, and theme clustering. Without it, the pipeline skips GLM-dependent phases.
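
One way to picture the phase skipping is a static phase list with a flag for GLM dependence. This is a sketch, not the orchestrator's real code: the phase names follow the list above, while the `Phase` type, the `needsGlm` flag, and `planPhases` are hypothetical.

```typescript
interface Phase {
  name: string;
  needsGlm: boolean; // true for the phases documented as GLM-dependent
}

// Pipeline order as described in this section.
const PHASES: readonly Phase[] = [
  { name: "intake", needsGlm: true }, // triage
  { name: "quick-scan", needsGlm: true },
  { name: "signal-scoring", needsGlm: false },
  { name: "inbox-watcher", needsGlm: false },
  { name: "theme-clustering", needsGlm: true },
  { name: "surfacing", needsGlm: false },
];

// Returns the names of the phases that would run for a given token value.
function planPhases(glmAuthToken: string | undefined): string[] {
  const hasGlm = Boolean(glmAuthToken && glmAuthToken.trim());
  return PHASES.filter((p) => hasGlm || !p.needsGlm).map((p) => p.name);
}
```

Without a token, `planPhases(undefined)` keeps only signal scoring, the inbox watcher, and surfacing.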

YouTube Video Processing (Optional)

psibot can analyze YouTube videos — extracting transcripts, generating structured summaries with Claude, and storing vector embeddings for semantic search. Transcripts are pulled via yt-dlp (no API quota), while playlist management uses the YouTube Data API via OAuth.

1. Create Google OAuth credentials

Go to Google Cloud Console
Create a project (or select an existing one)
Enable the YouTube Data API v3 under APIs & Services > Library
Go to APIs & Services > Credentials > Create Credentials > OAuth client ID
Application type: Web application
Add an authorized redirect URI:
- With Tailscale Funnel: https://your-machine.tailnet-name.ts.net/auth/youtube/callback
- Local only: http://127.0.0.1:3141/auth/youtube/callback (use your PORT value)
Copy the Client ID and Client Secret

If your app is in "Testing" mode on the OAuth consent screen, add your Google account as a test user.

2. Add credentials to `.env`

YOUTUBE_CLIENT_ID=your-client-id.apps.googleusercontent.com
YOUTUBE_CLIENT_SECRET=your-client-secret
GEMINI_API_KEY=your-gemini-key   # Required for vector embeddings

3. Authorize the app

With psibot running, ask the agent to start YouTube OAuth setup (or use the youtube_oauth_setup tool). It will return a Google authorization URL. Open it in your browser, grant access, and the callback saves tokens to ~/.psibot/youtube-oauth.json. Tokens auto-refresh; you only need to do this once.

4. Get playlist IDs

Playlist IDs are the string after list= in a YouTube playlist URL:

https://www.youtube.com/playlist?list=PLxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                                      This is the playlist ID
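
If you want to pull the ID out programmatically, the standard `URL` API can read the `list` query parameter. A minimal sketch follows; `extractPlaylistId` is a hypothetical helper, not one of psibot's tools.

```typescript
// Returns the value of the `list` query parameter, or null when the input
// is not a valid URL or carries no playlist ID.
function extractPlaylistId(input: string): string | null {
  try {
    const id = new URL(input).searchParams.get("list");
    return id ? id : null;
  } catch {
    return null; // input was not a parseable URL
  }
}
```

This also works for watch URLs that carry a playlist, such as `https://www.youtube.com/watch?v=...&list=PL...`, because only the `list` parameter is inspected.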

Add them to .env:

YOUTUBE_SOURCE_PLAYLIST_ID=PLxxxxx      # Videos to process (e.g. "Watch Later")
YOUTUBE_DESTINATION_PLAYLIST_ID=PLyyyyy  # Where processed videos are moved

The agent processes videos from the source playlist, analyzes them, and moves them to the destination. You can also analyze individual videos by URL without any playlist configuration.

Agent tools

Tool	Description
`youtube_summarize`	Analyze a single video by URL or ID
`youtube_search`	Semantic search across all stored video analyses
`youtube_list`	List stored videos with keyword/channel filters
`youtube_get`	Get full analysis for a specific video
`youtube_process_playlist`	Batch-process videos from source playlist
`youtube_playlist_status`	Show processing stats and pending videos

💬 Telegram Commands

Command	Description
`/ask <prompt>`	Send a message to the agent
`/research <url|id>`	Quick scan research on a URL or inbox item
`/research deep <url|id>`	Full deep research with NotePlan note creation
`/new`	Start a fresh conversation session
`/sessions`	List and resume previous sessions
`/fork <id>`	Fork an existing session into a new conversation
`/jobs`	List scheduled jobs with inline controls
`/memory`	Browse agent memory
`/status`	Show system status
`/model <name>`	Switch model (opus, sonnet, haiku)
`/verbose`	Toggle tool call feedback

Send voice messages for automatic transcription and response. Send photos with optional captions for image-aware conversations. Reply to research messages to ask follow-up questions with full context.
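
The two `/research` forms differ only in the optional `deep` keyword. The sketch below shows how that distinction could be parsed from raw message text. It is independent of grammy's own command handling, and `parseResearchCommand` and `ResearchRequest` are hypothetical names used for illustration.

```typescript
interface ResearchRequest {
  mode: "quick" | "deep";
  target: string; // URL or inbox item ID, passed through unvalidated
}

// Parses "/research <target>" or "/research deep <target>".
// Returns null for any other text or when the target is missing.
function parseResearchCommand(text: string): ResearchRequest | null {
  const parts = text.trim().split(/\s+/);
  if (parts[0] !== "/research") return null;
  const deep = parts[1] === "deep";
  const target = parts.slice(deep ? 2 : 1).join(" ");
  if (!target) return null;
  return { mode: deep ? "deep" : "quick", target };
}
```

For instance, `parseResearchCommand("/research deep 42")` yields a deep request for item `42`, while `/research https://example.com` yields a quick scan.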

📁 Project Structure

src/
  index.ts                  # Entry point
  config.ts                 # Zod-validated env config
  agent/
    index.ts                # AgentService (query with MCP + subagents)
    tools.ts                # agent-tools MCP server
    media-tools.ts          # media-tools MCP server
    glm-mcp.ts              # Z.AI MCP servers (web search, zread)
    subagents.ts            # Subagent definitions
    prompts.ts              # System prompt builder
  telegram/
    index.ts                # Bot setup + auth middleware
    commands.ts             # Command & media handlers
    keyboards.ts            # Inline keyboards + callback handlers
    format.ts               # Message formatting
    webhook.ts              # Webhook mode (Tailscale Funnel)
  heartbeat/
    index.ts                # Orchestrator pipeline (intake -> scan -> surface)
    signals.ts              # Contextual intelligence signal scorer
    inbox-watcher.ts        # NotePlan inbox tag-based actions
    themes.ts               # Automatic theme clustering via GLM
    autonomy.ts             # Progressive autonomy learning loop
  research/
    index.ts                # Deep research (GLM quick + Claude full)
    quick-scan.ts           # Platform-aware enrichment (zread, web search)
    knowledge-linker.ts     # Bidirectional wikilinks for research notes
  triage/
    index.ts                # Value-extraction triage via GLM
  scheduler/
    index.ts                # Cron + one-off job scheduling
    executor.ts             # Job execution via agent
  memory/
    index.ts                # Knowledge files, search, daily logs
  browser/
    index.ts                # Browser automation wrapper
  db/
    index.ts                # SQLite (WAL mode)
    schema.ts               # Migrations
    queries.ts              # Prepared statements
  web/
    index.ts                # Hono app + IP allowlist
    routes/                 # Chat, jobs, memory, logs, mini-app
    views/                  # HTMX templates
  shared/
    types.ts                # Type definitions
    logger.ts               # Timestamped logging
extensions/
  psibot-capture/           # Chrome extension for X bookmarks capture
knowledge/
  IDENTITY.md               # Agent persona
  USER.md                   # Learned user context
  TOOLS.md                  # Tool documentation
  HEARTBEAT.md              # Orchestrator pipeline reference
  memory.md                 # Persistent memory
  memory/                   # Daily logs
data/
  app.db                    # SQLite database
  images/                   # Generated images
  audio/                    # TTS output
  media/                    # Inbound Telegram media

🧱 Stack

Component	Technology
Runtime	Bun
Agent	@anthropic-ai/claude-agent-sdk
Bot	grammy
Web	Hono + HTMX + SSE
Database	SQLite (bun:sqlite, WAL mode, FTS)
Scheduling	croner
Validation	Zod
Image Gen	Gemini API
TTS	Edge TTS (Sonia British neural voice)
STT	parakeet (via mlx-audio)
Browser	agent-browser

Notes

macOS Full Disk Access: If the project lives in ~/Documents (or ~/Desktop, ~/Downloads), Bun needs Full Disk Access. Grant it in System Settings > Privacy & Security > Full Disk Access, then add the Bun binary (typically /opt/homebrew/bin/bun). Without this, the daemon will fail with TCC permission errors.
mlx-audio PATH: uv tool install mlx-audio places commands in ~/.local/bin/. The launcher script includes this in PATH automatically, but your interactive shell also needs it — uv adds it to your shell profile during installation.
LaunchAgent quirk: The launcher script uses bun --cwd instead of plist WorkingDirectory to avoid a Bun getcwd() deadlock under launchd.
PATH for launchd: The launcher exports ~/.local/bin and /opt/homebrew/bin, which are needed for mlx-audio commands and the claude CLI (Agent SDK OAuth).

Acknowledgments

Inspired by OpenClaw and nanobot. psibot started as an experiment to see how far you could push the Claude Agent SDK with just a Telegram bot and a Max subscription.

License

MIT
