ArgusBot is a Python supervisor plugin for Codex CLI and Claude Code CLI:
- Main agent executes the task through the selected runner backend
- Reviewer sub-agent evaluates completion (done/continue/blocked)
- Planner sub-agent maintains a live framework view and proposes next-session objectives
- Loop only stops when reviewer says done and all acceptance checks pass
This solves the common "agent stopped early and asked for next instruction" problem.
Current defaults:
- max_rounds defaults to 500.
- Daemon child model defaults now inherit the selected backend's defaults unless you explicitly set a preset/override.
- Daemon-launched idle runs try to resume from the last saved session_id before starting a fresh thread.
- Security risk: daemon-launched runs use --yolo by default. This grants the selected backend high local execution power. Run only in trusted repositories/workspaces.
- Visibility and debugging: Telegram/Feishu snippets may hide important details. If behavior looks wrong, run argusbot in the target workspace and watch local live output/logs first.
- Cost and loop risk: long-running objectives can consume significant tokens. Planner or reviewer quality can also cause repeated loops. Always set clear acceptance checks, monitor runtime, and stop/re-scope when needed.
- Credential and remote-control security: ArgusBot supports daemonized remote control through channels such as Telegram and Feishu, while daemon-launched runs may execute with high local privileges. Treat bot tokens, app secrets, and related credentials as highly sensitive. If these credentials are leaked, an unauthorized party may be able to issue remote commands that execute on your local machine or workspace. Never share tokens, never commit them to a repository, and rotate them immediately if exposure is suspected.
If you are using ArgusBot for research workflows, you are welcome to join our user community.
- WeChat user group: scan the QR code below
- Please note your background / use case when joining
If you want to control your main project from Telegram 24/7 with an always-on daemon, use this flow:
Prerequisites and cost notes:
- You must have your chosen backend CLI installed and authenticated first (make sure codex or claude works before running argusbot init).
- For 24/7 daemon operation, choosing high or xhigh reasoning can lead to token usage close to running one Codex session continuously for 24 hours. Plan budget carefully.
- medium reasoning is usually a good quality/cost tradeoff for long-running background control.
- Clone this repo and install it in editable mode.
- Go to your target project directory (the repo you actually want to operate on).
- Run argusbot init, choose control channel and execution backend, then complete setup prompts.
- After setup, daemon starts in background and keeps running.
- Chat with your Telegram bot (/run, /inject, /status, /stop) to control work at any time.
Example:
# 1) clone + install ArgusBot
git clone <your-ArgusBot-repo-url> ArgusBot
cd ArgusBot
python -m pip install -e .
# 2) go to your main project
cd ..
cd <your_main_project>
# 3) initialize daemon config in this project
argusbot init
During argusbot init, first choose control channel (1. Telegram, 2. Feishu (suited to CN network environments)), then choose execution backend (1. Codex CLI, 2. Claude Code CLI), then enter the selected channel credentials. Config is persisted under .argusbot/ in your main project.
- Persistent main-agent loop with reviewer gating (done/continue/blocked).
- Planner/manager agent with live plan snapshots, workstream table, and follow-up objective proposal.
- Planner TODO board (plan_todo.md) and explorer backlog maintained across planning sweeps.
- Stall watchdog with soft diagnosis and hard restart safety window.
- Live visibility: terminal streaming, dashboard, Telegram push, typing heartbeat.
- Telegram inbound control during active run: /inject, /status, /stop, voice/audio transcription.
- Feishu inbound control during active run: text polling for /run, /inject, /status, /stop, /plan, /review, and plain-text routing.
- Always-on daemon mode for idle startup: /run can launch new runs when no loop is active.
- Daemon follow-up prompt: after a run ends, Telegram can offer the planner's next suggested objective as a one-click continuation.
- Planner modes: off, auto, record; setup defaults to auto.
- Dual control channels for daemon: Telegram and terminal (argusbot-daemon-ctl).
- Single-word operator entrypoint: argusbot (first run setup, later auto-attach monitor).
- Token-exclusive daemon lock: one active daemon per Telegram token.
- Operator message history persisted to markdown and fed to reviewer decisions.
- Run archive persisted as JSONL with date/workspace/session metadata for resume continuity.
- Utility scripts: start/kill/watch daemon logs, plus sanitized cross-project setup examples.
ArgusBot keeps the same /run, /inject, /btw, planner, reviewer, and daemon flows across both backends.
- --runner-backend codex uses Codex CLI.
- --runner-backend claude uses Claude Code CLI.
- --runner-bin selects the underlying executable path; --codex-bin remains as a compatibility alias.
- Copilot proxy applies only to the Codex backend.
- Claude accepts low|medium|high effort only, so ArgusBot maps xhigh -> high when Claude is selected.
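The effort mapping described above can be pictured with a small sketch. The function name `normalize_effort` and the constant `CLAUDE_EFFORTS` are hypothetical and used only for illustration; this is not ArgusBot's actual code.

```python
# Hypothetical helper illustrating the documented xhigh -> high mapping.
CLAUDE_EFFORTS = {"low", "medium", "high"}


def normalize_effort(backend: str, effort: str) -> str:
    """Return an effort level that the selected backend accepts."""
    if backend == "claude":
        if effort == "xhigh":
            # Claude has no xhigh level, so fall back to the closest one.
            return "high"
        if effort not in CLAUDE_EFFORTS:
            raise ValueError(f"unsupported Claude effort: {effort!r}")
    # Other backends keep the requested level unchanged in this sketch.
    return effort
```

With this sketch, `normalize_effort("claude", "xhigh")` yields `"high"`, while the Codex backend keeps `"xhigh"` as requested.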
Current Codex CLI and Claude Code CLI do not expose a built-in --autoloop flag, so this repo adds a wrapper layer around their native task execution commands.
python -m venv .venv
source .venv/bin/activate
pip install -e .
ArgusBot can route Codex backend calls through a local copilot-proxy checkout, so main/reviewer/planner/BTW runs can use GitHub Copilot-backed quota instead of OpenAI API billing.
Simplest setup:
argusbot init
During argusbot init / argusbot-setup, ArgusBot will:
- auto-detect an existing proxy checkout in ~/copilot-proxy, ~/copilot-codex-proxy, or ~/.argusbot/tools/copilot-proxy
- if you select the copilot preset (or explicitly enable Copilot proxy), offer to auto-install the proxy into ~/.argusbot/tools/copilot-proxy
Direct CLI example:
argusbot-run \
--copilot-proxy \
--main-model gpt-5.4 \
--reviewer-model gpt-5.4 \
--plan-model gpt-5.4 \
"Implement the feature and finish validation"
Notes:
- --copilot-proxy-dir is only needed when your proxy checkout lives outside the auto-detected locations above.
- When enabled, ArgusBot auto-starts proxy.mjs if needed and injects Codex provider overrides per run, so you do not have to rewrite your global ~/.codex/config.toml.
- Claude backend ignores Copilot proxy settings by design.
- Prefer Copilot-supported models such as gpt-5.4, gpt-5.2, gpt-5.1, gpt-4o, claude-sonnet-4.6, claude-opus-4.6, or gemini-3-pro-preview.
Run:
argusbot
List supported features/commands:
argusbot help
Behavior:
- First run: asks you to choose control channel (1. Telegram, 2. Feishu (suited to CN network environments), default Telegram), then choose runner backend (1. Codex CLI, 2. Claude Code CLI), then collects selected channel credentials, writes .argusbot/daemon_config.json, and starts daemon in the current shell directory.
- Later runs: reuses config, ensures daemon is running, then directly attaches to live output.
- argusbot init: stops all current ArgusBot daemons, prompts control channel + backend + credentials/model/plan mode, starts daemon in background, then exits.
- After init, run argusbot to attach monitor to that background daemon.
- Same terminal can control daemon/run:
  - /run <objective>
  - /new
  - /inject <instruction>
  - /mode <off|auto|record>
  - /btw <question>
  - /confirm-send (when BTW attachments > 5, confirm and continue upload)
  - /cancel-send (when BTW attachments > 5, skip upload)
  - BTW attachment return supports Telegram/Feishu media upload: images/photos, videos, and generic files/documents.
  - /plan <direction>
  - /review <criteria>
  - /show-main-prompt
  - /show-plan
  - /show-plan-context
  - /show-review [round]
  - /show-review-context
  - /status, /stop, /daemon-stop
  - plain text auto-routes: running => inject, idle => run
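The plain-text auto-routing rule (running => inject, idle => run) can be sketched as follows. The function `route_message` and its return shape are assumptions made for illustration and do not reflect the real dispatcher.

```python
from typing import Tuple


def route_message(text: str, loop_running: bool) -> Tuple[str, str]:
    """Return (command, argument) for one operator message."""
    stripped = text.strip()
    if stripped.startswith("/"):
        # Explicit slash command: split off the first word as the command.
        command, _, argument = stripped.partition(" ")
        return command, argument.strip()
    # Plain text: inject into an active run, otherwise start a new run.
    return ("/inject" if loop_running else "/run"), stripped
```

For example, the plain message `fix tests` becomes an inject while a loop is active and a new run while the daemon is idle.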
Planner mode semantics:
- off: disable plan agent behavior for daemon-launched runs.
- auto (default): planner stays enabled and daemon may propose/auto-run the next request.
- record: planner records markdown only; no automatic follow-up execution.
YOLO policy:
- Daemon-launched runs always use --yolo by default.
Directly disable daemon from terminal:
argusbot daemon-stop
You can still use low-level commands when needed:
argusbot-daemon-ctl --bus-dir .argusbot/bus status
argusbot-daemon-ctl --bus-dir .argusbot/bus inject "Fix the tests first, then continue"
argusbot-run \
--max-rounds 10 \
--check "pytest -q" \
"Implement feature X and keep iterating until tests pass"
Common options:
- --runner-backend {codex,claude}: select the execution backend
- --runner-bin <path>: select the backend executable path
- --session-id <id>: continue an existing runner session
- --main-model / --reviewer-model: set model(s)
- --planner-model: override the manager/planner model (defaults to reviewer settings when omitted)
- --copilot-proxy [--copilot-proxy-port 18080] [--copilot-proxy-dir /custom/path]: route Codex backend through local copilot-proxy
- python -m codex_autoloop.model_catalog: list common models and presets
- --yolo: pass dangerous no-sandbox mode to Codex
- --full-auto: pass full-auto mode to Codex
- --state-file <file>: write round-by-round state JSON
- --plan-report-file <file>: write the latest planner markdown snapshot
- --plan-todo-file <file>: write the latest planner TODO board markdown
- --plan-update-interval-seconds 1800: run background planning sweeps every 30 minutes
- --verbose-events: print raw JSONL stream
- --dashboard: launch a live local web dashboard
- --dashboard-host 0.0.0.0 --dashboard-port 8787: expose dashboard to other devices in LAN
- --telegram-bot-token + --telegram-chat-id: send progress to Telegram (recommended for cross-network access)
- --telegram-events: choose which events are pushed (comma-separated)
- --telegram-live-interval-seconds 30: push live agent message deltas every 30s (only when changed)
- --feishu-app-id / --feishu-app-secret / --feishu-chat-id: enable Feishu notifications and control
- --feishu-events: choose which events are pushed to Feishu (comma-separated)
- --feishu-live-updates + --feishu-live-interval-seconds 30: push live agent message deltas to Feishu (only when changed)
- --feishu-heartbeat-interval-seconds 600: when a run is still active, send a Feishu heartbeat (typing...) every 10 minutes
- --feishu-control: allow Feishu inbound control (/inject, /status, /stop, /plan, /review) while loop is running
- --no-live-terminal: disable realtime terminal prints (default is on)
- --stall-soft-idle-seconds 3600: after 1h no new output, run stall sub-agent diagnosis (do not force kill)
- --stall-hard-idle-seconds 10800: after 3h no new output, force restart as hard safety valve
- --telegram-control: allow Telegram inbound control (/inject, /stop, /status) while loop is running
- --telegram-control-whisper: enable Telegram voice/audio transcription for control messages (default on)
Use this when running in CN network environments or when Telegram is unavailable.
Required parameters:
- --feishu-app-id
- --feishu-app-secret
- --feishu-chat-id (for receive_id_type=chat_id, this should look like oc_xxx)
Common optional parameters:
- --feishu-receive-id-type chat_id
- --feishu-events "loop.started,round.review.completed,loop.completed"
- --feishu-live-updates --feishu-live-interval-seconds 30
- --feishu-heartbeat-interval-seconds 600
- --feishu-control
Feishu group command notes:
- Commands can be sent directly as /run, /inject, /stop, etc.
- Mention-prefixed commands in groups (for example @bot /stop) are normalized and parsed as commands.
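To make the mention normalization concrete, here is a minimal sketch. It assumes mentions arrive as leading `@`-prefixed tokens in the message text; the function name `normalize_group_command` is hypothetical, and the real parser may handle Feishu message payloads differently.

```python
import re

# One or more leading "@name" tokens, each followed by whitespace.
_MENTION_PREFIX = re.compile(r"^(?:@\S+\s+)+")


def normalize_group_command(text: str) -> str:
    """Strip leading mentions so '@bot /stop' is parsed like '/stop'."""
    return _MENTION_PREFIX.sub("", text.strip())
```

Messages without a mention pass through unchanged, so the same parser can serve both direct and group chats.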
Minimal run example:
argusbot-run \
--feishu-app-id "$FEISHU_APP_ID" \
--feishu-app-secret "$FEISHU_APP_SECRET" \
--feishu-chat-id "$FEISHU_CHAT_ID" \
"your objective"
App-side enablement steps (Feishu Open Platform):
- Enable bot capability for the app.
- Grant message-related app scopes (at least one required by API): im:message.history:readonly, im:message:readonly, or im:message.
- Publish a new app version and install/update it in your tenant.
- Add the bot into the target group, then use that group's chat_id (oc_xxx) in config.
Common errors:
- 230006 Bot ability is not activated: bot capability is disabled or not published/installed yet.
- 230002 Bot/User can NOT be out of the chat: bot is not in the target group, or feishu_chat_id points to a different chat.
- 99991672 Access denied ... scopes required: required Feishu scopes are not enabled for the app.
The most important field is the final goal. Put it first.
A good objective usually has this shape:
Final Goal:
<the end state you actually want>
Current Task:
<what should be done in this session>
Acceptance Criteria:
<how the system knows it is done>
Constraints:
<repo, time, safety, cost, model, dataset, or style constraints>
Notes:
<optional hints, references, known risks, or preferred approach>
Practical guidance:
- Put Final Goal first, even if the immediate task is small.
- Say what "done" means in concrete terms.
- If you want planner behavior, say whether it should explore, only record, or stay off.
- If your wording is messy, you can ask any AI tool to rewrite your request into the template above before sending it here. This repo does not need to provide that rewrite step itself.
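If you generate objectives from scripts, the template above can be assembled programmatically. The helper below is a sketch only; `build_objective` is a hypothetical name and is not part of ArgusBot.

```python
from typing import Sequence


def build_objective(
    final_goal: str,
    current_task: str,
    acceptance: Sequence[str],
    constraints: Sequence[str] = (),
    notes: str = "",
) -> str:
    """Assemble an objective in the template order, Final Goal first."""

    def numbered(items: Sequence[str]) -> str:
        return "\n".join(f"{i}. {item}" for i, item in enumerate(items, 1))

    sections = [
        ("Final Goal", final_goal),
        ("Current Task", current_task),
        ("Acceptance Criteria", numbered(acceptance)),
    ]
    # Optional sections are emitted only when they carry content.
    if constraints:
        sections.append(("Constraints", numbered(constraints)))
    if notes:
        sections.append(("Notes", notes))
    return "\n".join(f"{title}:\n{body}" for title, body in sections)
```

The resulting string can be passed as the objective argument to argusbot-run or sent through /run.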
Use this only when the paper has usable open-source code or a strong public implementation.
Final Goal:
Reproduce the paper's core result well enough to run inference, complete one smoke training run, and generate a structured reproduction report in this repository.
Current Task:
Set up the repo, inspect the available code path, create the reproduction plan, wire the experiment directories, and run the minimum smoke path needed to prove the project is alive.
Acceptance Criteria:
1. The repository has a clear plan_report.md and plan_todo.md.
2. The selected implementation path is documented.
3. At least one runnable inference or smoke-training command succeeds.
4. The next highest-priority follow-up experiment is recorded.
Constraints:
1. Prefer official or high-quality open-source implementations.
2. Do not aim for full SOTA reproduction in the first session.
3. Keep the work resumable from Telegram and daemon state files.
Final Goal:
Turn this repository into a maintainable, planner-driven project where completed work, remaining work, and next-step execution suggestions are always visible.
Current Task:
Map the architecture, identify the missing module boundaries, implement the highest-leverage missing feature, and update the project reports.
Acceptance Criteria:
1. The new feature is implemented and validated.
2. Planner outputs reflect what is done and what remains.
3. The next follow-up objective is concrete enough to run as a new session.
Constraints:
1. Preserve the existing coding style.
2. Prefer small verifiable steps over large speculative rewrites.
Example with live dashboard (do not expose the dashboard on the public internet; doing so is extremely dangerous):
argusbot-run \
--dashboard \
--dashboard-host 0.0.0.0 \
--dashboard-port 8787 \
--check "pytest -q" \
"Help me write a pipeline in this folder"
Then open http://<your-machine-ip>:8787 on phone or browser.
If phone and server are not in the same network, do not expose dashboard publicly by default. Use Telegram push notifications instead:
argusbot-run \
--max-rounds 12 \
--check "pytest -q" \
--telegram-bot-token "$TELEGRAM_BOT_TOKEN" \
--telegram-events "loop.started,round.review.completed,loop.completed" \
"Help me write a pipeline in this folder"
--telegram-chat-id defaults to auto and will be resolved from getUpdates.
If auto resolve fails, send /start to bot and run again, or pass explicit chat id.
Live visibility defaults:
- Terminal prints main/reviewer agent messages in realtime.
- Telegram sends live message deltas every 30 seconds only if there are new changes.
Telegram control channel defaults:
- /inject <text> or plain text message: interrupt current main-agent run and apply new instruction next round.
- Voice/audio message: auto-transcribed via Whisper and treated like text input (for example spoken /inject ...).
- /status: return current loop state.
- /stop: interrupt current run and stop the loop.
- /help: print command summary.
Whisper-related options:
- --telegram-control-whisper-api-key: OpenAI API key (defaults to OPENAI_API_KEY).
- --telegram-control-whisper-model: transcription model (default whisper-1).
- --telegram-control-whisper-base-url: OpenAI-compatible API base URL.
- --telegram-control-whisper-timeout-seconds: transcription request timeout.
Stall watchdog defaults:
- If no new output for 1 hour, sub-agent inspects the latest message/tails and decides whether restart is needed.
- If no new output reaches 3 hours, process is force-restarted regardless.
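The two watchdog thresholds can be read as a simple decision function. The sketch below uses the documented defaults (3600 and 10800 seconds); the function name `stall_action` and its return labels are hypothetical.

```python
SOFT_IDLE_SECONDS = 3600    # default of --stall-soft-idle-seconds
HARD_IDLE_SECONDS = 10800   # default of --stall-hard-idle-seconds


def stall_action(
    idle_seconds: float,
    soft: float = SOFT_IDLE_SECONDS,
    hard: float = HARD_IDLE_SECONDS,
) -> str:
    """Map time since the last output to a watchdog action."""
    if idle_seconds >= hard:
        # Hard safety valve: restart regardless of any diagnosis.
        return "force_restart"
    if idle_seconds >= soft:
        # Soft window: let the stall sub-agent decide whether to restart.
        return "diagnose"
    return "ok"
```

Checking the hard threshold first ensures that a very long stall is never downgraded to a diagnosis-only step.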
Typing heartbeat is enabled by default during execution. Disable with:
--telegram-no-typing
Security notes:
- Keep dashboard on 127.0.0.1 unless you have VPN/auth in front.
- Never commit bot token to git.
- Prefer sending round summaries, not every raw log line.
Troubleshooting:
- Bot token must be full format: <digits>:<secret>, not only the secret part.
- If no message arrives, run once with --verbose-events and check stderr lines prefixed with [telegram].
- If control commands are ignored, verify command comes from the same chat id resolved for notifications.
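A quick local check for the token shape can catch the first problem before any network call. This sketch only tests the `<digits>:<secret>` shape described above; it does not verify that the token is valid, and `looks_like_full_bot_token` is a hypothetical name.

```python
import re

# Digits, a colon, then a non-empty secret without whitespace.
_TOKEN_SHAPE = re.compile(r"\d+:\S+")


def looks_like_full_bot_token(token: str) -> bool:
    """Return True when the token has the <digits>:<secret> shape."""
    return _TOKEN_SHAPE.fullmatch(token.strip()) is not None
```

A token that contains only the secret part fails this check, which matches the first troubleshooting item.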
If you want Telegram to trigger runs when no loop process is active, run:
argusbot-daemon \
--telegram-bot-token "$TELEGRAM_BOT_TOKEN" \
--telegram-chat-id auto \
--run-check "pytest -q"
Daemon commands from Telegram:
- /run <objective>: start a new ArgusBot run
- /status: daemon/child status
- /stop: stop active run
- After /stop, use /run <objective> to continue. By default daemon resumes the last session_id when available.
- /help
- After a child run finishes, the daemon can offer a Telegram button to execute the planner's next suggested objective.
- If the user does nothing, daemon auto-executes the planned next session after the follow-up countdown (default: 10 minutes).
- Before executing that follow-up, daemon creates a git checkpoint commit when the workspace is dirty.
- Telegram follow-up options are: direct execute, reject plan, or modify then execute while inheriting the planner objective. That follow-up starts as a fresh session.
For background mode:
TELEGRAM_BOT_TOKEN=... TELEGRAM_CHAT_ID=... ./scripts/start_telegram_daemon.sh
Recommendation: prefer systemd or supervisor over raw nohup for production reliability.
Run once:
argusbot-setup --run-cd .
If command is not found in your environment, use:
python -m codex_autoloop.setup_wizard --run-cd .
The wizard will:
- Check the selected runner CLI availability and basic auth probe.
- Prompt for control channel: Telegram, Feishu, or both.
- Prompt only for the selected channel credentials.
- Prompt optional default check command (empty means no forced check command).
- Prompt for planner mode after model selection.
- Start daemon in background and save config under .argusbot/.
Default behavior for daemon-launched runs:
- --yolo is enabled by default.
- No default --check is enforced unless you set one.
- Daemon-launched runs inherit the selected backend defaults unless you explicitly set preset/overrides.
- When the daemon is idle, a new /run or terminal run command will reuse the last saved session_id if available.
- One Telegram token can only be owned by one active daemon process (second daemon returns an error).
- In daemon mode, only daemon polls Telegram updates; child runs receive control via daemon bus (avoids getUpdates 409 conflicts).
- If daemon detects invalid encrypted content from a resumed run, it raises a warning and auto-arms fresh session for the next run.
- Inside a running child loop, invalid_encrypted_content now triggers an immediate in-loop fresh-session retry instead of spinning reviewer continue loops.
- Operator messages (initial objective + terminal/Telegram injects) are appended to a shared .argusbot/logs/operator_messages.md so reviewer can see global inject history across runs.
- Each run also appends start/finish records into .argusbot/logs/argusbot-run-archive.jsonl (includes date + workspace + session metadata) for continuity and auditing.
- Re-running setup or start script will stop the previous daemon under the same home-dir before launching the new one.
After setup, use terminal control:
argusbot-daemon-ctl --bus-dir .argusbot/bus run "Help me write a pipeline in this folder"
argusbot-daemon-ctl --bus-dir .argusbot/bus inject "Fix the tests first, then continue"
argusbot-daemon-ctl --bus-dir .argusbot/bus status
argusbot-daemon-ctl --bus-dir .argusbot/bus stop
argusbot-daemon-ctl --bus-dir .argusbot/bus daemon-stop
One-click kill script:
./scripts/kill_telegram_daemon.sh
Realtime log mirror:
./scripts/watch_argusbot_logs.sh .argusbot
This mirrors:
- .argusbot/daemon.out
- .argusbot/logs/daemon-events.jsonl (Telegram/terminal control interactions)
If argusbot-daemon-ctl is not found, replace it with:
python -m codex_autoloop.daemon_ctl
Show local model presets and common names:
python -m codex_autoloop.model_catalog
Current presets:
- quality: main=gpt-5.4/high, reviewer=gpt-5.4/high
- copilot: main=gpt-5.4/high, reviewer=gpt-5.4/high
- codex52-xhigh: main=gpt-5.2-codex/xhigh, reviewer=gpt-5.2-codex/xhigh
- quality-xhigh: main=gpt-5.4/xhigh, reviewer=gpt-5.4/xhigh
- balanced: main=gpt-5.3-codex/high, reviewer=gpt-5.1-codex/medium
- codex-xhigh: main=gpt-5.3-codex/xhigh, reviewer=gpt-5.3-codex/xhigh
- cheap: main=gpt-5.1-codex-mini/medium, reviewer=gpt-5-codex-mini/low
- max: main=gpt-5.1-codex-max/xhigh, reviewer=gpt-5.3-codex/high
To use Qwen models, you need to set the following environment variables:
# Set your Alibaba Cloud DashScope API Key
export DASHSCOPE_API_KEY="your-dashscope-api-key"
export DASHSCOPE_API_BASE="https://dashscope.aliyuncs.com/compatible-mode/v1"
Get your API key from: https://dashscope.console.aliyun.com/
Qwen presets:
- qwen3-quality: main=qwen3-30b-a3b/high, reviewer=qwen3-30b-a3b/high - High quality Qwen3
- qwen3-balanced: main=qwen3-32b/high, reviewer=qwen3-14b/medium - Balanced Qwen3
- qwen3-coder: main=qwen3-coder-32b/high, reviewer=qwen3-coder-14b/medium - Code-optimized Qwen3
- qwen3-cheap: main=qwen3-coder-7b/medium, reviewer=qwen3-8b/low - Budget Qwen3
- qwen25-quality: main=qwen2.5-max/high, reviewer=qwen2.5-max/high - High quality Qwen2.5
- qwen25-balanced: main=qwen2.5-72b/high, reviewer=qwen2.5-32b/medium - Balanced Qwen2.5
- qwen25-coder: main=qwen2.5-coder-32b/high, reviewer=qwen2.5-coder-14b/medium - Code-optimized Qwen2.5
- qwen25-cheap: main=qwen2.5-coder-7b/medium, reviewer=qwen2.5-7b/low - Budget Qwen2.5
- qwen35-quality: main=qwen3.5-72b/high, reviewer=qwen3.5-32b/high - High quality Qwen3.5
- qwen35-balanced: main=qwen3.5-32b/high, reviewer=qwen3.5-14b/medium - Balanced Qwen3.5
- qwen35-coder: main=qwen3.5-coder-32b/high, reviewer=qwen3.5-coder-14b/medium - Code-optimized Qwen3.5
- qwen35-cheap: main=qwen3.5-coder-7b/medium, reviewer=qwen3.5-8b/low - Budget Qwen3.5
Note:
- gpt-5.4 is the model name.
- high is the reasoning effort level, not part of the model name.
- For always-on daemon use, medium is often the safer default for token cost while keeping solid quality.
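The model/effort notation used in the preset lists can be split mechanically, as this sketch shows. `AgentSpec` and `parse_agent_spec` are hypothetical names chosen for illustration; the real catalog module may represent presets differently.

```python
from typing import NamedTuple


class AgentSpec(NamedTuple):
    model: str   # for example "gpt-5.4"
    effort: str  # reasoning effort, for example "high"


def parse_agent_spec(spec: str) -> AgentSpec:
    """Split 'model/effort' at the last slash into its two parts."""
    model, sep, effort = spec.rpartition("/")
    if not sep or not model or not effort:
        raise ValueError(f"expected 'model/effort', got {spec!r}")
    return AgentSpec(model, effort)
```

Splitting at the last slash keeps the model name intact even if it contains dots or hyphens, as in gpt-5.1-codex-mini/medium.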
Daemon overrides:
python -m codex_autoloop.setup_wizard --run-cd .
The wizard now lets you choose either:
- a preset for both agents, or
- separate main/reviewer model names
Use this pattern when ArgusBot is cloned under a different workspace and you want the daemon to run tasks in newproject.
# Replace with your own locations (public-safe placeholders)
export WORKSPACE_ROOT="/path/to/workspace"
export LOOP_REPO="$WORKSPACE_ROOT/ArgusBot"
export TARGET_REPO="$WORKSPACE_ROOT/newproject"
cd "$TARGET_REPO"
python -m pip install -e "$LOOP_REPO"
# First-time setup (interactive)
python -m codex_autoloop.setup_wizard \
--run-cd "$TARGET_REPO" \
--home-dir "$TARGET_REPO/.argusbot"
After setup:
# Terminal control (same running daemon)
python -m codex_autoloop.daemon_ctl --bus-dir "$TARGET_REPO/.argusbot/bus" status
python -m codex_autoloop.daemon_ctl --bus-dir "$TARGET_REPO/.argusbot/bus" run "run 100-step smoke and validate checkpoint+infer"
python -m codex_autoloop.daemon_ctl --bus-dir "$TARGET_REPO/.argusbot/bus" inject "fix test failures first, then continue"
Telegram control in parallel:
- /run <objective>
- /inject <instruction>
- /status
- /stop
You can add a shell function so codex --autoloop ... routes to this plugin.
codex() {
if [[ " $* " == *" --autoloop "* ]]; then
/data/yijia/ArgusBot/scripts/argusbot_shim.sh "$@"
else
command codex "$@"
fi
}
Put it in ~/.bashrc or ~/.zshrc, then reload shell.
Per round:
- Run main agent in the same Codex thread (exec then exec resume)
- Run acceptance checks (--check, repeatable)
- Run reviewer sub-agent with structured JSON schema output
- Stop only if reviewer says done and all checks pass
- Otherwise build a new continue prompt and run next round
Safety stop conditions:
- max_rounds reached
- repeated no-progress rounds
- reviewer returns blocked
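The per-round flow and the safety stops can be summarized in a short control-loop sketch. This is a simplified illustration, not the project's implementation: the callables are stand-ins for the real agents, `max_no_progress` is an assumed threshold, and "no progress" is approximated here as identical consecutive main-agent output.

```python
from typing import Callable, Optional


def supervise(
    objective: str,
    run_main: Callable[[str], str],   # one main-agent round, returns its output
    checks_pass: Callable[[], bool],  # runs every acceptance check
    review: Callable[[str], str],     # returns "done", "continue", or "blocked"
    max_rounds: int = 500,
    max_no_progress: int = 3,         # assumed value, not from the README
) -> str:
    """Run rounds until the reviewer and checks agree, or a safety stop fires."""
    prompt = objective
    last_output: Optional[str] = None
    stalled = 0
    for _ in range(max_rounds):
        output = run_main(prompt)
        passed = checks_pass()
        verdict = review(output)
        if verdict == "done" and passed:
            return "done"
        if verdict == "blocked":
            return "blocked"
        # Count consecutive rounds whose output did not change.
        stalled = stalled + 1 if output == last_output else 0
        if stalled >= max_no_progress:
            return "no_progress"
        last_output = output
        prompt = f"Continue toward: {objective}\nChecks passed: {passed}"
    return "max_rounds"
```

Note that a reviewer verdict of done with failing checks does not end the loop; both conditions must hold in the same round.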
Contributions are welcome. See CONTRIBUTING.md for workflow, attribution, and acknowledgement details.
This project is licensed under the MIT License. See LICENSE.

