Monitor is an internal prototype CLI utility for Ubuntu running inside Windows 11 WSL. It is intended to capture periodic full-desktop screenshots from the Windows host, ask Codex CLI to summarize what is visible, and maintain text files that describe what the user appears to be doing over time.
The repository contains the first runnable monitoring CLI at src/monitor.sh. The implementation history and validation checklist are documented in plans/00-first-monitor-cli.md.
Monitor currently provides a Bash CLI, project documentation, contributor guidance, tests, and a Windows desktop screenshot helper at scripts/win-screenshot.
The primary workflow is:
- Run bash src/monitor.sh.
- The CLI captures a full Windows desktop screenshot once per minute by default.
- Each screenshot is sent to the local Codex CLI with image input for analysis.
- After each successful screenshot analysis, Codex CLI updates a detailed run-level current-summary.md.
- Screenshots and text summaries are saved under .monitor/runs/.
- The terminal shows a live dashboard with progress, percentage complete, countdowns, analyzer, save path, latest analysis, and current summary.
- When a bounded invocation finishes, the terminal prints the complete final detailed current-summary.md.
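The workflow above can be pictured as a small loop. The following is a minimal sketch, not the code in src/monitor.sh: `run_monitor_loop`, `capture_cmd`, `analyze_cmd`, and the `shot-NNN` file names are all hypothetical, and the two stand-in functions only imitate the screenshot helper and the Codex analyzer so the sketch can run anywhere.

```bash
#!/usr/bin/env bash
# Sketch only: every name in this block is hypothetical.

# Fake capture: creates an empty file where the screenshot would go.
capture_cmd() { : > "$1"; }

# Fake analyzer: prints one line where the Codex analysis would go.
analyze_cmd() { printf 'analysis of %s\n' "$1"; }

# Usage: run_monitor_loop RUN_DIR COUNT INTERVAL_SECONDS
run_monitor_loop() {
  local run_dir=$1 count=$2 interval=$3
  local i shot
  mkdir -p "$run_dir"
  for ((i = 1; i <= count; i++)); do
    shot=$(printf '%s/shot-%03d.png' "$run_dir" "$i")
    if capture_cmd "$shot"; then
      # Store the analysis next to the screenshot it describes.
      analyze_cmd "$shot" > "${shot%.png}.md"
      printf 'capture %d/%d (%d%%) saved to %s\n' \
        "$i" "$count" $((i * 100 / count)) "$shot"
    fi
    # Wait between captures, but not after the last one.
    if ((i < count)); then
      sleep "$interval"
    fi
  done
}
```

Calling `run_monitor_loop "$(mktemp -d)/run" 2 0` would print one progress line per capture and leave a `.png` and `.md` pair for each.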
For a bounded smoke test, run:
bash src/monitor.sh --once
The default run has no count limit and continues until stopped with Ctrl-C.
ChatGPT subscription billing and OpenAI Platform API billing are separate. A ChatGPT subscription does not make OpenAI API usage free. Monitor therefore avoids direct OpenAI API calls in its first version: it does not require OPENAI_API_KEY, and it does not create API charges through platform.openai.com.
The analyzer uses codex exec --image ..., which relies on Codex CLI access through the user's ChatGPT/Codex plan. Monitor asks for detailed, evidence-dense Markdown and sends those prompts to Codex over stdin so growing run summaries do not become oversized shell arguments. Monitor also invokes a second text-only Codex pass after each successful screenshot to maintain current-summary.md. That usage is still subject to Codex plan limits and may consume Codex credits depending on account type, plan, and usage volume. It is not a zero-usage local model.
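To illustrate why sending prompts over stdin helps as summaries grow, here is a minimal sketch. `send_prompt_via_stdin` is a hypothetical helper, not Monitor's code, and `wc -c` stands in for the analyzer command so the example does not consume any Codex usage.

```bash
#!/usr/bin/env bash
# Sketch only: send_prompt_via_stdin is a hypothetical helper.
# Usage: send_prompt_via_stdin PROMPT_FILE COMMAND [ARGS...]
send_prompt_via_stdin() {
  local prompt_file=$1
  shift
  # The prompt travels on stdin, so its size is not constrained by the
  # operating system's limit on command-line argument length.
  "$@" < "$prompt_file"
}

# Demonstration: build a 300000-byte prompt, which is larger than many
# systems accept as a single argument, and hand it to a stand-in command.
prompt_file=$(mktemp)
head -c 300000 /dev/zero | tr '\0' 'x' > "$prompt_file"
send_prompt_via_stdin "$prompt_file" wc -c
rm -f "$prompt_file"
```

The same helper shape would work with any command that reads its prompt from standard input.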
Helpful official references:
- OpenAI Help: Billing settings in ChatGPT vs Platform
- OpenAI Help: What is ChatGPT Plus?
- OpenAI Help: Using Codex with your ChatGPT plan
- OpenAI Help: Codex rate card
- OpenAI Developers: Codex CLI
Install or verify the required tools inside WSL:
- bash
- GNU core utilities such as date, mkdir, mktemp, and sleep
- powershell.exe on PATH from WSL
- wslpath
- codex
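A preflight check for the tools listed above could look like the following minimal sketch. `check_required_tools` is a hypothetical helper, not part of Monitor; it names each missing tool on stderr, which a bare `command -v` call with several arguments does not do.

```bash
#!/usr/bin/env bash
# Sketch only: check_required_tools is a hypothetical helper.
# Returns 0 when every named tool is on PATH, 1 otherwise.
check_required_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" > /dev/null 2>&1; then
      printf 'missing: %s\n' "$tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_required_tools bash date mkdir mktemp sleep powershell.exe wslpath codex` would succeed only when the full list is available.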
On a typical Ubuntu WSL install, the Linux tools can be installed with:
sudo apt update
sudo apt install -y bash coreutils
Install Codex CLI with npm if it is not already available:
npm i -g @openai/codex
Then sign in with ChatGPT:
codex login
You can verify the local pieces with:
command -v bash date mkdir mktemp sleep powershell.exe wslpath codex
codex --version
Copy the example environment file when you want local defaults:
cp .env.example .env
The CLI loads .env automatically if it exists. Existing shell environment variables take precedence over .env values, and CLI flags take precedence over both.
Supported settings:
- MONITOR_ANALYZER=codex
- MONITOR_INTERVAL_SECONDS=60
- MONITOR_OUTPUT_DIR=.monitor/runs
- MONITOR_CODEX_MODEL= for an optional Codex model override
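The precedence order (CLI flag, then shell environment, then .env, then built-in default) can be sketched as follows. This is an illustration under assumptions, not the loader in src/monitor.sh: `load_env_file` and `resolve_interval` are hypothetical names, and the parser assumes plain KEY=value lines without quoting or `export` prefixes.

```bash
#!/usr/bin/env bash
# Sketch only: load_env_file and resolve_interval are hypothetical helpers.

# Export KEY=value pairs from a file, skipping keys that are already set.
load_env_file() {
  local file=$1 line key value
  [ -f "$file" ] || return 0
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      '' | '#'*) continue ;;   # skip blank lines and comments
      *=*) ;;                  # looks like an assignment, keep going
      *) continue ;;
    esac
    key=${line%%=*}
    value=${line#*=}
    # Ignore anything that is not a valid variable name.
    [[ $key =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
    # An existing environment value wins over the .env value.
    if [ -z "${!key+x}" ]; then
      export "$key=$value"
    fi
  done < "$file"
}

# A CLI value beats the environment, which beats the default of 60.
resolve_interval() {
  local cli_value=${1:-}
  printf '%s\n' "${cli_value:-${MONITOR_INTERVAL_SECONDS:-60}}"
}
```

With `MONITOR_INTERVAL_SECONDS=30` in .env and nothing in the shell environment, `resolve_interval` would print 30, while `resolve_interval 5` would print 5.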
Do not store OpenAI API keys in this repository. The first version is intentionally designed around Codex CLI rather than direct API usage.
.
├── .agents/skills/ # Repo-local agent workflows.
├── .codex/config.toml # Project-scoped Codex defaults.
├── plans/ # Ordered ExecPlans for substantial work.
├── scripts/ # Portable contributor utilities.
├── src/ # Runtime CLI code.
├── tests/ # Bash tests with fake capture/analyzer commands.
├── AGENTS.md # Repository-specific instructions for agents.
├── ARCHITECTURE.md # Stable system boundaries and codemap.
├── CODESTYLE.md # Source and documentation conventions.
├── DESIGN.md # Terminal UX and future UI design guidance.
├── PRODUCT.md # Current user-visible product truth.
└── ROADMAP.md # Intended product direction.
Run the CLI help:
bash src/monitor.sh --help
Run syntax and fake-command behavior tests:
bash -n scripts/win-screenshot src/monitor.sh tests/monitor_cli_test.sh
tests/monitor_cli_test.sh
Run a live one-shot capture and Codex analysis after codex login:
bash src/monitor.sh --once
The live smoke test captures the host Windows desktop and consumes Codex plan usage or credits. A successful one-shot run now makes two Codex calls: one image analysis call and one text summary update call. If screenshot capture fails during a run, Monitor records a capture_failed row and a failure summary instead of aborting the whole counted run.
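The record-and-continue behavior for failed captures can be sketched as below. Only the `capture_failed` status name comes from the text above; the log file name, the tab-separated row layout, `record_capture`, and the fake `capture_cmd` are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: file names, row layout, and helpers are hypothetical.

# Fake capture command that fails on the second screenshot only.
capture_cmd() {
  case $1 in
    *-2.png) return 1 ;;
    *) : > "$1" ;;
  esac
}

# Append one row per attempt and report failure without exiting the shell.
record_capture() {
  local log_file=$1 index=$2 shot=$3
  local status=ok
  if ! capture_cmd "$shot"; then
    status=capture_failed
  fi
  printf '%s\t%s\t%s\n' "$index" "$status" "$shot" >> "$log_file"
  [ "$status" = ok ]
}

run_dir=$(mktemp -d)
for i in 1 2 3; do
  # A failed capture is logged and the counted run carries on.
  record_capture "$run_dir/captures.tsv" "$i" "$run_dir/shot-$i.png" ||
    printf 'capture %s failed, continuing\n' "$i"
done
cat "$run_dir/captures.tsv"
```

The run still attempts all three captures, and the log keeps a row for the failed one so later steps can account for the gap.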
Use lightweight repository checks before review:
rg --files --hidden -g '!.git/**'
git diff --check
git status --short
The screenshot helper can also be smoke-tested directly on Windows 11 WSL:
scripts/win-screenshot /tmp/monitor-smoke.png
Use ExecPlans for complex features or significant refactors. New plan files belong under plans/ and use the next ordered two-digit prefix, such as 00-first-monitor-cli.md and 01-local-ocr-analyzer.md.
ExecPlans must remain self-contained and current while work proceeds. The first implementation should continue from plans/00-first-monitor-cli.md rather than redesigning the CLI from chat history.
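Picking the next ordered two-digit prefix can be automated. The following minimal sketch assumes plan files match `NN-*.md` directly under the plans directory; `next_plan_prefix` is a hypothetical helper, not an existing repository script.

```bash
#!/usr/bin/env bash
# Sketch only: next_plan_prefix is a hypothetical helper.
# Usage: next_plan_prefix PLANS_DIR
next_plan_prefix() {
  local plans_dir=$1 max=-1 path name num
  for path in "$plans_dir"/[0-9][0-9]-*.md; do
    # When nothing matches, the glob stays literal; skip it.
    [ -e "$path" ] || continue
    name=${path##*/}
    # Force base 10 so prefixes like "08" are not parsed as octal.
    num=$((10#${name%%-*}))
    if ((num > max)); then
      max=$num
    fi
  done
  printf '%02d\n' $((max + 1))
}
```

In a directory that already holds 00-first-monitor-cli.md, `next_plan_prefix plans` would print 01.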