Agent guidance for working effectively in this codebase.
Stakpak is a security-hardened DevOps AI agent that runs in the terminal. It generates infrastructure code, debugs Kubernetes, configures CI/CD, and automates deployments — without giving the LLM keys to production.
- Language: Rust (edition 2024, nightly features enabled)
- License: Apache-2.0
- Repository: https://github.com/stakpak/agent
```
cli/                                  # Main binary crate (`stakpak`)
├── src/
│   ├── main.rs
│   ├── commands/
│   │   ├── agent/run/                # Agent execution engine
│   │   │   ├── mode_interactive.rs   # Interactive TUI agent loop
│   │   │   ├── mode_async.rs         # Async/headless mode
│   │   │   ├── stream.rs             # SSE stream processing
│   │   │   ├── checkpoint.rs         # Session checkpoint/resume
│   │   │   ├── tooling.rs            # Tool execution
│   │   │   └── helpers.rs            # Shared helpers
│   │   ├── acp/                      # Agent Client Protocol (Zed integration)
│   │   ├── mcp/                      # MCP server/proxy commands
│   │   ├── auth/                     # Login/account commands (interactive + non-interactive setup)
│   │   ├── autopilot/                # Autopilot: init, up/down, status, schedule, channel
│   │   └── watch/                    # Scheduled task runtime (internal, driven by autopilot)
│   ├── config/                       # Configuration management
│   │   ├── file.rs                   # ConfigFile with profiles + ensure_readonly()
│   │   ├── profile.rs                # ProfileConfig + readonly_profile()
│   │   └── types.rs                  # ProviderType (Remote/Local)
│   └── onboarding/                   # Interactive setup wizard + save_config.rs
tui/                                  # TUI crate (ratatui-based)
├── src/
│   ├── app/events.rs                 # InputEvent / OutputEvent enums
│   └── services/handlers/            # Event handlers (tool, shell, etc.)
libs/
├── ai/                               # LLM provider abstraction (`stakai`)
│   └── src/providers/
│       ├── anthropic/                # Anthropic API (convert, stream, types)
│       ├── openai/                   # OpenAI-compatible API
│       └── gemini/                   # Google Gemini API
├── api/                              # API client + local processing (`stakpak-api`)
│   └── src/local/
│       ├── context_managers/         # Message history reduction strategies
│       │   ├── task_board_context_manager.rs   # Preserves individual messages
│       │   ├── simple_context_manager.rs       # Flattens history to text
│       │   └── file_scratchpad_context_manager.rs
│       └── hooks/                    # Context hooks (scratchpad, task board)
├── shared/                           # Shared types (`stakpak-shared`)
│   └── src/models/
│       ├── llm.rs                    # LLMMessage, LLMMessageContent, provider configs
│       ├── stakai_adapter.rs         # ChatMessage → StakAI Message conversion
│       └── integrations/
│           ├── openai.rs             # ChatMessage, ToolCall, Role types
│           └── mcp.rs                # MCP tool call result handling
└── mcp/                              # MCP client/server/proxy crates
    ├── client/
    ├── server/
    └── proxy/
```
The autopilot system (`stakpak autopilot` / `stakpak up`) is the self-driving infrastructure mode. It runs as a system service (launchd on macOS, systemd on Linux) and manages two runtimes:

- Schedule runtime (`cli/src/commands/watch/`): cron-driven task execution
- Channel runtime (`libs/gateway/`): Slack/Telegram/Discord message handling

One config file holds all schedules, channels, and runtime settings:

```toml
[runtime]
bind = "127.0.0.1:4096"
[[schedules]]
name = "health-check"
cron = "*/5 * * * *"
prompt = "Check system health"
[channels.slack]
bot_token = "xoxb-..."
app_token = "xapp-..."stakpak up # Start autopilot (auto-inits if needed)
stakpak down # Stop autopilot
stakpak autopilot init # Explicit setup wizard
stakpak autopilot status # Health, uptime, schedules, channels
stakpak autopilot logs # Stream logs
stakpak autopilot schedule list # List schedules
stakpak autopilot schedule add <name> --cron '...' --prompt '...'
stakpak autopilot schedule remove <name>
stakpak autopilot schedule enable|disable <name>
stakpak autopilot schedule trigger <name> # Manual fire
stakpak autopilot schedule history <name>
stakpak autopilot channel list # List channels
stakpak autopilot channel add <type> --token|--bot-token|--app-token
stakpak autopilot channel remove <type>
stakpak autopilot channel test # Test connectivity
```
| File | Purpose |
|---|---|
| `cli/src/commands/autopilot.rs` | All autopilot commands, config types, schedule/channel CRUD |
| `cli/src/commands/watch/` | Schedule runtime (cron engine, trigger execution, history) |
| `libs/gateway/` | Channel runtime (Slack/Telegram/Discord message handling) |
| `libs/gateway/src/config.rs` | `GatewayConfig`: channel config load/save |
Quickstart:

```
stakpak auth login --api-key $KEY
stakpak autopilot schedule add health --cron '0 */6 * * *' --prompt 'Check health'
stakpak autopilot channel add slack --bot-token $SLACK_BOT --app-token $SLACK_APP
stakpak up
```

Messages flow through several transformation layers before reaching the LLM API:
```
User input / Tool results
↓
Vec<ChatMessage> # OpenAI-shaped messages (cli/mode_interactive.rs)
↓ sanitize_tool_results() # Dedup + remove orphans (before context manager)
↓
ContextManager::reduce_context() # History reduction (libs/api/context_managers/)
↓ merge_consecutive_same_role() # Merge tool messages
↓ dedup_tool_results() # Deduplicate within merged messages
↓ reduce_context_with_budget() # Budget-aware trimming (if over threshold)
↓
Vec<LLMMessage> # Provider-neutral messages
↓
to_stakai_message() # libs/shared/stakai_adapter.rs
↓
Vec<StakAI Message> # Internal API format
↓
build_messages_with_caching() # libs/ai/providers/anthropic/convert.rs
↓
Vec<AnthropicMessage> # Anthropic API format → HTTP request
| Type | Location | Purpose |
|---|---|---|
| `ChatMessage` | `libs/shared/models/integrations/openai.rs` | OpenAI-shaped message (role, content, tool_calls, tool_call_id) |
| `LLMMessage` | `libs/shared/models/llm.rs` | Provider-neutral message with typed content parts |
| `LLMMessageContent` | `libs/shared/models/llm.rs` | Either `String` or `List(Vec<LLMMessageTypedContent>)` |
| `LLMMessageTypedContent` | `libs/shared/models/llm.rs` | `Text`, `ToolCall`, `ToolResult`, `Image`, `Document` |
| `AnthropicMessage` | `libs/ai/providers/anthropic/types.rs` | Anthropic API message format |
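The two `llm.rs` enums above have roughly this shape (a sketch based on the table; variant payloads are illustrative stand-ins, not the exact definitions in `llm.rs`):

```rust
// Sketch of the provider-neutral content types; payload fields are
// illustrative, not the real definitions in llm.rs.
enum LLMMessageTypedContent {
    Text { text: String },
    ToolCall { id: String, name: String, arguments: String }, // args as JSON text here
    ToolResult { tool_call_id: String, content: String },
    Image { data: String },    // real payload fields elided
    Document { data: String }, // real payload fields elided
}

enum LLMMessageContent {
    String(String),
    List(Vec<LLMMessageTypedContent>),
}
```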
`mode_interactive.rs` runs the main agent loop:

- Receive events from TUI via `output_rx` (`OutputEvent` enum)
- Process events: `UserMessage`, `AcceptTool`, `RejectTool`, `SendToolResult`, etc.
- Build message history: append to `messages: Vec<ChatMessage>`
- Sanitize: `sanitize_tool_results()` before each API call
- Send to LLM: via `client.chat_completion_stream()`
- Stream response: parse SSE events, extract tool calls
- Execute tools: pop from `tools_queue`, send to TUI for approval
The tool approval queue works as follows:

```
AI returns tool_calls [A, B, C]
↓
tools_queue = [B, C], send A to TUI
↓
TUI: AcceptTool(A) or RejectTool(A)
↓ (if accepted)
run_tool_call() → tokio::select! { result OR cancel_signal }
↓
Push tool_result(A) to messages
↓
Pop B from queue, send to TUI
↓ ... repeat ...
All tools done → fall through to API call
```
Cancel/Retry flow: when a tool is cancelled (retry/shell mode), the `AcceptTool` handler does NOT push a result if the queue is empty (the shell/retry flow will send `SendToolResult` later). If the queue is non-empty, it pushes a `TOOL_CALL_CANCELLED` placeholder to keep the message chain valid.
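A minimal sketch of that cancellation race, using a hypothetical `execute_with_cancel` wrapper and stand-in types (only `run_tool_call` and the `tokio::select!` pattern come from the codebase):

```rust
use tokio::sync::oneshot;

// Stand-in types; the real ones live in the cli crate.
struct ToolCall { name: String }
enum ToolOutcome { Completed(String), Cancelled }

async fn run_tool_call(call: &ToolCall) -> String {
    format!("ran {}", call.name) // placeholder for real tool execution
}

// Race tool execution against a cancel signal, mirroring
// tokio::select! { result OR cancel_signal } in the agent loop.
async fn execute_with_cancel(
    call: ToolCall,
    mut cancel_rx: oneshot::Receiver<()>,
) -> ToolOutcome {
    tokio::select! {
        result = run_tool_call(&call) => ToolOutcome::Completed(result),
        _ = &mut cancel_rx => ToolOutcome::Cancelled,
    }
}
```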
Code conventions:

- `unwrap()` and `expect()` are denied via `clippy.toml` workspace lints (allowed in tests)
- Use `anyhow::Result` with the `?` operator and `.context()` for application code (example below)
- Use `thiserror` for library error types
- Use `match` or `if let` for `Option` types
- Rust edition 2024 with nightly features (let chains in `if let`)
- Run `cargo fmt` before committing
- Run `cargo clippy --all-targets`; warnings should be zero
- Prefer `LLMMessage::from` over `|msg| LLMMessage::from(msg)` (clippy: redundant closure)
- Collapse nested `if` + `if let` into combined conditions where readable
- Use `std::mem::take` for efficient ownership transfer in place
- Prefer importing local types/enums in tests and module code (`use ...`) instead of verbose `crate::...` paths in struct literals
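For instance, application-level code follows this shape (illustrative function, not from the codebase):

```rust
use anyhow::{Context, Result};

// anyhow::Result + `?` + .context() instead of unwrap()/expect().
fn read_config(path: &std::path::Path) -> Result<String> {
    std::fs::read_to_string(path)
        .with_context(|| format!("failed to read config at {}", path.display()))
}
```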
Rust strings are UTF-8. Characters can be 1–4 bytes, so never slice with a raw byte index (`&s[..80]`, `&s[..n-3]`): it panics if the index lands mid-character. Safe approaches:
```rust
// ✅ Truncate by character count
let truncated: String = s.chars().take(80).collect();
// ✅ Validate boundary before slicing (when you need byte-position slicing)
let mut end = max_bytes;
while end > 0 && !s.is_char_boundary(end) { end -= 1; }
let truncated = &s[..end];Indices from .find() / .rfind() on the same string are always safe. See cli/src/commands/watch/commands/run.rs:truncate_string() for the canonical pattern.
Testing conventions:

- Tests live in `#[cfg(test)] mod tests` at the bottom of each file
- Use `#[tokio::test]` for async tests
- Helpers like `assistant_with_tool_calls()`, `tool_message()` abstract test setup
- Assertion helpers like `assert_no_duplicate_tool_results()` encode invariants
Naming conventions:

- Context managers: `<Strategy>ContextManager` (e.g., `TaskBoardContextManager`)
- Event enums: `InputEvent` (TUI → backend), `OutputEvent` (backend → TUI)
- Tool results: `tool_result(id, content)` helper function
- Functions: `snake_case` with descriptive verbs (`sanitize_tool_results`, `merge_consecutive_same_role`)
The Anthropic API imposes strict constraints on message sequences:

- Strictly alternating roles: `user`/`assistant` must alternate; consecutive same-role messages are rejected (400)
- Tool results: `role=tool` messages are converted to `role=user` with `tool_result` content blocks
- Each `tool_use` needs exactly one `tool_result`: duplicates or missing results cause 400 errors
- `tool_result` must reference a `tool_use` in the immediately preceding assistant message
- Cache control (`ephemeral` breakpoints) is added in `build_messages_with_caching()`, not upstream
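For illustration, a valid single-tool turn has this shape (JSON built with `serde_json`; block fields follow Anthropic's public tool-use format, not code from this repo):

```rust
use serde_json::json;

// Alternating roles, with exactly one tool_result answering the tool_use
// from the immediately preceding assistant message.
fn example_sequence() -> serde_json::Value {
    json!([
        { "role": "user", "content": "Check disk usage" },
        { "role": "assistant", "content": [
            { "type": "tool_use", "id": "toolu_1", "name": "run_command",
              "input": { "command": "df -h" } }
        ]},
        { "role": "user", "content": [
            { "type": "tool_result", "tool_use_id": "toolu_1", "content": "93% used" }
        ]}
    ])
}
```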
The codebase uses three layers to prevent invalid message sequences:
- Source prevention (`mode_interactive.rs`): don't push cancelled tool_results when retry will send the real one
- Pre-API sanitization (`sanitize_tool_results`): dedup and remove orphans from `Vec<ChatMessage>` before every API call
- Context manager (`task_board_context_manager.rs`): merge consecutive same-role messages and dedup tool_results in the `reduce_context()` pipeline
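The sanitization idea in miniature (a simplified sketch with a stand-in message type; the real `sanitize_tool_results` operates on `ChatMessage` and handles more cases):

```rust
use std::collections::HashSet;

// Stand-in for ChatMessage: assistant messages list the tool calls they
// issue; tool messages carry the id they answer.
struct Msg {
    role: String,
    tool_call_id: Option<String>,
    issued_calls: Vec<String>,
}

// Keep a tool message only if it answers a known call (no orphans) and no
// earlier message already answered it (no duplicates).
fn sanitize(messages: Vec<Msg>) -> Vec<Msg> {
    let mut issued: HashSet<String> = HashSet::new();
    let mut answered: HashSet<String> = HashSet::new();
    messages
        .into_iter()
        .filter(|m| {
            issued.extend(m.issued_calls.iter().cloned());
            match (m.role.as_str(), &m.tool_call_id) {
                ("tool", Some(id)) => issued.contains(id) && answered.insert(id.clone()),
                _ => true,
            }
        })
        .collect()
}
```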
Long sessions accumulate messages that approach the context window limit. The `TaskBoardContextManager` implements budget-aware trimming:
- Lazy trimming: only triggers when estimated tokens exceed `context_window × threshold` (default 80%)
- Stable prefix: trimmed messages are replaced with `[trimmed]` placeholders, preserving message structure (roles, tool_call_ids) for API validity
- Cache-friendly: the trimmed prefix produces identical output across turns, so Anthropic's prompt cache stays valid
- Metadata persistence: trimming state (`trimmed_up_to_message_index`) is stored in `CheckpointState.metadata` and flows through `CheckpointState.metadata` → `AgentState.metadata` → hook updates → `save_checkpoint()` → persisted
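A rough sketch of the budget check and placeholder trimming (the token heuristic and names here are illustrative; the real logic lives in `reduce_context_with_budget()`):

```rust
struct Msg { role: String, content: String }

// Crude token estimate: roughly four characters per token.
fn estimate_tokens(messages: &[Msg]) -> usize {
    messages.iter().map(|m| m.content.len() / 4 + 4).sum()
}

// Replace the oldest messages' content with a placeholder until the estimate
// fits the budget. Roles (and, in the real code, tool_call_ids) are kept so
// the sequence stays valid for the API and the prefix stays cache-stable.
fn trim_to_budget(messages: &mut [Msg], context_window: usize, threshold: f64) {
    let budget = (context_window as f64 * threshold) as usize;
    let mut i = 0;
    while estimate_tokens(messages) > budget && i < messages.len() {
        messages[i].content = "[trimmed]".to_string();
        i += 1;
    }
}
```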
Key files:
- `libs/api/src/local/context_managers/task_board_context_manager.rs`: `reduce_context_with_budget()`, `estimate_tokens()`, `trim_message()`
- `libs/api/src/local/hooks/task_board_context/mod.rs`: wires budget-aware trimming into the hook lifecycle
- `libs/api/src/storage.rs`: `CheckpointState.metadata` field
- `libs/api/src/models.rs`: `AgentState.metadata` field
```
# Build
cargo build # debug
cargo build --release # release
# Test
cargo test --workspace # all tests
cargo test --workspace --lib # library tests only
cargo test --bin stakpak # binary tests only
cargo test -p stakpak-api # single crate
cargo test -- test_name # by name pattern
# Lint
cargo fmt --check
cargo clippy --all-targets
# Quick check (no codegen)
cargo check
```

The `stakpak auth login` command supports non-interactive setup for CI/scripts:
```
# Stakpak API (remote provider, default)
stakpak auth login --api-key $STAKPAK_API_KEY
# Local providers (BYOK)
stakpak auth login --provider anthropic --api-key $ANTHROPIC_API_KEY
stakpak auth login --provider openai --api-key $OPENAI_API_KEY
stakpak auth login --provider gemini --api-key $GEMINI_API_KEY
```

This creates:
- `~/.stakpak/config.toml` with `default` + `readonly` profiles
- `~/.stakpak/auth.toml` for local provider credentials
Full non-interactive autopilot setup:
```
stakpak auth login --api-key $STAKPAK_API_KEY
stakpak autopilot init --non-interactive --yes
stakpak autopilot schedule add daily-check --cron '0 9 * * *' --prompt 'Run health checks'
stakpak autopilot channel add slack --bot-token $SLACK_BOT --app-token $SLACK_APP
stakpak up
```

Key files:
- `cli/src/commands/auth/login.rs`: `handle_non_interactive_setup()`
- `cli/src/commands/autopilot.rs`: `setup_autopilot()`, `start_autopilot()`, schedule/channel CRUD
- `cli/src/onboarding/save_config.rs`: `save_to_profile()` + `update_readonly()`
- `cli/src/config/profile.rs`: `readonly_profile()` creates a sandbox replica of the default profile