Kubernetes communications controller for agent workspaces. Manages LLM calls and channel connections via the controller + Job pattern. Credentials never leave ephemeral Job pods.
Three components:

- Controller -- k8s controller, one per workspace namespace. Serves gRPC. Watches `TightbeamModel` and `TightbeamChannel` CRDs. Creates and manages LLM Jobs and Channel Jobs. Owns conversation history (PVC-backed NDJSON).
- LLM Job -- stateless Job pod. Connects to the controller via gRPC, pulls a turn assignment (long-poll), reads the API key from a kubelet-mounted Secret, calls the LLM provider, streams the response back. Session-scoped keepalive: the Job loops on `GetTurn` until an idle timeout fires, then exits.
- Channel Job -- holds an outbound connection to a messaging platform (Discord, Slack). Bot token mounted by kubelet. Forwards inbound messages to the controller, receives agent responses and sends them to the channel.
The controller is the only gRPC server. Everything else connects back to it as a client.
AI agents running in containers need to call LLM APIs, but giving them API keys means:
- Credential exposure -- a compromised agent leaks your API key
- No audit trail -- the agent calls whatever it wants with your credentials
- No conversation control -- the agent manages its own context window
Tightbeam solves this by isolating credentials inside ephemeral Job pods. The controller never sees API keys. It references k8s Secrets by name in Job specs; kubelet mounts them into the pod. The agent runtime (Transponder) knows nothing about keys, models, or providers.
Use Airlock for MCP tool isolation. Use Tightbeam for LLM API isolation.
```
                  gRPC
Transponder ──────────────> Controller ─────> Conversation Log (PVC)
                                 │
                           gRPC  │ creates k8s Jobs
                 ┌───────────────┤
                 │               │
              LLM Job       Channel Job
             (api key       (bot token
              mounted)       mounted)
                 │               │
                 v               v
          Anthropic API    Discord/Slack
```
The controller watches CRDs to know which models and channels are available. When a turn arrives, it enqueues a `TurnAssignment`. The LLM Job pulls it via `GetTurn` (blocking long-poll), calls the LLM, and streams results back via `StreamTurnResult`. The controller appends the response to conversation history and forwards events to the caller.
Declares an available LLM model. The controller creates LLM Jobs from these.
```yaml
apiVersion: tightbeam.dev/v1
kind: TightbeamModel
metadata:
  name: claude-sonnet
  namespace: workspace-my-ws
spec:
  provider: anthropic
  model: claude-sonnet-4-20250514
  description: "Fast, capable model for code tasks"
  maxTokens: 8192
  secretName: llm-anthropic-key
  image: ghcr.io/calebfaruki/tightbeam-llm-job:latest
  idleTimeout: 300
```

The `secretName` references a k8s Secret containing `provider`, `model`, `api-key`, and optionally `max-tokens` as individual keys. Kubelet mounts it into the LLM Job at `/run/secrets/llm/`.
Declares a channel connection. The controller creates Channel Jobs from these.
```yaml
apiVersion: tightbeam.dev/v1
kind: TightbeamChannel
metadata:
  name: discord-bot
  namespace: workspace-my-ws
spec:
  type: discord
  secretName: discord-bot-token
  image: ghcr.io/calebfaruki/tightbeam-channel-discord:latest
  targetModel: claude-sonnet
```

Single service: `tightbeam.v1.TightbeamController`. Proto definition at `crates/tightbeam-proto/proto/tightbeam/v1/tightbeam.proto`.
| RPC | Caller | Description |
|---|---|---|
| `GetTurn` | LLM Job | Long-poll. Blocks until a turn is ready. Job sets gRPC deadline as idle timeout. |
| `StreamTurnResult` | LLM Job | Streams response chunks (content deltas, tool calls) back to the controller. |
| `Turn` | Transponder | Sends messages, receives streaming LLM response events. |
| `ListModels` | Transponder | Returns available models from CRDs. |
| `ChannelStream` | Channel Job | Bidirectional stream. Inbound user messages in, agent responses out. |
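Inferred from the table, the service surface looks roughly like this. The request/response message names and exact streaming shapes here are assumptions for illustration; the authoritative definition lives in `tightbeam.proto`:

```proto
service TightbeamController {
  // LLM Job side: pull work, push results
  rpc GetTurn(GetTurnRequest) returns (TurnAssignment);
  rpc StreamTurnResult(stream TurnResultChunk) returns (TurnResultAck);

  // Transponder side: submit a turn, stream events back
  rpc Turn(TurnRequest) returns (stream TurnEvent);
  rpc ListModels(ListModelsRequest) returns (ListModelsResponse);

  // Channel Job side: inbound user messages in, agent responses out
  rpc ChannelStream(stream ChannelEvent) returns (stream ChannelEvent);
}
```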
1. Transponder calls `Turn` with new messages
2. Controller appends messages to conversation history
3. Controller builds `TurnAssignment` from full history and enqueues it
4. LLM Job's `GetTurn` resolves with the assignment
5. LLM Job calls the LLM provider, streams chunks via `StreamTurnResult`
6. Controller forwards chunks as `TurnEvent`s on the `Turn` response stream
7. Controller appends assistant message to conversation log
8. If `tool_use`: transponder executes tools locally, sends results in a new `Turn`
9. If `end_turn` / `max_tokens`: turn complete
```proto
message Message {
  string role = 1;
  repeated ContentBlock content = 2;
  repeated ToolCall tool_calls = 3;
  optional string tool_call_id = 4;
  optional bool is_error = 5;
  optional string agent = 6;
}

message TurnAssignment {
  optional string system = 1;
  repeated ToolDefinition tools = 2;
  repeated Message messages = 3;
  ModelConfig model_config = 4;
}

message TurnResultChunk {
  oneof chunk {
    ContentDelta content_delta = 1;
    ToolUseStart tool_use_start = 2;
    ToolUseInput tool_use_input = 3;
    TurnComplete complete = 4;
    TurnError error = 5;
  }
}
```

`ToolDefinition.parameters_json` and `ToolCall.input_json` are JSON strings, not protobuf `Struct`. `ImageBlock.data` is raw bytes, not base64. The LLM Job handles provider-specific encoding.
The controller owns the conversation. It persists every message to NDJSON on a PVC. On restart, it rebuilds from the log.
Multi-agent support: messages carry an optional `agent` field. When multiple agents have contributed, `history_for_provider()` prefixes assistant messages with `[agent_name]:` so the LLM knows who said what. Router responses go to a separate `router.ndjson` audit log and are excluded from conversation history.
1. Controller creates a k8s Job referencing the model's Secret by name
2. Kubelet mounts the Secret at `/run/secrets/llm/` inside the pod
3. Job starts, reads the API key from the mounted file, connects to the controller
4. Job calls `GetTurn` -- blocks until work arrives
5. Job calls the LLM provider, streams the response back via `StreamTurnResult`
6. Job loops back to step 4
7. If no work arrives before the gRPC deadline, the Job exits
8. The TTL controller cleans up the completed pod after 30 seconds
9. On the next turn, the controller creates a fresh Job if none is connected
The API key exists only in the ephemeral pod's memory and mounted tmpfs. It never appears in gRPC messages, controller memory, or Job spec env vars.
The k8s Secret referenced by `TightbeamModel.spec.secretName` must contain these keys:

```
provider   -> "anthropic"
model      -> "claude-sonnet-4-20250514"
api-key    -> "sk-ant-..."
max-tokens -> "8192"   # optional, defaults to 8192
```

Values are trimmed of whitespace. Missing `provider`, `model`, or `api-key` is a hard error.
The controller ServiceAccount has zero Secret read access:

```yaml
rules:
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["create", "get", "list", "watch", "delete"]
  - apiGroups: ["tightbeam.dev"]
    resources: ["tightbeammodels", "tightbeamchannels"]
    verbs: ["get", "list", "watch"]
```

Secrets are referenced by name in Job specs. Kubelet handles the mount. The controller never touches credential bytes.
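Concretely, the Job spec the controller emits would carry only the Secret's name; kubelet resolves it at pod startup. A hedged sketch of what such a spec might look like (field values follow the examples in this document; the controller's actual output may differ):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: llm-job-claude-sonnet
  namespace: workspace-my-ws
spec:
  ttlSecondsAfterFinished: 30   # completed pods cleaned up after 30s
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: llm-job
          image: ghcr.io/calebfaruki/tightbeam-llm-job:latest
          volumeMounts:
            - name: llm-secret
              mountPath: /run/secrets/llm
              readOnly: true
      volumes:
        - name: llm-secret
          secret:
            secretName: llm-anthropic-key   # by name only; no bytes pass through the controller
```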
- API keys and bot tokens never appear in gRPC messages
- API keys and bot tokens never appear in controller memory
- Credentials only exist in ephemeral Job pods, mounted by kubelet
- Controller RBAC has zero Secret read access
- Job TTL ensures completed pods are cleaned up (30 seconds)
- Each Job mounts exactly one Secret (one credential, one blast radius)
- All images are FROM scratch with musl static builds
- All images signed with cosign (keyless, sigstore)
```
crates/
  tightbeam-providers/   # LLM provider abstraction + shared types
  tightbeam-proto/       # gRPC proto definitions (tightbeam.v1)
  tightbeam-controller/  # k8s controller binary
  tightbeam-llm-job/     # LLM Job binary
```
Container images are published to GHCR on each release:

```
ghcr.io/calebfaruki/tightbeam-controller:latest
ghcr.io/calebfaruki/tightbeam-llm-job:latest
```
- Apply CRDs: `kubectl apply -f deploy/crds/`
- Apply RBAC: `kubectl apply -f deploy/rbac.yaml`
- Deploy the controller (via Helm chart from sycophant, or manually)
- Create `TightbeamModel` and `TightbeamChannel` resources in the workspace namespace