PoC: OpenShell integration #393
josh-pritchard wants to merge 4 commits into agentregistry-dev:main
Pull request overview
Proof-of-concept integration of OpenShell as a third deployment platform (alongside local and kubernetes), including vendored OpenShell gRPC protos/clients, adapter wiring, provider seeding, UI provider selection, and E2E scaffolding. The PR also includes an unrelated UI static-export routing change in the API server.
Changes:
- Add `openshell` deployment adapter + gRPC client (mTLS + gateway discovery) with unit tests.
- Vendor OpenShell protos + generated Go stubs, plus a Makefile sync target and a DB seed migration for `openshell-default`.
- Update UI deploy dialog to select a provider dynamically; extend E2E deploy targets and local docker-compose dev config.
Reviewed changes
Copilot reviewed 24 out of 25 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `internal/registry/platforms/openshell/client.go` | OpenShell gRPC client (endpoint discovery + mTLS) and basic sandbox operations |
| `internal/registry/platforms/openshell/client_test.go` | Unit tests for gateway metadata / TLS loading and small helpers |
| `internal/registry/platforms/openshell/deployment_adapter.go` | OpenShell DeploymentPlatformAdapter implementation (deploy, undeploy, logs, cancel) |
| `internal/registry/platforms/openshell/deployment_adapter_test.go` | Mock-based unit tests for adapter behavior (deploy polling, etc.) |
| `internal/registry/platforms/openshell/provider_config.go` | Provider-level config type for the OpenShell platform |
| `internal/registry/platforms/openshell/proto/OPENSHELL_PROTO_VERSION` | Pinned upstream proto version marker |
| `internal/registry/platforms/openshell/proto/openshell.proto` | Vendored OpenShell service proto |
| `internal/registry/platforms/openshell/proto/datamodel.proto` | Vendored OpenShell datamodel proto |
| `internal/registry/platforms/openshell/proto/sandbox.proto` | Vendored sandbox policy proto |
| `internal/registry/platforms/openshell/proto/inference.proto` | Vendored inference proto |
| `internal/registry/platforms/openshell/proto/gen/openshell.pb.go` | Generated Go stubs for openshell.proto |
| `internal/registry/platforms/openshell/proto/gen/openshell_grpc.pb.go` | Generated Go gRPC stubs for OpenShell service |
| `internal/registry/platforms/openshell/proto/gen/datamodel.pb.go` | Generated Go stubs for datamodel.proto |
| `internal/registry/platforms/openshell/proto/gen/sandbox.pb.go` | Generated Go stubs for sandbox.proto |
| `internal/registry/platforms/openshell/proto/gen/inference.pb.go` | Generated Go stubs for inference.proto |
| `internal/registry/platforms/openshell/proto/gen/inference_grpc.pb.go` | Generated Go gRPC stubs for Inference service |
| `internal/registry/registry_app.go` | Wire openshell into the deployment platform adapter map |
| `internal/registry/api/handlers/v0/provider_adapters.go` | Register openshell provider adapter |
| `internal/registry/database/migrations/011_seed_openshell_provider.sql` | Seed openshell-default provider |
| `internal/daemon/docker-compose.yml` | Local dev env vars + mTLS mount for OpenShell gateway access |
| `Makefile` | Add sync-openshell-proto target to fetch protos and regenerate stubs |
| `e2e/deploy_test.go` | Add OpenShell deploy targets + helpers for image loading / sandbox verification |
| `ui/components/deploy-server-dialog.tsx` | UI deploy dialog: provider dropdown (instead of hardcoded local) |
| `ui/lib/platform-display.ts` | Platform display mapping for UI labels/descriptions |
| `internal/registry/api/server.go` | SPA/static-export routing logic for embedded UI assets |

            return nil, fmt.Errorf("deployment is required: %w", database.ErrInvalidInput)
        }
        sandboxName := sandboxNameForDeployment(deployment)
        return client.GetSandboxLogs(ctx, sandboxName)
    }

GetLogs() passes the sandbox name into Client.GetSandboxLogs(), but the gRPC request uses GetSandboxLogsRequest.sandbox_id (see grpcClient.GetSandboxLogs). Unless OpenShell treats name == id, this will fail to fetch logs. Suggested fix: persist the sandbox ID in DeploymentActionResult.ProviderMetadata during Deploy (from CreateSandbox response) and use that for log retrieval, or resolve name->id via GetSandbox before calling GetSandboxLogs.

    // Try the exact path first (handles static assets like .js, .css, etc.)
    if f, err := httpFS.Open(path); err == nil {
        f.Close()
        // Check if it's a file (not a directory); serve it directly
        if stat, err := f.(interface{ Stat() (fs.FileInfo, error) }).Stat(); err == nil && !stat.IsDir() {
            fileServer.ServeHTTP(w, r)

The UI handler’s existence check is broken: it calls httpFS.Open(r.URL.Path) with a leading "/" (http.FS expects paths without the leading slash), and it closes the file before calling Stat() on it. As written, this will fail to detect/serve existing files and can incorrectly fall through to the 404 or SPA fallback. Fix by normalizing the path (e.g., strings.TrimPrefix(path, "/")) for all Open() calls, and only closing after Stat() (use a deferred close).

        slog.Info("openshell: deploy started", "server", req.ServerName, "provider", req.ProviderID)
        client, err := a.getClient()
        if err != nil {
            return nil, err
        }
        slog.Info("openshell: client ready")
        if err := utils.ValidateDeploymentRequest(req, false); err != nil {
            return nil, err
        }

Deploy() logs req.ServerName/req.ProviderID before validating that req is non-nil. ValidateDeploymentRequest already checks for nil, but it’s called after this log line, so a nil request will panic. Move validation before any field access/logging of req.

Suggested change:

        if err := utils.ValidateDeploymentRequest(req, false); err != nil {
            return nil, err
        }
        slog.Info("openshell: deploy started", "server", req.ServerName, "provider", req.ProviderID)
        client, err := a.getClient()
        if err != nil {
            return nil, err
        }
        slog.Info("openshell: client ready")


    func (a *openshellDeploymentAdapter) Undeploy(_ context.Context, deployment *models.Deployment) error {
        client, err := a.getClient()
        if err != nil {
            return err
        }
        if err := utils.ValidateDeploymentRequest(deployment, true); err != nil {
            return err
        }

        ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
        defer cancel()

Undeploy() discards the caller’s context and always uses context.Background() with a new timeout. This prevents request cancellation/deadlines from propagating (and differs from the local/kubernetes adapters which use the passed ctx). Prefer deriving the timeout from the incoming ctx (context.WithTimeout(ctx, …)).
I was able to partially get it to work:

Deployment works: you can deploy an agent to OpenShell. However, for whatever reason I wasn't able to set the

Anyway, you can still make it work manually after that:

Then, start a port forward to the sandbox:

And then you can send the A2A request to localhost:9999 and that will work.

Interestingly, if I provide the Dockerfile directly to the sandbox create command, that works without issues:
Summary
Proof-of-concept integration of OpenShell as a third deployment platform alongside `local` (Docker Compose) and `kubernetes`. OpenShell provides secure, sandboxed agent execution with defense-in-depth isolation (Landlock LSM, seccomp-bpf, network namespaces, inference routing).

This PoC validates the end-to-end flow: register agent → build image → deploy to OpenShell sandbox → sandbox reaches READY. It is not merge-ready; see "Not working / Known gaps" below.
What's working
- `client.go`: Connects to an OpenShell gateway with mTLS authentication. Supports endpoint discovery via env vars (`OPENSHELL_GATEWAY_ENDPOINT`) or filesystem config (`~/.config/openshell/gateways/`). Client initialization is lazy, so the AR server starts fine without OpenShell installed.
- `deployment_adapter.go`: Full `DeploymentPlatformAdapter` implementation: Deploy, Undeploy, GetLogs, Cancel. Polls `GetSandbox` until the sandbox reaches the READY phase (120s timeout).
- `openshell` platform registered in provider adapters and the deployment platform map. A DB migration seeds an `openshell-default` provider.
- `make sync-openshell-proto` fetches protos from NVIDIA/OpenShell at a pinned version and generates Go stubs. Both protos and generated code are checked in so builds don't require protoc.
- UI deploy dialog selects a provider from a dropdown instead of a hardcoded `providerId: "local"`. Platform display names map `openshell` → "OpenShell".

Not working / Known gaps
- Images need `iproute2` (for the `ip` binary) and a `sandbox` user/group. Standard agent images will fail without these. See "Image requirements" below.
- `arctl deploy create` blocks synchronously for up to 120s while the sandbox provisions. The arctl HTTP client can time out before the server-side deploy completes. Needs an async deploy pattern (return immediately, poll status).
- `arctl agent run` only works with the local Docker Compose platform. There is no `deploy invoke` or `deploy chat` command.
- Unrelated change to `server.go` for Next.js static export routing (`.html` suffix resolution, SPA fallback). Should be split into a separate PR before merge.

Image requirements for OpenShell
OpenShell's supervisor binary is sideloaded into user containers at runtime. It requires:
- `iproute2` package: supervisor shells out to `ip` for network namespace creation (veth pairs, netns). Without it, sandbox creation fails with `ENOENT`.
- `iptables` package (optional): enables network bypass detection.
- `sandbox` user and group: supervisor drops privileges to this user after setup.

For Alpine/Wolfi-based images (like `kagent-adk`):

These requirements should eventually be handled automatically by `arctl agent build` when targeting OpenShell, rather than requiring manual Dockerfile changes.

Why OpenShell is a good fit for AR

- `local` platform (Docker Compose) has zero isolation. OpenShell provides Landlock, seccomp, network namespaces, and inference routing out of the box: same UX, defense-in-depth security.
- `local` nor `kubernetes` offers.

Local testing steps