redhat-developer · durandom · Jun 9, 2026 · Jun 9, 2026
@@ -1,6 +1,7 @@
 # rhdh-fullsend
 
-Custom fullsend sandbox images for the RHDH team's agent infrastructure.
+Custom fullsend sandbox images and deployment documentation for the RHDH
+team's agent infrastructure.
 
 ## Why this repo exists
 
@@ -14,6 +15,16 @@ workaround (a `host_files`-mounted shell script) is fragile.
 This repo builds a single image that extends `fullsend-code:latest` with
 corepack and yarn pre-activated.
 
+## Documentation
+
+| Doc | What it covers |
+|-----|---------------|
+| [Local Setup](docs/local-setup.md) | Podman VM, OpenShell gateway, GCP credentials, running agents locally |
+| [Repo Onboarding](docs/repo-onboarding.md) | Installing fullsend on a new RHDH repo (standard and manual methods) |
+| [GCP Infrastructure](docs/gcp-infrastructure.md) | GCP project, WIF providers, IAM, service accounts |
+| [Sandbox Networking](docs/sandbox-networking.md) | DNS inside OpenShell sandboxes — why it fails, workarounds |
+| [Known Issues](docs/known-issues.md) | Active friction points, workarounds, upstream tracking |
+
 ## Image
 
 ```
@@ -52,8 +63,7 @@ This replaces the `sandbox-yarn-setup.sh` + `host_files` workaround.
 
 ## Local agent runs
 
-See [docs/local-setup.md](docs/local-setup.md) for the full guide: Podman VM,
-OpenShell gateway, GCP credentials, SSH tunnel, and running agents end-to-end.
+See [Local Setup](docs/local-setup.md) for running agents on macOS.
 
 ## Local build
 

@@ -0,0 +1,233 @@
+# GCP Infrastructure
+
+GCP project, Workload Identity Federation, IAM, and service account reference
+for the RHDH fullsend setup.
+
+## Project context
+
+| Field | Value |
+|-------|-------|
+| GCP project ID | `rhdh-sidekick-167988` |
+| GCP project number | `189673402608` |
+| Vertex AI region | `us-east5` |
+| WIF pool | `fullsend-pool` (ACTIVE) |
+| IAM admin group | `rhdh-sidekick@redhat.com` |
+
+The project lives under `IT Public Cloud > Sandbox > Customers` in the GCP
+org hierarchy. The admin group has `iam.workloadIdentityPoolAdmin`,
+`iam.serviceAccountAdmin`, and `iam.serviceAccountKeyAdmin` — sufficient to
+self-provision WIF providers and service accounts without fullsend team
+involvement.
+
+**Conditional IAM restriction:** The `projectIamAdmin` role on this project
+is restricted to granting only `roles/aiplatform.user`:
+
+```
+expression: api.getAttribute('iam.googleapis.com/modifiedGrantsByRole',
+  []).hasOnly(['roles/aiplatform.user'])
+```
+
+This means you cannot grant yourself additional roles or enable APIs. All
+changes beyond `aiplatform.user` must go through IT (ServiceNow ticket).
+
+## WIF providers
+
+Each repo gets its own WIF provider, scoped via `attribute-condition` to
+that specific repository.
+
+### Current providers
+
+| Provider | Repo scope | State |
+|----------|-----------|-------|
+| `gh-redhat-developer-rhdh-agentic` | `redhat-developer/rhdh-agentic` | ACTIVE |
+| `gh-redhat-developer-rhdh-plugins` | `redhat-developer/rhdh-plugins` | ACTIVE |
+| `gh-rhdeveloper-plugin-export` | `redhat-developer/rhdh-plugin-export-overlays` | ACTIVE |
+
+### Creating a new WIF provider
+
+```bash
+PROVIDER_NAME="gh-redhat-developer-<repo>"  # max 32 chars
+PROVIDER_PATH="projects/189673402608/locations/global/workloadIdentityPools/fullsend-pool/providers/${PROVIDER_NAME}"
+
+gcloud iam workload-identity-pools providers create-oidc "$PROVIDER_NAME" \
+  --location=global \
+  --workload-identity-pool=fullsend-pool \
+  --project=rhdh-sidekick-167988 \
+  --issuer-uri=https://token.actions.githubusercontent.com \
+  --allowed-audiences="fullsend-mint,https://iam.googleapis.com/${PROVIDER_PATH}" \
+  --attribute-mapping="google.subject=assertion.sub,attribute.actor=assertion.actor,attribute.repository=assertion.repository,attribute.repository_owner=assertion.repository_owner" \
+  --attribute-condition="assertion.repository == '<org>/<repo>'"
+```
+
+### Dual-audience requirement
+
+Two audiences are required in `--allowed-audiences`:
+
+| Audience | Used by | Step |
+|----------|---------|------|
+| `fullsend-mint` | Mint token exchange | GitHub OIDC → fullsend session token |
+| `https://iam.googleapis.com/projects/189673402608/.../providers/<name>` | `google-github-actions/auth` | GCP credential setup for Vertex AI |
+
+Omitting the second audience causes an `audience mismatch` error at the
+"Setup GCP" step in the workflow. The `fullsend admin install` CLI sets
+both automatically; manual provider creation must include both.
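For manual provider creation, the two-audience value can be assembled by a small helper so the second audience is not forgotten. The following is a minimal sketch, not part of fullsend or `gcloud`: `build_allowed_audiences` is a hypothetical name, and the project number and pool are copied from the tables above.

```shell
#!/usr/bin/env bash
# Sketch: build the --allowed-audiences value for a WIF provider.
# build_allowed_audiences is a hypothetical helper (an assumption for illustration).

PROJECT_NUMBER="189673402608"   # from the project context table
POOL="fullsend-pool"

build_allowed_audiences() {
  local provider_name="$1"
  # Provider names are limited to 32 characters (see the create command above).
  if [ "${#provider_name}" -gt 32 ]; then
    echo "provider name longer than 32 chars: ${provider_name}" >&2
    return 1
  fi
  local provider_path="projects/${PROJECT_NUMBER}/locations/global/workloadIdentityPools/${POOL}/providers/${provider_name}"
  # First audience: mint token exchange. Second: google-github-actions/auth.
  printf 'fullsend-mint,https://iam.googleapis.com/%s\n' "$provider_path"
}
```

The output can be passed directly as `--allowed-audiences="$(build_allowed_audiences "$PROVIDER_NAME")"` in either the `create-oidc` or `update-oidc` command.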
+
+### IAM binding
+
+The existing `aiplatform.user` binding covers all `redhat-developer` repos
+via the `attribute.repository_owner` principal set:
+
+```
+principalSet://iam.googleapis.com/projects/189673402608/locations/global/workloadIdentityPools/fullsend-pool/attribute.repository_owner/redhat-developer
+```
+
+No per-repo IAM binding is needed after the initial setup.
+
+## Service accounts
+
+Service accounts are for local agent runs, not CI. See also
+[Local Setup — GCP Credentials](local-setup.md#step-3-gcp-credentials).
+
+### Creating a service account
+
+```bash
+gcloud iam service-accounts create fullsend-local \
+  --display-name="Fullsend local agent runner" \
+  --project=rhdh-sidekick-167988
+```
+
+There is a propagation delay of a few seconds before the SA can be used in
+IAM bindings.
+
+### Granting Vertex AI access
+
+```bash
+gcloud projects add-iam-policy-binding rhdh-sidekick-167988 \
+  --member="serviceAccount:fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com" \
+  --role="roles/aiplatform.user" \
+  --condition=None
+```
+
+`--condition=None` is required because the project has conditional IAM
+bindings. Without it, `gcloud` prompts interactively.
+
+### Creating a JSON key
+
+```bash
+gcloud iam service-accounts keys create \
+  ~/.config/fullsend/fullsend-local-credentials.json \
+  --iam-account=fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com
+
+chmod 600 ~/.config/fullsend/fullsend-local-credentials.json
+```
+
+The key file contains a private key. Do not commit it to git or share it via
+Slack. If the key is compromised, delete it:
+
+```bash
+KEY_ID=$(python3 -c "import json,sys; print(json.load(sys.stdin)['private_key_id'])" \
+  < ~/.config/fullsend/fullsend-local-credentials.json)
+gcloud iam service-accounts keys delete "$KEY_ID" \
+  --iam-account=fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com
+```
+
+### Per-person service accounts
+
+For individual usage tracking, create per-person SAs:
+
+```bash
+NAME="fullsend-local-<name>"  # kebab-case, max 30 chars
+
+gcloud iam service-accounts create "$NAME" \
+  --display-name="Fullsend local – <Name>" \
+  --project=rhdh-sidekick-167988
+
+gcloud projects add-iam-policy-binding rhdh-sidekick-167988 \
+  --member="serviceAccount:${NAME}@rhdh-sidekick-167988.iam.gserviceaccount.com" \
+  --role="roles/aiplatform.user" \
+  --condition=None
+
+gcloud iam service-accounts keys create "/tmp/${NAME}-credentials.json" \
+  --iam-account="${NAME}@rhdh-sidekick-167988.iam.gserviceaccount.com"
+```
+
+Share the key file securely (Bitwarden, 1Password — never Slack or email)
+and delete the local copy.
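Because `gcloud iam service-accounts create` rejects invalid IDs only at call time, it can help to validate the per-person name first. This is a minimal sketch: `valid_sa_id` is a hypothetical helper, the 30-character maximum comes from the comment above, and the 6-character minimum and allowed character set reflect my understanding of GCP's service account ID rules and should be checked against current documentation.

```shell
#!/usr/bin/env bash
# Sketch: pre-check a service account ID before calling gcloud.
# valid_sa_id is a hypothetical helper (an assumption for illustration).

valid_sa_id() {
  local LC_ALL=C            # keep [a-z] from matching uppercase in some locales
  local id="$1"
  # Assumed pattern: starts with a lowercase letter, then lowercase
  # letters, digits, or dashes, and does not end with a dash.
  local pattern='^[a-z]([-a-z0-9]*[a-z0-9])$'
  # Length limits: 6 to 30 characters.
  if [ "${#id}" -lt 6 ] || [ "${#id}" -gt 30 ]; then
    return 1
  fi
  [[ "$id" =~ $pattern ]]
}
```

For example, `valid_sa_id "$NAME" || echo "invalid SA name: $NAME" >&2` could run before the `create` command.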
+
+### Key rotation
+
+Create a new key before deleting the old one to avoid downtime:
+
+```bash
+gcloud iam service-accounts keys create \
+  ~/.config/fullsend/fullsend-local-credentials-new.json \
+  --iam-account=fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com
+
+# Test with the new key, then:
+OLD_KEY_ID=$(python3 -c "import json,sys; print(json.load(sys.stdin)['private_key_id'])" \
+  < ~/.config/fullsend/fullsend-local-credentials.json)
+gcloud iam service-accounts keys delete "$OLD_KEY_ID" \
+  --iam-account=fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com
+
+mv ~/.config/fullsend/fullsend-local-credentials-new.json \
+  ~/.config/fullsend/fullsend-local-credentials.json
+```
+
+## IAM troubleshooting
+
+### "Permission 'aiplatform.endpoints.predict' denied"
+
+The WIF principal has no `roles/aiplatform.user` binding. Verify:
+
+```bash
+gcloud projects get-iam-policy rhdh-sidekick-167988 \
+  --flatten="bindings[].members" \
+  --filter="bindings.members:principalSet" \
+  --format="table(bindings.role, bindings.members)"
+```
+
+If the binding is missing, add it using the owner-level principal set (covers
+all repos under the `redhat-developer` GitHub organization):
+
+```bash
+gcloud projects add-iam-policy-binding rhdh-sidekick-167988 \
+  --role="roles/aiplatform.user" \
+  --member="principalSet://iam.googleapis.com/projects/189673402608/locations/global/workloadIdentityPools/fullsend-pool/attribute.repository_owner/redhat-developer" \
+  --condition=None
+```
+
+### Installer claims success but binding is missing
+
+The `fullsend admin install` CLI may report "granted roles/aiplatform.user"
+even when the conditional `projectIamAdmin` role silently blocks the grant.
+Always verify with `gcloud projects get-iam-policy` after install. IAM
+propagation can take up to 7 minutes.
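The post-install verification can be scripted rather than read by eye. A minimal sketch, assuming the policy is exported with `--format=json`; `policy_has_binding` is a hypothetical helper, and it ignores any condition attached to a binding.

```shell
#!/usr/bin/env bash
# Sketch: check whether an IAM policy (JSON on stdin) binds ROLE to MEMBER.
# policy_has_binding is a hypothetical helper (an assumption for illustration).
# It exits 0 if the binding exists and 1 otherwise; conditions are not inspected.

policy_has_binding() {
  python3 -c '
import json, sys
role, member = sys.argv[1], sys.argv[2]
policy = json.load(sys.stdin)
found = any(
    b.get("role") == role and member in b.get("members", [])
    for b in policy.get("bindings", [])
)
sys.exit(0 if found else 1)
' "$1" "$2"
}
```

Usage would look like `gcloud projects get-iam-policy rhdh-sidekick-167988 --format=json | policy_has_binding roles/aiplatform.user "<member>"`, repeated after a short wait if the grant has not propagated yet.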
+
+### "audience mismatch" at Setup GCP step
+
+The WIF provider was created with only one allowed audience. Update it to
+include both:
+
+```bash
+gcloud iam workload-identity-pools providers update-oidc <provider-name> \
+  --location=global \
+  --workload-identity-pool=fullsend-pool \
+  --project=rhdh-sidekick-167988 \
+  --allowed-audiences="fullsend-mint,https://iam.googleapis.com/<wif-provider-path>"
+```
+
+### Monitoring Vertex AI usage
+
+Via GCP Console: Vertex AI → Model Garden → Usage page. Filter by service
+account for per-SA token consumption.
+
+Via CLI:
+
+```bash
+gcloud logging read \
+  'resource.type="aiplatform.googleapis.com/Endpoint" AND
+   protoPayload.authenticationInfo.principalEmail="fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com"' \
+  --project=rhdh-sidekick-167988 \
+  --limit=10 \
+  --format="table(timestamp, protoPayload.request.model, protoPayload.response.usageMetadata)"
+```
@@ -0,0 +1,83 @@
+# Known Issues
+
+Active friction points, workarounds, and upstream tracking for the RHDH
+fullsend setup. Last updated: 2026-06-09.
+
+## Sandbox and tooling
+
+| Issue | Impact | Workaround | Status |
+|-------|--------|------------|--------|
+| DNS broken inside sandboxes | `yarn install`, `pip install`, `git clone` fail with `getaddrinfo EAI_AGAIN` | Explicit `httpProxy`/`httpsProxy` in `.yarnrc.yml` pointing to the L7 proxy | By design — see [Sandbox Networking](sandbox-networking.md) |
+| `yarn install` takes 10-15 min in sandbox | Monorepo overhead for large workspaces | Custom image with yarn pre-installed eliminates bootstrap; consider pre-installing deps | Open |
+| Git hooks (husky) need yarn in PATH | Hooks run in subprocesses without the agent's PATH | Custom image with `/usr/local/bin/yarn` wrapper — see [rhdh-fullsend-code image](../README.md) | Solved |
+| Sandbox creation timeout (60s) for large images | Code agent uses `fullsend-code:latest` (larger than triage sandbox) | Upstream fix exists (pre-pull + retry + 120s timeout) but not in `@v0` tag. Set `FULLSEND_SANDBOX_READY_TIMEOUT=180` as env var. | Fixed upstream, pending `@v0` release |
+| `/etc/resolv.conf` points to unreachable nameserver | Tools time out instead of failing fast | None — consider filing OpenShell issue | Open |
+
+## Agent behavior
+
+| Issue | Impact | Workaround | Status |
+|-------|--------|------------|--------|
+| Triage doesn't auto-trigger on `issues/opened` | Must use `/fs-triage` slash command | Post `/fs-triage` as issue comment | By design — dispatcher only handles `issues/labeled` |
+| Coder doesn't auto-trigger from triage | Triage labels `triaged`, not `ready-to-code` | Post `/fs-code` manually after triage | By design |
+| Fix only triggers from bot reviews | Human `changes_requested` reviews don't trigger fix agent | Post `/fs-fix` manually | By design |
+| Retro dropped by concurrency group collision | Retro job gets cancelled by other dispatch jobs | Post `/fs-retro` manually in a quiet window | Open |
+| Custom agent stages not supported in per-repo mode | Cannot register custom `/fs-*` slash commands | Extend existing agents with custom skills instead of building standalone agents | Architectural limitation |
+
+## Monorepo-specific
+
+| Issue | Impact | Workaround | Status |
+|-------|--------|------------|--------|
+| No workspace awareness | Agent sees full repo context, not just the workspace a PR touches | `paths` filter on `pull_request_target` for workspace-level triggering | Partial — shim-level only |
+| Routing skill: label priority | Agent guesses workspace from title/body instead of trusting `workspace/*` label | Improve routing skill to prioritize existing labels | Open |
+| Routing skill not in triage harness | Triage has no workspace awareness — can misroute issues | Add routing skill to triage harness | Open |
+| `workspace/*` labels not automated | Must manually create labels when adding workspaces | Automate label creation when a new workspace is added | Open |
+
+## Observability
+
+| Issue | Impact | Workaround | Status |
+|-------|--------|------------|--------|
+| Agent transcript not visible inline in GHA logs | Must download artifact separately | `gh run download <run-id> --name fullsend-code` | Open |
+| No summary in GHA step output | Hard to see what the agent did at a glance | Consider post-script step extracting key actions from transcript | Open |
+
+## Upstream harness drift
+
+Customized harness and policy files are **copies** of upstream (baseline
+2026-06-05). When upstream changes, our copies need manual sync:
+
+| File | Repo |
+|------|------|
+| `harness/code.yaml` | rhdh-plugins |
+| `harness/fix.yaml` | rhdh-plugins |
+| `policies/code.yaml` | rhdh-plugins |
+| `policies/fix.yaml` | rhdh-plugins |
+| `agents/code.md` | rhdh-plugins |
+
+## Upstream feature requests
+
+| Issue | Description | Status |
+|-------|-------------|--------|
+| [fullsend#1937](https://github.com/fullsend-ai/fullsend/issues/1937) | Native `working_dir` field in harness schema | Filed |
+| `repo.yarnpkg.com` missing from upstream policies | Any JS monorepo using corepack + yarn hits this | Not yet filed |
+| `sandbox_init_script` in harness schema | Pre-agent env setup without relying on `.env.d` or skills | Not yet filed |
+| [OpenShell#1107](https://github.com/NVIDIA/OpenShell/issues/1107) | `/etc/hosts` injection for policy-allowed hostnames | Open, assigned |
+
+## `@v0` tag regression risk
+
+Commit `709d8af0` (2026-05-15) fixed per-repo retro/prioritize routing by
+removing the `retro|prioritize → fullsend` stage-to-role mapping. However,
+PR #1187 (`005ac0a1`, 2026-05-19) re-introduced the old mapping on `main`.
+The `@v0` tag predates this regression, so per-repo mode is currently safe.
+
+**Risk:** If `@v0` advances past PR #1187, per-repo retro and prioritize
+will silently break for all consumer orgs whose config lists
+`retro`/`prioritize` instead of `fullsend`.
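Whether a ref has advanced past the regressing commit can be checked with `git merge-base --is-ancestor`. This is a minimal sketch, assuming a local clone of the upstream repository in which the tag has been fetched under the name `v0`; `ref_contains_commit` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: report whether COMMIT is reachable from REF in a local clone.
# ref_contains_commit is a hypothetical helper (an assumption for illustration).
# Usage: ref_contains_commit COMMIT REF [REPO_DIR]

ref_contains_commit() {
  local commit="$1" ref="$2" repo="${3:-.}"
  # Exits 0 when COMMIT is an ancestor of (or equal to) REF, 1 when it is
  # not, and another non-zero code if either name cannot be resolved.
  git -C "$repo" merge-base --is-ancestor "$commit" "$ref"
}
```

For example, `ref_contains_commit 005ac0a1 v0 /path/to/clone && echo "v0 includes the regression"` would flag the moment the tag moves past PR #1187.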
+
+## Public repo security
+
+Fullsend's `issue_comment` trigger routes to agents without checking
+`author_association`. Any external user posting `/fs-review` on a public
+repo's PR triggers Vertex AI inference on the repo owner's GCP project.
+
+**Mitigation:** Add an `author_association` check to the dispatch job in
+the shim workflow. Applied in rhdh-plugins and rhdh-plugin-export-overlays.
+See [Repo Onboarding — Method 2](repo-onboarding.md#method-2-manual-install-customized-shim).
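The shape of such a check can be sketched as a small gate on the comment's `author_association` value. This is an illustration only, not the check applied in the shim: `is_trusted_author` is a hypothetical helper, and the choice to trust `OWNER`, `MEMBER`, and `COLLABORATOR` is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: gate agent dispatch on the commenter's author_association.
# is_trusted_author is a hypothetical helper, and the trusted set below is
# an assumption; adjust it to the repo's policy.

is_trusted_author() {
  case "$1" in
    OWNER|MEMBER|COLLABORATOR) return 0 ;;
    # CONTRIBUTOR, FIRST_TIME_CONTRIBUTOR, NONE, and anything else are rejected.
    *) return 1 ;;
  esac
}
```

In a workflow step the value would typically come from `github.event.comment.author_association`, passed in through an environment variable, with the dispatch skipped when the function returns non-zero.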
@@ -100,23 +100,9 @@ service account key for the `rhdh-sidekick-167988` project with the
 **If your team lead provides the key file:** save it to
 `~/.config/fullsend/fullsend-local-credentials.json` and `chmod 600` it.
 
-**If you need to create the SA yourself:**
-
-```bash
-gcloud iam service-accounts create fullsend-local \
-  --display-name="Fullsend local agent runner" \
-  --project=rhdh-sidekick-167988
-
-gcloud projects add-iam-policy-binding rhdh-sidekick-167988 \
-  --member="serviceAccount:fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com" \
-  --role="roles/aiplatform.user" \
-  --condition=None
-
-gcloud iam service-accounts keys create ~/.config/fullsend/fullsend-local-credentials.json \
-  --iam-account=fullsend-local@rhdh-sidekick-167988.iam.gserviceaccount.com
-
-chmod 600 ~/.config/fullsend/fullsend-local-credentials.json
-```
+**If you need to create the SA yourself:** see
+[GCP Infrastructure — Service Accounts](gcp-infrastructure.md#service-accounts)
+for the full `gcloud` commands (create SA, grant role, generate key, rotate).
 
 ## Step 4: Create env files