fix(deploy): rewrite manifests to AceDataCloud platform conventions; site is live #14

Merged
acedatacloud-dev merged 1 commit into main from fix/k8s-manifests-and-deploy
May 4, 2026
Conversation

@acedatacloud-dev (Member)

PRs #11–13 shipped manifests modeled on a generic K8s setup. None of those actually fit the AceDataCloud TKE cluster + nginx-router ingress + wildcard-cert convention, so when the user opened https://x402guard.acedata.cloud/ they got a "Kubernetes Ingress Controller Fake Certificate" and a 404 (the LB had no rule for the host).

This PR aligns everything with the platform's conventions and the site is now live at https://x402guard.acedata.cloud/ with a real Let's Encrypt cert.

Conventions adopted (matching Wisdom + Nexior + MCPs/* in this org)

|                   | before (PR #11) | after |
| ----------------- | --------------- | ----- |
| namespace         | x402guard | acedatacloud |
| ingress class     | ingressClassName: nginx | kubernetes.io/ingress.class: nginx-router annotation |
| TLS secret        | x402guard-tls + cert-manager annotation | tls-wildcard-acedata-cloud (already in cluster) |
| image-pull secret | missing | docker-registry |
| build tag         | __BUILD__ | ${TAG} sed-substituted by run.sh |
| service names     | api / web | x402guard-api / x402guard-web (avoid collision in shared ns) |
| storage class     | cbs (default) | cbs-ssd (cbs is Immediate-binding zone-pinned, fails) |
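Taken together, the Ingress under these conventions looks roughly like the sketch below. Names, the TLS secret, and the api port (8000, from the nginx.conf proxy_pass) follow this PR; the web service port and the exact path set shown here are illustrative, not the verbatim manifest.

```yaml
# Illustrative Ingress under the platform conventions above.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: x402guard
  namespace: acedatacloud            # shared platform namespace
  annotations:
    kubernetes.io/ingress.class: nginx-router   # annotation, NOT spec.ingressClassName
spec:
  tls:
    - hosts:
        - x402guard.acedata.cloud
      secretName: tls-wildcard-acedata-cloud    # pre-existing *.acedata.cloud cert
  rules:
    - host: x402guard.acedata.cloud
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: x402guard-api
                port:
                  number: 8000
          - path: /
            pathType: Prefix
            backend:
              service:
                name: x402guard-web
                port:
                  number: 80        # web port assumed
```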

What lands

  • Deleted: namespace.yaml, configmap.yaml (env values inlined into Deployment)
  • Updated: api.yaml, web.yaml, ingress.yaml to platform conventions
  • Updated: deploy/run.sh (sed ${TAG}, applies 4 yaml in order)
  • New: postgres.yaml — single-replica StatefulSet on cbs-ssd / 10Gi PVC. Cluster has no shared Postgres; x402guard hosts its own.
  • Updated: docker-compose.yaml service names → x402guard-api / x402guard-web so nginx upstream matches both environments
  • Updated: web/deploy/nginx.conf proxy_pass to x402guard-api:8000
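The ${TAG} substitution that deploy/run.sh performs can be sketched as follows (the registry path and file name here are made up for illustration):

```shell
# Hedged sketch of the run.sh tag substitution: manifests carry a
# literal ${TAG} placeholder, and sed rewrites it with the build
# number before the result is applied to the cluster.
TAG=42
printf 'image: registry.example.com/x402guard-api:${TAG}\n' > /tmp/api-snippet.yaml
sed "s/\${TAG}/${TAG}/g" /tmp/api-snippet.yaml
# prints: image: registry.example.com/x402guard-api:42
```

In the real script the output would be piped to `kubectl apply -f -` rather than printed.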

Live verification

```
$ curl -sS https://x402guard.acedata.cloud/health
  {"status":"ok","version":"0.1.0"}
$ curl -sS https://x402guard.acedata.cloud/.well-known/x402guard
  {"service":"x402guard","version":"0.1.0","cluster":"mainnet",
   "agent_vault_program_id":"5s9rscxc...","usdc_mint":"EPjFWdd5..."}
$ openssl s_client ... -showcerts | openssl x509 -noout -subject -issuer
  subject=CN=acedata.cloud
  issuer=Let's Encrypt E8
```

Pods (kubectl -n acedatacloud get pods -l app=x402guard):

```
x402guard-api-...      1/1 Running  (×2)
x402guard-postgres-0   1/1 Running
x402guard-web-...      1/1 Running  (×2)
```

Bugs caught while bringing the cluster live

Worth recording so the next deploy doesn't hit them again:

  • exec format error: docker compose build on macOS produces darwin/arm64 images, but the cluster is amd64. Fix: docker buildx --platform linux/amd64 for the local-deploy fallback path. The CI workflow already does this implicitly via docker/build-push-action.
  • PVC Pending on cbs — cbs is Immediate-binding zone-pinned and the cluster's picked zone had no spare capacity. cbs-ssd uses WaitForFirstConsumer.
  • cbs-ssd minimum is 10 Gi — 5 Gi requests fail with "disk size is invalid. Must in [10, 32000]".
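The first bug suggests an arch guard on the local-deploy fallback path. This is a sketch of that idea, not code from this PR; the image tag is a placeholder, and the build command is echoed rather than executed:

```shell
# Hedged sketch: always pass an explicit --platform on the local-deploy
# fallback, regardless of host arch. On macOS arm64 a plain
# `docker compose build` produces images the amd64 cluster can't exec.
HOST_ARCH=$(uname -m)
BUILD_CMD="docker buildx build --platform linux/amd64"
if [ "$HOST_ARCH" = "arm64" ] || [ "$HOST_ARCH" = "aarch64" ]; then
  echo "host is ${HOST_ARCH}; forcing linux/amd64 cross-build"
fi
echo "$BUILD_CMD -t x402guard-api:dev ."
```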

Out of scope

  • The CI workflow .github/workflows/deploy.yaml doesn't run yet (DEPLOY_TO_K8S repo var unset). This first deploy was driven from a workstation using the kubeconfig pulled via .claude/scripts/tke.py. Subsequent deploys go through CI once cluster credentials are loaded into the GHCR-secrets vault.

Full commit message: fix(deploy): rewrite manifests to AceDataCloud platform conventions; site is live

PR #11/#12/#13 shipped manifests modeled on a generic K8s setup. None of
those actually fit the AceDataCloud TKE cluster + nginx-router ingress
+ wildcard-cert convention, so when the user opened
https://x402guard.acedata.cloud/ they got a "Kubernetes Ingress
Controller Fake Certificate" + 404 (the LB had no rule for the host).

This PR aligns everything with the platform's conventions and the site
is now live at https://x402guard.acedata.cloud/ with a real Let's
Encrypt cert from the existing tls-wildcard-acedata-cloud secret.

Conventions adopted (matching Wisdom + Nexior + MCPs/* in this org):

  namespace                 acedatacloud (was: x402guard)
  ingress class             annotation kubernetes.io/ingress.class:
                            nginx-router (was: ingressClassName: nginx)
  TLS secret                tls-wildcard-acedata-cloud, already in the
                            cluster, signed *.acedata.cloud (was:
                            x402guard-tls + cert-manager annotation)
  image-pull secret         docker-registry, already in the namespace
                            (was: missing imagePullSecrets entirely)
  build tag                 ${TAG} substituted by sed in deploy/run.sh
                            (was: __BUILD__)
  service names             x402guard-api / x402guard-web — qualified
                            with project prefix to avoid colliding with
                            other tenants in acedatacloud namespace
                            (was: api / web)
  storage class             cbs-ssd (WaitForFirstConsumer, 10Gi minimum)
                            (was: cbs default — fails to bind because
                            cbs is Immediate-binding zone-pinned)
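Applied to a workload, the conventions above yield Deployment fragments shaped like this sketch. Replica count (2) and the app=x402guard labels come from the pod listing below; the image path is a placeholder, not the repo's actual registry:

```yaml
# Illustrative api Deployment fragment under the conventions above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: x402guard-api
  namespace: acedatacloud
spec:
  replicas: 2
  selector:
    matchLabels: { app: x402guard, component: api }
  template:
    metadata:
      labels: { app: x402guard, component: api }
    spec:
      imagePullSecrets:
        - name: docker-registry          # pre-existing in the namespace
      containers:
        - name: api
          image: registry.example.com/x402guard-api:${TAG}  # placeholder path; ${TAG} sed-substituted by run.sh
          ports:
            - containerPort: 8000
```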

What changes:

  deploy/production/
    namespace.yaml             DELETED (use existing acedatacloud ns)
    configmap.yaml             DELETED (env values inlined into Deployment)
    api.yaml                   namespace + names + imagePullSecrets +
                               annotation; ${TAG} placeholder
    web.yaml                   same
    ingress.yaml               nginx-router annotation;
                               tls-wildcard-acedata-cloud;
                               5 path rules (/api, /mcp, /.well-known,
                               /health, /) all on a single Ingress
    postgres.yaml              NEW — single-replica StatefulSet on cbs-ssd
                               with a 10Gi PVC. POSTGRES_PASSWORD reads
                               from the same x402guard-secrets the api
                               consumes. Cluster has no shared Postgres
                               so x402guard hosts its own.

  deploy/run.sh                Sed ${TAG} -> $BUILD_NUMBER + apply 4 yaml
                               in order; rollout wait + /health probe.
                               Bails clearly if the secret is missing.

  docker-compose.yaml          Service names renamed
                               api -> x402guard-api / web -> x402guard-web
                               so the nginx upstream `x402guard-api`
                               works in both docker-compose and K8s
                               without separate configs.

  web/deploy/nginx.conf        proxy_pass updated to http://x402guard-api:8000
                               in all 4 locations.
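A minimal sketch of what postgres.yaml plausibly contains, per the description above. Storage class, size, replica count, and the x402guard-secrets name come from this PR; the Postgres image tag, secret key name, and label scheme are assumptions:

```yaml
# Sketch of the single-replica Postgres StatefulSet described above.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: x402guard-postgres
  namespace: acedatacloud
spec:
  serviceName: x402guard-postgres
  replicas: 1
  selector:
    matchLabels: { app: x402guard, component: postgres }
  template:
    metadata:
      labels: { app: x402guard, component: postgres }
    spec:
      containers:
        - name: postgres
          image: postgres:16             # version assumed
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: x402guard-secrets
                  key: POSTGRES_PASSWORD # key name assumed
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ReadWriteOnce]
        storageClassName: cbs-ssd        # WaitForFirstConsumer binding
        resources:
          requests:
            storage: 10Gi                # cbs-ssd minimum
```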

Live verification (against https://x402guard.acedata.cloud/):

  $ curl -sS https://x402guard.acedata.cloud/health
    {"status":"ok","version":"0.1.0"}
  $ curl -sS https://x402guard.acedata.cloud/.well-known/x402guard
    {"service":"x402guard","version":"0.1.0","cluster":"mainnet",
     "agent_vault_program_id":"5s9rscxc...","usdc_mint":"EPjFWdd5..."}
  $ curl -sS https://x402guard.acedata.cloud/ | grep '<title>'
    <title>x402guard - Solana-native AI agent wallets</title>
  $ openssl s_client ... | openssl x509 -noout -subject -issuer
    subject=CN=acedata.cloud
    issuer=Let's Encrypt E8

  Pods (kubectl -n acedatacloud get pods -l app=x402guard):
    x402guard-api-79c7d796b7-cdlpd   1/1 Running
    x402guard-api-79c7d796b7-f9mpc   1/1 Running
    x402guard-postgres-0             1/1 Running
    x402guard-web-5869d7cd49-29772   1/1 Running
    x402guard-web-5869d7cd49-zvgcb   1/1 Running

Bugs caught while bringing the cluster live (not in this PR but worth
recording so the next deploy doesn't hit them again):

  - Initial image push was darwin/arm64 because docker compose build
    uses host arch on macOS. Cluster is amd64 -> CrashLoopBackOff with
    "exec format error". Fix: use docker buildx --platform linux/amd64.
    The CI workflow .github/workflows/deploy.yaml already does this
    via docker/build-push-action which defaults to linux/amd64, but
    the local-deploy fallback path needs the explicit platform flag.

  - cbs storage class is Immediate-binding zone-pinned and our cluster
    happened to have no spare capacity in the picked zone, so PVCs
    stayed Pending. cbs-ssd uses WaitForFirstConsumer and binds in
    the same zone the pod actually scheduled into.

  - cbs-ssd minimum disk size is 10Gi (Tencent Cloud limit). 5Gi
    requests fail with "disk size is invalid. Must in [10, 32000]".
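Put together, a claim that satisfies both storage constraints looks roughly like this (the claim name is illustrative; in practice it is generated from the StatefulSet's volumeClaimTemplates):

```yaml
# Illustrative PVC meeting both cbs-ssd constraints noted above:
# WaitForFirstConsumer binding and the 10Gi Tencent Cloud minimum.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-x402guard-postgres-0   # name pattern assumed
  namespace: acedatacloud
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: cbs-ssd
  resources:
    requests:
      storage: 10Gi   # below 10Gi fails: "disk size is invalid. Must in [10, 32000]"
```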

Out of scope:
  - The CI workflow .github/workflows/deploy.yaml doesn't run yet
    (DEPLOY_TO_K8S repo var unset). This first deploy was driven from
    a workstation using the kubeconfig pulled via .claude/scripts/tke.py.
    Subsequent deploys will go through CI once the cluster credentials
    are loaded into the GHCR-secrets vault.
acedatacloud-dev merged commit 0c415ea into main May 4, 2026
1 check passed
acedatacloud-dev deleted the fix/k8s-manifests-and-deploy branch May 4, 2026 18:43