diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/api-lb.conf b/content/cluster-installation/hosted-control-plane/tenant-network/api-lb.conf
new file mode 100644
index 00000000..b5a7e9a1
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/api-lb.conf
@@ -0,0 +1,29 @@
+global
+    log 127.0.0.1 local2
+    pidfile /var/run/haproxy.pid
+    maxconn 4000
+    daemon
+defaults
+    mode http
+    log global
+    option dontlognull
+    option http-server-close
+    option redispatch
+    retries 3
+    timeout http-request 10s
+    timeout queue 1m
+    timeout connect 10s
+    timeout client 1m
+    timeout server 1m
+    timeout http-keep-alive 10s
+    timeout check 10s
+    maxconn 3000
+
+listen api
+    bind *:6443
+    mode tcp
+    balance source
+    server ucs-blade-server-5 10.32.96.105:30918 check inter 1s
+    server ucs-blade-server-6 10.32.96.106:30918 check inter 1s
+    server ucs-blade-server-7 10.32.96.107:30918 check inter 1s
+    server ucs-blade-server-8 10.32.96.108:30918 check inter 1s
diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/index.md b/content/cluster-installation/hosted-control-plane/tenant-network/index.md
new file mode 100644
index 00000000..aa039960
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/index.md
@@ -0,0 +1,277 @@
+---
+title: Hosted Control Plane and tenant networking
+linktitle: Hosted Control Plane and tenant networking
+description: Hosted Control Plane and tenant networking
+tags: ['hcp','v4.21']
+---
+# Hosted Control Plane and tenant networking
+
+Official documentation: Not yet available
+
+Tested with:
+
+|Component|Version|
+|---|---|
+|OpenShift|v4.21.9|
+|OpenShift Virt|v4.21.0|
+
+## Overview
+
+Challenge: running a hosted cluster in a different tenant network segment or VLAN without wide-open access from the tenant segment to the management segment.
+
+Additional requirement: the hub cluster must not have arbitrary addressing or routing into the tenant network segment. The hub may only attach hosted-cluster workloads (for example, KubeVirt VMs) to that segment.
+
+![](overview.drawio){ page="Page-1" }
+
+Worker nodes are straightforward: attach them to the tenant network segment (DHCP or equivalent addressing is required).
+
+Exposing hosted control plane endpoints into the tenant segment is harder. The following components must be reachable from workers and clients in that segment:
+
+* API Server
+* OAuth
+* Konnectivity
+* Ignition
+
+Here is a summary of common publishing options for these components:
+
+|Component/Service|Exposing strategy (`servicePublishingStrategy`)|Kubernetes Service type `LoadBalancer`|Route (OpenShift router)|
+|---|---|---|---|
+|API Server|<ul><li>LoadBalancer (recommended; Kubernetes `LoadBalancer` service)</li><li>NodePort* (not for production)</li></ul>|✅|❌|
+|OAuth|<ul><li>Route (default)</li><li>NodePort* (not for production)</li></ul>|❌|✅|
+|Konnectivity|<ul><li>Route (default)</li><li>LoadBalancer (Kubernetes `LoadBalancer` service)</li><li>NodePort* (not for production)</li></ul>|✅|✅|
+|Ignition|<ul><li>Route (default)</li><li>NodePort* (not for production)</li></ul>|❌|✅|
+
+For this proof of concept, endpoints are exposed as follows:
+
+* API Server: `LoadBalancer` (fronted by external `api-lb` in the tenant segment; see below)
+* OAuth, Konnectivity, Ignition: `Route` via a **dedicated ingress controller shard** on the hub, fronted by external `ingress-shared-lb` with VIPs/DNS in the tenant segment
+
+## Exposing components via a dedicated router shard
+
+Use a dedicated OpenShift Ingress Controller shard on the **hub** so only the hosted-cluster control-plane Routes are served by that shard. Tenant clients resolve OAuth, Konnectivity, and Ignition hostnames to `ingress-shared-lb`, which forwards to the shard’s NodePorts on the management network.
+
+Place an external load balancer in front of that shard (for example F5 BIG-IP or NetScaler) that can reach the hub’s management network and present stable tenant-facing VIPs or addresses.
+
+## Proof of concept environment overview
+
+![](overview.drawio){ page="Page-2" }
+
+### Router between Mgmt and Tenant-A
+
+[VyOS](https://vyos.io/) acts as router and firewall between management and Tenant-A. Restrict **lateral** traffic between the two segments (no full mesh); allow only what you need (for example DNS to resolvers, default route or NAT for internet egress). Hosted-cluster control-plane traffic from tenant nodes should flow to the **external load balancer VIPs** in the tenant segment (not directly into arbitrary management subnets).
+
+??? example "VyOS config commands"
+
+    ```shell
+    --8<-- "content/cluster-installation/hosted-control-plane/tenant-network/vyos-router-2003.txt"
+    ```
+
+### Ingress Sharding at Hub Cluster
+
+* [2.3.4. Ingress sharding in OpenShift Container Platform](https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/ingress_and_load_balancing/configuring-ingress-cluster-traffic#nw-ingress-sharding-concept_configuring-ingress-cluster-traffic-ingress-controller)
+* [3.1.3.8.1. Example load balancer configuration for user-provisioned clusters](https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/installing_on_vmware_vsphere/user-provisioned-infrastructure)
+
+???+ example "Ingress Controller"
+
+    ```yaml
+    --8<-- "content/cluster-installation/hosted-control-plane/tenant-network/ingress-controller-shard.yaml"
+    ```
+
+```shell
+% oc get svc -n openshift-ingress router-nodeport-tenant-a
+NAME                       TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                                     AGE
+router-nodeport-tenant-a   NodePort   172.30.141.209   <none>        80:32460/TCP,443:32488/TCP,1936:32095/TCP   106s
+```
+
+The ingress shard load balancer is an RHEL 9 host running HAProxy (external load balancer `ingress-shared-lb`).
+
+* Install HAProxy: `dnf install haproxy`
+* Configure SELinux: `setsebool -P haproxy_connect_any 1`
+* Apply the example `haproxy` configuration (update ports to match your NodePort service)
+* Enable and start HAProxy: `systemctl enable --now haproxy`
+
+??? example "HAProxy config"
+
+    ```shell
+    --8<-- "content/cluster-installation/hosted-control-plane/tenant-network/ingress-shared-haproxy.conf"
+    ```
+
+Add DNS records:
+
+```bind
+konnectivity.tenant-a.coe.muc.redhat.com. IN A 192.168.203.111
+oauth.tenant-a.coe.muc.redhat.com. IN A 192.168.203.111
+ignition.tenant-a.coe.muc.redhat.com. IN A 192.168.203.111
+```
+
+### Deployment sequence (reference)
+
+Three external load balancers appear in this write-up; keep their roles distinct:
+
+| Name | Role |
+|------|------|
+| `ingress-shared-lb` | Tenant-facing VIPs for OAuth, Konnectivity, Ignition Routes on the **hub** ingress shard |
+| `api-lb` | Tenant-facing VIP for the hosted cluster **API** (`APIServer` publishing) |
+| `ingress-lb` | Tenant-facing VIP for **hosted cluster** application Routes (`*.apps…`) |
+
+Suggested order: (1) hub ingress shard + `ingress-shared-lb` + DNS for the three control-plane hostnames, (2) `api-lb` + API DNS, (3) `ingress-lb` + wildcard apps DNS, then (4) apply `HostedCluster` and `NodePool`. Adjust if your automation creates services first and you backfill DNS once NodePorts or service endpoints are known.
+
+The following two subsections describe (2) and (3); the hub shard and DNS for OAuth, Konnectivity, and Ignition are covered above.
+
+### Deploy External Load Balancer for API (`api-lb`)
+
+Use an RHEL 9 virtual machine with HAProxy.
+
+* Install HAProxy: `dnf install haproxy`
+* Configure SELinux: `setsebool -P haproxy_connect_any 1`
+* Apply the example `haproxy` configuration (update ports to match your environment)
+* Enable and start HAProxy: `systemctl enable --now haproxy`
+
+??? example "HAProxy config"
+
+    ```shell
+    --8<-- "content/cluster-installation/hosted-control-plane/tenant-network/api-lb.conf"
+    ```
+
+Add DNS record:
+
+```bind
+api.tenant-a.coe.muc.redhat.com. IN A 192.168.203.
+```
+
+### Deploy External Load Balancer for Ingress (`ingress-lb`) of hosted cluster
+
+Use an RHEL 9 virtual machine with HAProxy.
+
+* Install HAProxy: `dnf install haproxy`
+* Configure SELinux: `setsebool -P haproxy_connect_any 1`
+* Apply the example `haproxy` configuration (update ports to match your environment)
+* Enable and start HAProxy: `systemctl enable --now haproxy`
+
+??? example "HAProxy config"
+
+    ```shell
+    --8<-- "content/cluster-installation/hosted-control-plane/tenant-network/ingress-lb.conf"
+    ```
+
+Add DNS record:
+
+```bind
+*.apps.tenant-a.coe.muc.redhat.com. IN A 192.168.203.
+```
+
+### Start hosted control plane and nodepool
+
+```yaml hl_lines="11 43-66" title="HostedCluster"
+apiVersion: hypershift.openshift.io/v1beta1
+kind: HostedCluster
+metadata:
+  name: 'tenant-a'
+  namespace: 'clusters'
+  labels:
+    "cluster.open-cluster-management.io/clusterset": 'default'
+spec:
+  configuration:
+    ingress:
+      appsDomain: apps.tenant-a.coe.muc.redhat.com # (1)
+      domain: ''
+      loadBalancer:
+        platform:
+          type: ''
+  channel: fast-4.21
+  etcd:
+    managed:
+      storage:
+        persistentVolume:
+          size: 8Gi
+        type: PersistentVolume
+    managementType: Managed
+  release:
+    image: quay.io/openshift-release-dev/ocp-release:4.21.11-multi
+  pullSecret:
+    name: pullsecret-cluster-tenant-a
+  sshKey:
+    name: sshkey-cluster-tenant-a
+  networking:
+    clusterNetwork:
+      - cidr: 10.132.0.0/14
+    serviceNetwork:
+      - cidr: 172.31.0.0/16
+    networkType: OVNKubernetes
+  controllerAvailabilityPolicy: SingleReplica
+  infrastructureAvailabilityPolicy: SingleReplica
+  platform:
+    type: KubeVirt
+    kubevirt:
+      baseDomainPassthrough: false
+  infraID: 'tenant-a'
+  services:
+    - service: APIServer
+      servicePublishingStrategy:
+        type: LoadBalancer
+        loadBalancer:
+          hostname: api.tenant-a.coe.muc.redhat.com # (2)
+    - service: OAuthServer
+      servicePublishingStrategy:
+        type: Route
+        route:
+          hostname: oauth.tenant-a.coe.muc.redhat.com # (3)
+    - service: OIDC
+      servicePublishingStrategy:
+        type: Route
+    - service: Konnectivity
+      servicePublishingStrategy:
+        type: Route
+        route:
+          hostname: konnectivity.tenant-a.coe.muc.redhat.com # (4)
+    - service: Ignition
+      servicePublishingStrategy:
+        type: Route
+        route:
+          hostname: ignition.tenant-a.coe.muc.redhat.com # (5)
+```
+
+1. `appsDomain`: resolve names under `apps.tenant-a.coe.muc.redhat.com` to **`ingress-lb`** (hosted cluster ingress), not the hub shard.
+2. API server `loadBalancer.hostname`: resolve to **`api-lb`**, which forwards to the `APIServer` publishing target on the hub.
+3. OAuth `route.hostname`: resolve to **`ingress-shared-lb`** (hub dedicated shard).
+4. Konnectivity `route.hostname`: resolve to **`ingress-shared-lb`**.
+5. Ignition `route.hostname`: resolve to **`ingress-shared-lb`**.
+
+```yaml hl_lines="24-26" title="NodePool"
+apiVersion: hypershift.openshift.io/v1beta1
+kind: NodePool
+metadata:
+  name: 'tenant-a'
+  namespace: 'clusters'
+spec:
+  arch: amd64
+  clusterName: 'tenant-a'
+  replicas: 2
+  management:
+    autoRepair: false
+    upgradeType: Replace
+  platform:
+    type: KubeVirt
+    kubevirt:
+      compute:
+        cores: 2
+        memory: 8Gi
+      rootVolume:
+        type: Persistent
+        persistent:
+          size: 32Gi
+      additionalNetworks:
+        - name: default/cudn-localnet1-2003 # (1)
+      attachDefaultNetwork: false
+  release:
+    image: quay.io/openshift-release-dev/ocp-release:4.21.11-multi
+```
+
+1. Attach NodePool VMs to the tenant segment using a user-defined network (UDN) `localnet` attachment (`default/cudn-localnet1-2003` in this lab).
+
+## Open topics
+
+* Disable or constrain cloud provider integration so that Kubernetes `LoadBalancer` Service requests for the hosted cluster are not satisfied by the hub cluster cloud integration unless that is intentional.
+* WebUI bug: ACM shows `https://console-openshift-console.apps.tenant-a.apps.ocp5.stormshift.coe.muc.redhat.com/` for the console, but the URL should be `https://console-openshift-console.apps.tenant-a.coe.muc.redhat.com/`.
+* Add custom endpoint publishing strategy
+* Find a solution for the NodePort chicken-and-egg problem of the external API load balancer
diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/ingress-controller-shard.yaml b/content/cluster-installation/hosted-control-plane/tenant-network/ingress-controller-shard.yaml
new file mode 100644
index 00000000..e5c7b064
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/ingress-controller-shard.yaml
@@ -0,0 +1,17 @@
+apiVersion: operator.openshift.io/v1
+kind: IngressController
+metadata:
+  name: tenant-a
+  namespace: openshift-ingress-operator
+spec:
+  domain: tenant-a.coe.muc.redhat.com
+
+  endpointPublishingStrategy:
+    type: NodePortService
+  namespaceSelector:
+    matchExpressions:
+    - key: kubernetes.io/metadata.name
+      operator: In
+      values:
+      - ingress-test
+      - clusters-tenant-a
diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/ingress-lb.conf b/content/cluster-installation/hosted-control-plane/tenant-network/ingress-lb.conf
new file mode 100644
index 00000000..518833e1
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/ingress-lb.conf
@@ -0,0 +1,34 @@
+global
+    log 127.0.0.1 local2
+    pidfile /var/run/haproxy.pid
+    maxconn 4000
+    daemon
+defaults
+    mode http
+    log global
+    option dontlognull
+    option http-server-close
+    option redispatch
+    retries 3
+    timeout http-request 10s
+    timeout queue 1m
+    timeout connect 10s
+    timeout client 1m
+    timeout server 1m
+    timeout http-keep-alive 10s
+    timeout check 10s
+    maxconn 3000
+
+listen ingress-router-443
+    bind *:443
+    mode tcp
+    balance source
+    server tenant-a-gngj5-mfwp6 192.168.203.101:30190 check inter 1s
+    server tenant-a-gngj5-rrbmv 192.168.203.102:30190 check inter 1s
+
+listen ingress-router-80
+    bind *:80
+    mode tcp
+    balance source
+    server tenant-a-gngj5-mfwp6 192.168.203.101:30282 check inter 1s
+    server tenant-a-gngj5-rrbmv 192.168.203.102:30282 check inter 1s
diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/ingress-shared-haproxy.conf b/content/cluster-installation/hosted-control-plane/tenant-network/ingress-shared-haproxy.conf
new file mode 100644
index 00000000..e111e741
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/ingress-shared-haproxy.conf
@@ -0,0 +1,39 @@
+global
+    log 127.0.0.1 local2
+    pidfile /var/run/haproxy.pid
+    maxconn 4000
+    daemon
+defaults
+    mode http
+    log global
+    option dontlognull
+    option http-server-close
+    option redispatch
+    retries 3
+    timeout http-request 10s
+    timeout queue 1m
+    timeout connect 10s
+    timeout client 1m
+    timeout server 1m
+    timeout http-keep-alive 10s
+    timeout check 10s
+    maxconn 3000
+
+listen ingress-router-443
+    bind *:443
+    mode tcp
+    balance source
+    server ucs-blade-server-5 10.32.96.105:32488 check inter 1s
+    server ucs-blade-server-6 10.32.96.106:32488 check inter 1s
+    server ucs-blade-server-7 10.32.96.107:32488 check inter 1s
+    server ucs-blade-server-8 10.32.96.108:32488 check inter 1s
+
+listen ingress-router-80
+    bind *:80
+    mode tcp
+    balance source
+    server ucs-blade-server-5 10.32.96.105:32460 check inter 1s
+    server ucs-blade-server-6 10.32.96.106:32460 check inter 1s
+    server ucs-blade-server-7 10.32.96.107:32460 check inter 1s
+    server ucs-blade-server-8 10.32.96.108:32460 check inter 1s
+
diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/overview.drawio b/content/cluster-installation/hosted-control-plane/tenant-network/overview.drawio
new file mode 100644
index 00000000..4f97b08f
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/overview.drawio
@@ -0,0 +1,264 @@
+[draw.io diagram XML not reproduced: 264 added lines]
diff --git a/content/cluster-installation/hosted-control-plane/tenant-network/vyos-router-2003.txt b/content/cluster-installation/hosted-control-plane/tenant-network/vyos-router-2003.txt
new file mode 100644
index 00000000..5e09bc44
--- /dev/null
+++ b/content/cluster-installation/hosted-control-plane/tenant-network/vyos-router-2003.txt
@@ -0,0 +1,27 @@
+set firewall group address-group ALLOWED-IPS address '10.32.96.1'
+set firewall group address-group ALLOWED-IPS address '10.32.96.31'
+set firewall group address-group ALLOWED-IPS address '10.32.111.254'
+set firewall ipv4 forward filter rule 49 action 'accept'
+set firewall ipv4 forward filter rule 49 description 'Allow IPs'
+set firewall ipv4 forward filter rule 49 destination group address-group 'ALLOWED-IPS'
+set firewall ipv4 forward filter rule 50 action 'drop'
+set firewall ipv4 forward filter rule 50 description 'Drop entire coe lab'
+set firewall ipv4 forward filter rule 50 destination address '10.32.96.0/20'
+
+set interfaces ethernet eth0 address 'dhcp'
+set interfaces ethernet eth1 address '192.168.203.1/24'
+
+set nat source rule 100 outbound-interface name 'eth0'
+set nat source rule 100 source address '192.168.203.0/24'
+set nat source rule 100 translation address 'masquerade'
+set service dhcp-server listen-interface 'eth1'
+set service dhcp-server shared-network-name coe-2003 authoritative
+set service dhcp-server shared-network-name coe-2003 subnet 192.168.203.0/24 option default-router '192.168.203.1'
+set service dhcp-server shared-network-name coe-2003 subnet 192.168.203.0/24 option name-server '10.32.96.1'
+set service dhcp-server shared-network-name coe-2003 subnet 192.168.203.0/24 range 1 start '192.168.203.100'
+set service dhcp-server shared-network-name coe-2003 subnet 192.168.203.0/24 range 1 stop '192.168.203.200'
+set service dhcp-server shared-network-name coe-2003 subnet 192.168.203.0/24 subnet-id '1'
+set service ssh
+set system host-name 'router-2003'
+set system name-server '10.32.96.1'
+set system name-server '10.32.96.31'
diff --git a/mkdocs.yml b/mkdocs.yml
index 688dc6c3..2ffcebde 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -43,6 +43,7 @@ theme:
     - navigation.indexes
     - navigation.tracking
     - content.code.copy
+    - content.code.annotate
 
 # Extras
 extra:
@@ -57,6 +58,7 @@ extra:
 extra_javascript:
   - https://viewer.diagrams.net/js/viewer-static.min.js
   - javascripts/drawio-reload.js
+
 # Extensions
 markdown_extensions:
   - pymdownx.emoji:
@@ -99,6 +101,7 @@ plugins:
       verbose: false
   - glightbox
   - drawio:
+      viewer_js: "https://viewer.diagrams.net/js/viewer-static.min.js"
       toolbar: false # control if hovering on a diagram shows a toolbar for zooming or not (default: true)
       tooltips: false # control if tooltips will be shown (default: true)
       edit: false # control if edit button will be shown in the lightbox view (default: true)
@@ -131,6 +134,7 @@ nav:
     - Hosted Control Plane:
       - cluster-installation/hosted-control-plane/index.md
       - KubeVirt Networking: cluster-installation/hosted-control-plane/kubevirt-networking.md
+      - Tenant Network: cluster-installation/hosted-control-plane/tenant-network/index.md
     - STACKIT:
       - cluster-installation/stackit/index.md
     - Nvidia GPU:
diff --git a/requirements.txt b/requirements.txt
index 3502fd9b..56da76a8 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -8,5 +8,5 @@ git+https://github.com/fralau/mkdocs_macros_plugin.git@v1.3.7
 # Only for pre-commit checks not for mkdocs it selfe
 pre-commit==4.0.1
 mkdocs-git-authors-plugin==0.9.2
-mkdocs-drawio==1.8.2
+mkdocs-drawio==1.15.0
 mike==2.1.3
diff --git a/run-local.sh b/run-local.sh
index 25ab3bbc..83382995 100755
--- a/run-local.sh
+++ b/run-local.sh
@@ -1,4 +1,4 @@
 podman run -ti --user 0 --rm \
     -v $(pwd):/opt/app-root/src:z \
-    -p 8080:8080 quay.io/openshift-examples/builder:202601121657
+    -p 8080:8080 quay.io/openshift-examples/builder:202604300846