14 changes: 14 additions & 0 deletions docs/en/preview/kubeblocks-for-clickhouse/03-architecture.mdx
@@ -101,6 +101,20 @@ For direct pod addressing (replication traffic, ClickHouse Keeper communication)
{pod-name}.{cluster}-{shardComponentName}-headless.{namespace}.svc.cluster.local
```

## Automatic Failover

ClickHouse does not use a primary/replica role distinction at the application level — all replicas within a shard are equivalent and can serve queries. Recovery after a pod failure does not involve a role switch:

1. **A replica pod crashes** — the failed pod stops serving queries for its shard
2. **CH Keeper detects the lost connection** (`topology: cluster` only) — remaining replicas continue serving if at least one replica is healthy; Keeper tracks which data parts each replica holds
3. **KubeBlocks restarts the failed pod** — the InstanceSet controller schedules a pod restart
4. **Recovered replica reconnects to CH Keeper** — the pod re-registers and fetches data parts it missed during downtime from peer replicas automatically
5. **ClusterIP service is unchanged** — all replicas are equivalent; no endpoint update is needed; the recovered pod resumes receiving traffic once it passes its readiness check

:::note
Steps 2 and 4 require `topology: cluster` (CH Keeper deployed). In `topology: standalone`, no Keeper is present — inter-replica part synchronization is not available unless an external ZooKeeper or Keeper is configured.
:::
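
As an illustration, selecting the `cluster` topology in a Cluster manifest is what deploys CH Keeper and enables the Keeper-dependent steps above. A minimal sketch (cluster name, namespace, component names, and replica counts are placeholders; verify field and component names against the clickhouse addon examples for your chart version):

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: my-clickhouse        # placeholder
  namespace: demo            # placeholder
spec:
  clusterDef: clickhouse
  topology: cluster          # deploys CH Keeper; enables Keeper-based part sync
  componentSpecs:
    - name: ch-keeper        # illustrative name; take exact names from the addon
      replicas: 3            # odd count for Keeper quorum
    - name: clickhouse
      replicas: 2
```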

## System Accounts

KubeBlocks automatically manages the following ClickHouse system account. Passwords are auto-generated and stored in a Secret named `{cluster}-{component}-account-{name}`.
22 changes: 18 additions & 4 deletions docs/en/preview/kubeblocks-for-elasticsearch/03-architecture.mdx
@@ -41,18 +41,32 @@ Every Elasticsearch pod runs three main containers (plus three init containers o

Each pod mounts its own **PVC** for the Elasticsearch data directory (`/usr/share/elasticsearch/data`), providing independent persistent storage per node.

## Node Roles (Elasticsearch)

Elasticsearch nodes are configured with **roles** (master-eligible, data, ingest, coordinating, etc.). The table below describes each role's responsibility; the topology section that follows maps role sets onto KubeBlocks Components.

| Node Role | Responsibility |
|-----------|----------------|
| **Master-eligible** | Participates in leader election; manages cluster state, index mappings, and shard allocation |
| **Data** | Stores shard data; handles indexing and search requests for its assigned shards |
| **Ingest** | Pre-processes documents before indexing via ingest pipelines |
| **Coordinating** (optional) | Routes client requests to the appropriate data nodes and aggregates results |

In smaller deployments, one process can hold several roles. In production, splitting roles across nodes improves stability.

## Topologies and Component Names (ClusterDefinition)

In the **kubeblocks-addons** Elasticsearch chart, **`spec.topology`** selects a layout. KubeBlocks creates **one Component per entry** in that topology; component **names** are short labels (`master`, `dit`, `mdit`, …), while the **Elasticsearch role set** is defined inside the image/config for each layout.

| Topology (`spec.topology`) | Components created | Notes |
|---------------------------|-------------------|--------|
| `single-node` | `mdit` | Single-node layout |
| `multi-node` (chart default) | `master`, `dit` | Split layout: dedicated `master` Component plus `dit` Component for the remaining node group |
| `m-dit` | `master`, `dit` | Same component names as `multi-node`; chart distinguishes layouts for ordering/defaults |
| `mdit` | `mdit` | Combined multi-role naming under one component |
| `m-d-i-t` | `m`, `d`, `i`, `t` | Dedicated components per role family (master / data / ingest / coordinating) |

Service names look like `{cluster}-{component}-http` — the **`{component}`** segment is the KubeBlocks Component name above (for example `mdit`, `master`, `dit`), not the long English phrase “master-eligible”. Use your Cluster’s `status` or `kubectl get component -n <ns>` to see the exact names for a running cluster.
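
As a sketch, selecting the `multi-node` layout and sizing its two components might look like the following (cluster name and namespace are placeholders; the `componentSpecs` entries must use the component names the topology creates):

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: my-es                # placeholder; services become my-es-master-http, my-es-dit-http
  namespace: demo            # placeholder
spec:
  clusterDef: elasticsearch
  topology: multi-node       # creates components: master, dit
  componentSpecs:
    - name: master           # dedicated master-eligible nodes
      replicas: 3
    - name: dit              # data / ingest / coordinating node group
      replicas: 2
```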

## High Availability via Cluster Coordination

4 changes: 4 additions & 0 deletions docs/en/preview/kubeblocks-for-kafka/03-architecture.mdx
@@ -21,6 +21,10 @@ KubeBlocks supports three Kafka deployment topologies:

The `*_monitor` variants add a standalone `kafka-exporter` Component that scrapes Kafka-specific metrics (consumer group lag, partition offsets, topic throughput) and exposes them on port 9308 for Prometheus.

:::note
**Configuration templates and `configs`:** the Kafka `ComponentDefinition` treats main config slots (for example **`kafka-configuration-tpl`**) as **externally managed** in current addon charts. When you create a `Cluster`, you must **wire those slots** by setting **`configs`** on the matching component (or sharding template) to ConfigMaps whose keys match the template file names — typically the ConfigMaps shipped with the addon in **`kb-system`**, or your own copies in the application namespace. If provisioning fails with a message about missing templates, compare your manifest to the **Kafka examples** in [kubeblocks-addons](https://github.com/apecloud/kubeblocks-addons/tree/main/examples/kafka) for the same chart version.
:::
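
A minimal sketch of that `configs` wiring, assuming a broker component named `kafka-broker` and a ConfigMap copied into the application namespace (both names are illustrative; take the exact slot, component, and ConfigMap names from the Kafka examples for your chart version):

```yaml
componentSpecs:
  - name: kafka-broker                    # illustrative component name
    configs:
      - name: kafka-configuration-tpl     # must match the template slot name
        configMap:
          name: kafka-configuration-tpl   # ConfigMap whose keys match the template file names
```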

---

## Combined Architecture (combined / combined_monitor)
22 changes: 22 additions & 0 deletions docs/en/preview/kubeblocks-for-milvus/03-architecture.mdx
@@ -114,10 +114,32 @@ Only the `proxy` container exposes port 19530 (gRPC) for client traffic. All com
| **Segment redundancy** | Sealed segments persist in MinIO; a restarted QueryNode reloads them from object storage without data loss |
| **MixCoord recovery** | MixCoord is stateless against etcd — it reloads all coordinator state from etcd on restart |

### MixCoord Recovery

MixCoord is the only single-replica component in the Distributed topology. When it crashes, KubeBlocks automatically recovers it without data loss:

1. **MixCoord pod crashes** — coordinator functions (DDL, segment lifecycle, query assignments, index scheduling) are temporarily unavailable
2. **KubeBlocks InstanceSet detects the pod failure** and schedules a restart
3. **New MixCoord pod starts** and reloads all coordinator state from etcd — no data is lost because MixCoord is fully stateless against etcd
4. **Worker nodes reconnect** — QueryNode, DataNode, and IndexNode pods reconnect to the restored MixCoord and resume their assigned work
5. **Cluster resumes serving requests** through the Proxy

Worker nodes (QueryNode, DataNode, IndexNode) run multiple replicas and tolerate individual pod failures without coordinator involvement — KubeBlocks restarts the failed pod, and MixCoord reassigns its work to healthy replicas.

### Traffic Routing

| Service | Type | Port | Selector |
|---------|------|------|----------|
| `{cluster}-proxy` | ClusterIP | 19530 (gRPC), 9091 (metrics/health) | proxy pods |

Client applications (Milvus SDK) connect to the proxy on port 19530 (gRPC). Port 9091 is the metrics/health endpoint — it is not a client-facing REST API. The proxy is the single entry point — it handles authentication, routing, and result aggregation across worker components.

## System Accounts

In the Milvus add-on, only the **in-cluster MinIO object-storage** Component (ComponentDefinition `milvus-minio`) declares KubeBlocks **`systemAccounts`**. Other Milvus stack components in this add-on (for example **etcd**, **milvus**, **proxy**, **mixcoord**, DataNode, QueryNode, IndexNode) do **not** define `systemAccounts` in their ComponentDefinitions. If you use an external object store instead of the bundled MinIO, this managed account does not apply to that store.

For the bundled MinIO component (typically named **`minio`** in `componentSpecs`, for example in standalone topology), KubeBlocks creates one account. Passwords are auto-generated unless overridden at the Cluster level. Credentials are stored in a Secret named **`{cluster}-minio-account-admin`** when the component name is `minio` (substitute your Cluster `metadata.name` and the MinIO component’s `name`).

| Account | Component (typical name) | Role | Purpose |
|---------|--------------------------|------|---------|
| `admin` | `minio` | Object store admin | MinIO root credentials; injected into MinIO pods as `MINIO_ACCESS_KEY` and `MINIO_SECRET_KEY` for S3-compatible access to buckets used by Milvus |
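
For example, with a Cluster named `my-milvus` whose bundled object-store component is named `minio` (both placeholders), the managed credentials land in a Secret named `my-milvus-minio-account-admin`. A fragment of such a manifest:

```yaml
metadata:
  name: my-milvus            # placeholder
spec:
  clusterDef: milvus
  topology: standalone
  componentSpecs:
    - name: minio            # credentials Secret: my-milvus-minio-account-admin
      replicas: 1
    # remaining components (milvus, etcd, ...) as required by the chosen topology
```
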
12 changes: 11 additions & 1 deletion docs/en/preview/kubeblocks-for-mongodb/03-architecture.mdx
@@ -62,7 +62,17 @@ MongoDB replica sets use **oplog-based replication** and a **majority-vote (Raft
| **Election** | When the primary fails, secondaries vote; the candidate with the most up-to-date oplog and a majority of votes wins |
| **Write concern** | `w:majority` ensures a write is durable on a quorum before acknowledging |

A 3-member replica set tolerates **1 failure**.

### Automatic Failover

1. **Primary pod crashes or becomes unreachable** — secondaries stop receiving heartbeat pings
2. **Election timeout** — after approximately 10 seconds (`electionTimeoutMillis`), one secondary calls for an election
3. **Majority vote** — the candidate with the most up-to-date oplog and a majority of votes wins and becomes the new primary
4. **KubeBlocks roleProbe detects the change** — `syncerctl getrole` returns `primary` for the new pod → `kubeblocks.io/role=primary` label is applied
5. **Service endpoints switch** — the `{cluster}-mongodb-mongodb` ClusterIP service automatically routes writes to the new primary

Failover typically completes within **10–30 seconds**.
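
The 3-member replica set discussed above can be requested with a manifest sketch like the following (cluster name and namespace are placeholders; the topology name comes from the mongodb addon chart, so verify it for your version):

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: my-mongo             # placeholder
  namespace: demo            # placeholder
spec:
  clusterDef: mongodb
  topology: replicaset       # addon topology name; verify for your chart version
  componentSpecs:
    - name: mongodb          # writes follow the kubeblocks.io/role=primary label
      replicas: 3            # tolerates 1 failure
```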

### Traffic Routing

20 changes: 19 additions & 1 deletion docs/en/preview/kubeblocks-for-mysql/03-architecture.mdx
@@ -61,6 +61,16 @@ Each pod also runs multiple init containers on startup: `init-syncer` (copies sy
| **Failover trigger** | syncer roleProbe fails repeatedly → KubeBlocks selects replica with most advanced binlog position |
| **Promotion** | KubeBlocks calls the switchover API to promote the chosen replica; remaining replicas repoint to new primary |

### Automatic Failover

1. **Primary pod crashes** — replicas stop receiving binlog events
2. **syncer roleProbe fails** — `syncerctl getrole` returns an error repeatedly; detection takes approximately 30 seconds
3. **KubeBlocks marks the primary unavailable** and selects the replica with the most advanced binlog position as the promotion candidate
4. **Chosen replica is promoted** — KubeBlocks calls the switchover lifecycle action; the replica stops replicating and takes over as primary
5. **Remaining replicas repoint** to the new primary via `CHANGE REPLICATION SOURCE TO`
6. **Pod label updated** — `kubeblocks.io/role=primary` applied to the new primary pod
7. **Service endpoints switch** — the `{cluster}-mysql` ClusterIP service automatically routes writes to the new primary

Failover typically completes within **30–60 seconds**.
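
A minimal sketch of this primary/replica pattern (placeholders throughout; the topology name comes from the mysql addon chart, so verify it for your version):

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: my-mysql             # placeholder
  namespace: demo            # placeholder
spec:
  clusterDef: mysql
  topology: semisync         # addon topology name; verify for your chart version
  componentSpecs:
    - name: mysql            # syncer roleProbe keeps the role label current
      replicas: 2            # one primary plus one replica
```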

### Traffic Routing
@@ -114,7 +124,15 @@ The roleProbe runs `/tools/syncerctl getrole` inside the `mysql` container. Each
| **syncer role update** | syncer roleProbe detects the new primary role → updates `kubeblocks.io/role` label → ClusterIP service endpoints switch |
| **Quorum tolerance** | 3-member group tolerates 1 failure; 5-member tolerates 2 |

### Automatic Failover

1. **Primary pod becomes unreachable** — group communication times out for the failed member
2. **GCS expulsion** — the remaining members detect the failure via the Group Communication System (GCS) and expel the unreachable member
3. **Group elects a new PRIMARY** — the remaining certified secondaries autonomously elect a new primary; no external coordinator is needed
4. **syncer roleProbe detects the new PRIMARY** — `syncerctl getrole` returns `primary` for the elected pod → `kubeblocks.io/role=primary` label updated
5. **Service endpoints switch** — the `{cluster}-mysql` ClusterIP service automatically routes writes to the new primary

Failover typically completes within **5–15 seconds**. Group-internal primary election is near-instant after expulsion; the subsequent label update and service endpoint switch depend on the syncer roleProbe cycle.

### Traffic Routing

36 changes: 29 additions & 7 deletions docs/en/preview/kubeblocks-for-redis/03-architecture.mdx
@@ -8,19 +8,41 @@ sidebar_label: Architecture

import RedisArchitectureDiagram from '@/components/RedisArchitectureDiagram';
import RedisClusterArchitectureDiagram from '@/components/RedisClusterArchitectureDiagram';
import RedisStandaloneArchitectureDiagram from '@/components/RedisStandaloneArchitectureDiagram';

# Redis Architecture in KubeBlocks

The Redis **ClusterDefinition** exposes three patterns. They differ by topology name and whether **Redis Sentinel** is deployed:

| Pattern | Topology name | Components | HA / failover |
|---|---|---|---|
| **Standalone** | `standalone` | `redis` only | No Sentinel; **no** automatic primary failover |
| **Sentinel (primary + replicas)** | `replication`, `replication-twemproxy` | `redis` + `redis-sentinel` | Sentinel quorum monitors the primary and drives failover |
| **Redis Cluster (sharded)** | `cluster` | Sharding (`shard`) | Gossip on the cluster bus; **no** Sentinel processes |

---

## Standalone Architecture

Use **`topology: standalone`** when you want a **single Redis Component** and do **not** provision Sentinel.

<RedisStandaloneArchitectureDiagram />

```
Cluster → Component (redis) → InstanceSet → Pod × N
```

- There is **no** `redis-sentinel` Component and **no** Sentinel quorum — the monitoring and failover flow in the [Sentinel architecture](#sentinel-architecture) section does **not** apply.
- Pod layout (Redis server + metrics sidecar, PVC per data pod) matches the **Redis data pods** description under Sentinel below.
- For HA with automatic failover on a single shard, use **`replication`** (or **`replication-twemproxy`** with Twemproxy), not `standalone`.
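
A sketch of the standalone pattern (cluster name and namespace are placeholders):

```yaml
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: my-redis             # placeholder
  namespace: demo            # placeholder
spec:
  clusterDef: redis
  topology: standalone       # single redis Component; no redis-sentinel
  componentSpecs:
    - name: redis
      replicas: 1            # no automatic primary failover in this topology
```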

---

## Sentinel Architecture

:::note
Everything in this section applies to **`replication`** and **`replication-twemproxy`** only. It does **not** apply to **`standalone`** (no Sentinel) or **`cluster`** (Redis Cluster / gossip, no Sentinel).
:::

Redis Sentinel uses a dedicated set of Sentinel processes to monitor the Redis primary, detect failures, and coordinate automatic failover. All data lives on a single primary; replicas serve as hot standbys and optional read targets.
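
For comparison with `standalone`, a `replication` sketch provisions both components (replica counts are illustrative; three Sentinel replicas give a workable quorum):

```yaml
spec:
  clusterDef: redis
  topology: replication      # redis (primary + replicas) plus redis-sentinel
  componentSpecs:
    - name: redis
      replicas: 2            # one primary, one replica
    - name: redis-sentinel
      replicas: 3            # Sentinel quorum for failure detection and failover
```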

@@ -188,7 +210,7 @@ For external access, per-pod NodePort or LoadBalancer services (`redis-advertise

## System Accounts

KubeBlocks manages the following Redis account for all topology patterns on this page (`standalone`, Sentinel, and Redis Cluster). The password is auto-generated and stored in a Secret named `{cluster}-{component}-account-default`.

| Account | Role | Purpose |
|---------|------|---------|
13 changes: 13 additions & 0 deletions docs/en/preview/kubeblocks-for-rocketmq/03-architecture.mdx
@@ -133,3 +133,16 @@ When a RocketMQ component fails:
3. **KubeBlocks pod restart** — KubeBlocks detects the failed pod and restarts it; the recovered pod resumes as master (brokerId=0) after the process starts
4. **Master re-registers** — the recovered master re-registers with all NameServer instances; topic routing is refreshed
5. **Slaves reconnect** — slaves re-establish the HA replication connection to the restored master on port 10912 and resync any missed log entries

## System Accounts

The RocketMQ add-on declares KubeBlocks **`systemAccounts`** on the **broker** and **Dashboard** ComponentDefinitions; the NameServer and other components do not declare `systemAccounts`. Passwords are generated according to each account’s policy unless overridden on the Cluster.

Secrets follow **`{cluster}-{component}-account-{accountName}`** — where `{component}` is each component’s **`name`** in the Cluster spec (for example **`dashboard`** for the Dashboard, and each **broker shard** component name such as **`broker-0`**, **`broker-1`**, …).

| Account | Component (typical) | Role | Purpose |
|---------|---------------------|------|---------|
| `rocketmq-admin` | Per broker shard (`broker-*`) | Broker admin / ACL user | Injected into broker pods as `ROCKETMQ_USER` and `ROCKETMQ_PASSWORD` for broker authentication configuration |
| `console-admin` | `dashboard` | Dashboard login | Injected into Dashboard pods as `CONSOLE_USER` and `CONSOLE_PASSWORD` for the RocketMQ Dashboard web UI |

The Dashboard also needs the broker admin identity to talk to the cluster: it reads **`rocketmq-admin`** credentials from the broker ComponentDefinition via **`credentialVarRef`** (same username/password as the broker shard’s `rocketmq-admin` account).
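
To supply a known Dashboard password instead of a generated one, a Cluster-level override is sketched below. It assumes the v1 `systemAccounts` override on `componentSpecs` and a pre-created Secret holding the credentials; check the exact field shape for your KubeBlocks version:

```yaml
componentSpecs:
  - name: dashboard
    systemAccounts:
      - name: console-admin
        secretRef:
          name: my-console-secret   # your pre-created Secret (placeholder name)
          namespace: demo           # placeholder
```
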
8 changes: 8 additions & 0 deletions docs/en/preview/kubeblocks-for-zookeeper/03-architecture.mdx
@@ -105,3 +105,11 @@ When a ZooKeeper ensemble member fails:
2. **Leader election** (if the lost member was the leader) — surviving members elect a new leader in milliseconds to seconds
3. **Write continuity** — as long as a quorum remains available, all write and read operations continue normally
4. **Pod recovery** — when the failed pod restarts, it reads its `myid` from the PVC, contacts the leader, and syncs any missed transactions before rejoining the ensemble

## System Accounts

KubeBlocks manages the following ZooKeeper system account. The password is auto-generated and stored in a Secret named `{cluster}-{component}-account-admin` (replace `{cluster}` and `{component}` with your Cluster's `metadata.name` and the ZooKeeper component name, typically `zookeeper`).

| Account | Role | Purpose |
|---------|------|---------|
| `admin` | Admin | Administrator user when ZooKeeper authentication is enabled (`ZOO_ENABLE_AUTH=yes`); credentials are injected into pods as `ZK_ADMIN_USER` and `ZK_ADMIN_PASSWORD` for authenticated client and administrative access |
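
Other workloads in the same namespace can consume those credentials straight from the Secret. A plain Kubernetes env fragment, assuming a Cluster named `my-zk` with component `zookeeper` and the usual `username`/`password` keys of KubeBlocks account Secrets (verify the key names on the actual Secret):

```yaml
env:
  - name: ZK_ADMIN_USER
    valueFrom:
      secretKeyRef:
        name: my-zk-zookeeper-account-admin   # {cluster}-{component}-account-admin
        key: username
  - name: ZK_ADMIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: my-zk-zookeeper-account-admin
        key: password
```
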