Commit 43932be

add node manage to load balance (#1074)
1 parent 146d413 commit 43932be

8 files changed

+1118
-32
lines changed

src/UserGuide/Master/Table/User-Manual/Load-Balance.md

Lines changed: 138 additions & 4 deletions
@@ -225,13 +225,147 @@ After the migration is complete, the Region data in the system will be redistrib
225225
![](/img/cluster-extention-9-en.png)
226226

227227

228+
## 2. Node Management
229+
Node management covers adding and removing ConfigNodes and DataNodes in a cluster. It is a basic operation for maintaining cluster high availability and achieving load balancing.
228230

229-
## 2. Load Balance
231+
### 2.1 ConfigNode Maintenance
232+
ConfigNode maintenance includes two operations: adding and removing ConfigNodes. There are two common usage scenarios:
233+
234+
- **Cluster scaling**: When the cluster has only 1 ConfigNode and you want to make the ConfigNodes highly available, you can add 2 more ConfigNodes so that the cluster has 3 ConfigNodes.
235+
- **Cluster fault recovery**: When the machine hosting a ConfigNode fails and the ConfigNode cannot run properly, you can remove the faulty ConfigNode and add a new ConfigNode to the cluster.
236+
237+
> ❗️ Note: After completing ConfigNode maintenance, ensure the cluster has **1 or 3 normally running ConfigNodes**.
238+
> 2 ConfigNodes do not provide high availability, and more than 3 ConfigNodes will cause performance degradation.
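
As a minimal sketch of how the note above could be checked after maintenance, the helper below counts the rows whose `Status` is `Running` in the table printed by `show confignodes` and verifies that the count is 1 or 3. The function names `count_running_confignodes` and `confignode_count_ok` are hypothetical and not part of IoTDB; the table layout is assumed to match the sample output shown in this document.

```bash
# Hypothetical helper: count ConfigNodes whose Status column is "Running".
# Reads the table printed by `show confignodes` from stdin.
count_running_confignodes() {
  awk -F'|' '
    NF >= 6 {                      # only data/header rows contain "|" separators
      gsub(/ /, "", $3)            # strip padding from the Status column
      if ($3 == "Running") n++
    }
    END { print n + 0 }'
}

# Hypothetical helper: succeed only when the count is 1 or 3.
confignode_count_ok() {
  [ "$1" -eq 1 ] || [ "$1" -eq 3 ]
}
```

One possible use is to save the CLI output of `show confignodes` to a file and run `confignode_count_ok "$(count_running_confignodes < confignodes.txt)"`, where `confignodes.txt` is an assumed file name.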
239+
240+
#### 2.1.1 Adding a ConfigNode
241+
242+
**Script commands:**
243+
244+
```bash
245+
# Linux / MacOS
246+
# First switch to the IoTDB root directory
247+
sbin/start-confignode.sh
248+
249+
# Windows
250+
# First switch to the IoTDB root directory
251+
# Before V2.0.4.x
252+
sbin\start-confignode.bat
253+
254+
# V2.0.4.x and later
255+
sbin\windows\start-confignode.bat
256+
```
257+
258+
**Parameter description:**
259+
260+
| Param | Description | Required |
261+
|-------|-------------|----------|
262+
| -v | Show version information | No |
263+
| -f | Run the script in the foreground, not in the background | No |
264+
| -d | Start in daemon mode (run in the background) | No |
265+
| -p | Specify a file to store the process ID for process management | No |
266+
| -c | Specify the path of the configuration folder to load configuration files | No |
267+
| -g | Print detailed garbage collection (GC) information | No |
268+
| -H | Specify the path for Java heap dump files on JVM out-of-memory | No |
269+
| -E | Specify the path for JVM error log files | No |
270+
| -D | Define system properties in the format `key=value` | No |
271+
| -X | Directly pass `-XX` parameters to the JVM | No |
272+
| -h | Show help | No |
273+
274+
#### 2.1.2 Removing a ConfigNode
275+
First connect to the cluster via CLI and use `show confignodes` to confirm the NodeID of the ConfigNode to be removed:
276+
277+
```sql
278+
IoTDB> show confignodes
279+
+------+-------+---------------+------------+--------+
280+
|NodeID| Status|InternalAddress|InternalPort| Role|
281+
+------+-------+---------------+------------+--------+
282+
| 0|Running| 127.0.0.1| 10710| Leader|
283+
| 1|Running| 127.0.0.1| 10711|Follower|
284+
| 2|Running| 127.0.0.1| 10712|Follower|
285+
+------+-------+---------------+------------+--------+
286+
Total line number = 3
287+
It costs 0.030s
288+
```
289+
290+
Then remove the ConfigNode using the following SQL command:
291+
292+
```sql
293+
REMOVE CONFIGNODE [confignode_id];
294+
```
295+
296+
### 2.2 DataNode Maintenance
297+
There are two common scenarios for DataNode maintenance:
298+
299+
- **Cluster scaling**: Add new DataNodes to the cluster to expand cluster capacity.
300+
- **Cluster fault recovery**: When the machine hosting a DataNode fails and the DataNode cannot run properly, remove the faulty DataNode and add a new DataNode to the cluster.
301+
302+
> ❗️ Note: To ensure normal cluster operation, during and after DataNode maintenance, the number of normally running DataNodes must **not be less than either the data replication factor (usually 2) or the metadata replication factor (usually 3)**.
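
The constraint in the note above amounts to comparing the running DataNode count against the larger of the two replication factors. A minimal sketch, assuming you supply the three numbers yourself (the function name `datanode_count_ok` is hypothetical, not an IoTDB command):

```bash
# Hypothetical helper: succeed when enough DataNodes are running.
#   $1 = number of normally running DataNodes
#   $2 = data replication factor
#   $3 = metadata (schema) replication factor
datanode_count_ok() {
  required=$2
  if [ "$3" -gt "$required" ]; then
    required=$3                    # the larger factor is the binding one
  fi
  [ "$1" -ge "$required" ]
}
```

For example, with 3 running DataNodes and the usual factors of 2 and 3, `datanode_count_ok $((3 - 1)) 2 3` fails, which suggests adding a replacement DataNode before removing one.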
303+
304+
#### 2.2.1 Adding a DataNode
305+
306+
**Script commands:**
307+
308+
```bash
309+
# Linux / MacOS
310+
# First switch to the IoTDB root directory
311+
sbin/start-datanode.sh
312+
313+
# Windows
314+
# First switch to the IoTDB root directory
315+
# Before V2.0.4.x
316+
sbin\start-datanode.bat
317+
318+
# V2.0.4.x and later
319+
sbin\windows\start-datanode.bat
320+
```
321+
322+
**Parameter description:**
323+
324+
| Param | Description | Required |
325+
|-------|-------------|----------|
326+
| -v | Show version information | No |
327+
| -f | Run the script in the foreground, not in the background | No |
328+
| -d | Start in daemon mode (run in the background) | No |
329+
| -p | Specify a file to store the process ID for process management | No |
330+
| -c | Specify the path of the configuration folder to load configuration files | No |
331+
| -g | Print detailed garbage collection (GC) information | No |
332+
| -H | Specify the path for Java heap dump files on JVM out-of-memory | No |
333+
| -E | Specify the path for JVM error log files | No |
334+
| -D | Define system properties in the format `key=value` | No |
335+
| -X | Directly pass `-XX` parameters to the JVM | No |
336+
| -h | Show help | No |
337+
338+
**Note:** After adding a DataNode, as new writes arrive (and old data expires if TTL is set), the cluster load will gradually balance toward the new DataNode, eventually achieving balanced storage and computing resources across all nodes.
339+
340+
#### 2.2.2 Removing a DataNode
341+
First connect to the cluster via CLI and use `show datanodes` to confirm the NodeID of the DataNode to be removed:
342+
343+
```sql
344+
IoTDB> show datanodes
345+
+------+-------+----------+-------+-------------+---------------+
346+
|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum|
347+
+------+-------+----------+-------+-------------+---------------+
348+
| 1|Running| 0.0.0.0| 6667| 0| 0|
349+
| 2|Running| 0.0.0.0| 6668| 1| 1|
350+
| 3|Running| 0.0.0.0| 6669| 1| 0|
351+
+------+-------+----------+-------+-------------+---------------+
352+
Total line number = 3
353+
It costs 0.110s
354+
```
355+
356+
Then remove the DataNode using the following SQL command:
357+
358+
```sql
359+
REMOVE DATANODE [datanode_id];
360+
```
361+
362+
363+
## 3. Load Balance
230364

231365
Region migration is an advanced operations and maintenance feature that carries a certain operational cost. It is recommended to read the entire document before using it. If you have any questions about the solution design, please contact the IoTDB team for technical support.
232366

233367

234-
### 2.1 Feature introduction
368+
### 3.1 Feature introduction
235369

236370
IoTDB is a distributed database, and a balanced distribution of data is important for balancing disk usage and write pressure across the cluster. A Region is the basic unit of distributed data storage in an IoTDB cluster; for the concept, see [region](../Background-knowledge/Cluster-Concept.md).
237371

@@ -242,14 +376,14 @@ Here is a schematic diagram of the region migration process :
242376

243377
![](/img/region%E8%BF%81%E7%A7%BB%E7%A4%BA%E6%84%8F%E5%9B%BE20241210.png)
244378

245-
### 2.2 Notes
379+
### 3.2 Notes
246380

247381
1. It is recommended to only use the Region Migration feature on IoTDB 1.3.3 and higher versions.
248382
2. Region migration is only supported when the consensus protocol is IoTConsensus or Ratis (set by `schema_region_consensus_protocol_class` and `data_region_consensus_protocol_class` in `iotdb-system.properties`).
249383
3. Region migration consumes system resources such as disk space and network bandwidth. It is recommended to perform the migration during periods of low business load.
250384
4. Under ideal circumstances, Region migration does not affect user-side read or write operations. In special cases, Region migration may block writes. For detailed identification and handling of such situations, please refer to the user guide.
251385

252-
### 2.3 Instructions for use
386+
### 3.3 Instructions for use
253387

254388
- **Syntax definition**:
255389

src/UserGuide/Master/Tree/User-Manual/Load-Balance.md

Lines changed: 136 additions & 4 deletions
@@ -224,14 +224,146 @@ After the migration is complete, the Region data in the system will be redistrib
224224

225225
![](/img/cluster-extention-9-en.png)
226226

227+
## 2. Node Management
228+
Node management covers adding and removing ConfigNodes and DataNodes in a cluster. It is a basic operation for maintaining cluster high availability and achieving load balancing.
227229

230+
### 2.1 ConfigNode Maintenance
231+
ConfigNode maintenance includes two operations: adding and removing ConfigNodes. There are two common usage scenarios:
228232

229-
## 2. Load Balance
233+
- **Cluster scaling**: When the cluster has only 1 ConfigNode and you want to make the ConfigNodes highly available, you can add 2 more ConfigNodes so that the cluster has 3 ConfigNodes.
234+
- **Cluster fault recovery**: When the machine hosting a ConfigNode fails and the ConfigNode cannot run properly, you can remove the faulty ConfigNode and add a new ConfigNode to the cluster.
235+
236+
> ❗️ Note: After completing ConfigNode maintenance, ensure the cluster has **1 or 3 normally running ConfigNodes**.
237+
> 2 ConfigNodes do not provide high availability, and more than 3 ConfigNodes will cause performance degradation.
238+
239+
#### 2.1.1 Adding a ConfigNode
240+
241+
**Script commands:**
242+
243+
```bash
244+
# Linux / MacOS
245+
# First switch to the IoTDB root directory
246+
sbin/start-confignode.sh
247+
248+
# Windows
249+
# First switch to the IoTDB root directory
250+
# Before V2.0.4.x
251+
sbin\start-confignode.bat
252+
253+
# V2.0.4.x and later
254+
sbin\windows\start-confignode.bat
255+
```
256+
257+
**Parameter description:**
258+
259+
| Param | Description | Required |
260+
|-------|-------------|----------|
261+
| -v | Show version information | No |
262+
| -f | Run the script in the foreground, not in the background | No |
263+
| -d | Start in daemon mode (run in the background) | No |
264+
| -p | Specify a file to store the process ID for process management | No |
265+
| -c | Specify the path of the configuration folder to load configuration files | No |
266+
| -g | Print detailed garbage collection (GC) information | No |
267+
| -H | Specify the path for Java heap dump files on JVM out-of-memory | No |
268+
| -E | Specify the path for JVM error log files | No |
269+
| -D | Define system properties in the format `key=value` | No |
270+
| -X | Directly pass `-XX` parameters to the JVM | No |
271+
| -h | Show help | No |
272+
273+
#### 2.1.2 Removing a ConfigNode
274+
First connect to the cluster via CLI and use `show confignodes` to confirm the NodeID of the ConfigNode to be removed:
275+
276+
```sql
277+
IoTDB> show confignodes
278+
+------+-------+---------------+------------+--------+
279+
|NodeID| Status|InternalAddress|InternalPort| Role|
280+
+------+-------+---------------+------------+--------+
281+
| 0|Running| 127.0.0.1| 10710| Leader|
282+
| 1|Running| 127.0.0.1| 10711|Follower|
283+
| 2|Running| 127.0.0.1| 10712|Follower|
284+
+------+-------+---------------+------------+--------+
285+
Total line number = 3
286+
It costs 0.030s
287+
```
288+
289+
Then remove the ConfigNode using the following SQL command:
290+
291+
```sql
292+
REMOVE CONFIGNODE [confignode_id];
293+
```
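
To make the two steps above concrete, the sketch below looks up a ConfigNode by its `InternalAddress` and `InternalPort` in the `show confignodes` table and prints the matching `REMOVE CONFIGNODE` statement. The function name `remove_confignode_sql` is hypothetical, and the column order is assumed to match the sample output above.

```bash
# Hypothetical helper: print "REMOVE CONFIGNODE <id>;" for the ConfigNode
# whose InternalAddress ($1) and InternalPort ($2) match.
# Reads the `show confignodes` table from stdin; returns 1 if no row matches.
remove_confignode_sql() {
  awk -F'|' -v addr="$1" -v port="$2" '
    NF >= 6 {
      for (i = 2; i <= 5; i++) gsub(/ /, "", $i)   # strip column padding
      if ($4 == addr && $5 == port) {
        print "REMOVE CONFIGNODE " $2 ";"
        found = 1
      }
    }
    END { exit (found ? 0 : 1) }'
}
```

With the sample table above on stdin, `remove_confignode_sql 127.0.0.1 10711` prints `REMOVE CONFIGNODE 1;`. Review the generated statement before running it in the CLI.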
294+
295+
### 2.2 DataNode Maintenance
296+
There are two common scenarios for DataNode maintenance:
297+
298+
- **Cluster scaling**: Add new DataNodes to the cluster to expand cluster capacity.
299+
- **Cluster fault recovery**: When the machine hosting a DataNode fails and the DataNode cannot run properly, remove the faulty DataNode and add a new DataNode to the cluster.
300+
301+
> ❗️ Note: To ensure normal cluster operation, during and after DataNode maintenance, the number of normally running DataNodes must **not be less than either the data replication factor (usually 2) or the metadata replication factor (usually 3)**.
302+
303+
#### 2.2.1 Adding a DataNode
304+
305+
**Script commands:**
306+
307+
```bash
308+
# Linux / MacOS
309+
# First switch to the IoTDB root directory
310+
sbin/start-datanode.sh
311+
312+
# Windows
313+
# First switch to the IoTDB root directory
314+
# Before V2.0.4.x
315+
sbin\start-datanode.bat
316+
317+
# V2.0.4.x and later
318+
sbin\windows\start-datanode.bat
319+
```
320+
321+
**Parameter description:**
322+
323+
| Param | Description | Required |
324+
|-------|-------------|----------|
325+
| -v | Show version information | No |
326+
| -f | Run the script in the foreground, not in the background | No |
327+
| -d | Start in daemon mode (run in the background) | No |
328+
| -p | Specify a file to store the process ID for process management | No |
329+
| -c | Specify the path of the configuration folder to load configuration files | No |
330+
| -g | Print detailed garbage collection (GC) information | No |
331+
| -H | Specify the path for Java heap dump files on JVM out-of-memory | No |
332+
| -E | Specify the path for JVM error log files | No |
333+
| -D | Define system properties in the format `key=value` | No |
334+
| -X | Directly pass `-XX` parameters to the JVM | No |
335+
| -h | Show help | No |
336+
337+
**Note:** After adding a DataNode, as new writes arrive (and old data expires if TTL is set), the cluster load will gradually balance toward the new DataNode, eventually achieving balanced storage and computing resources across all nodes.
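
One rough way to watch the gradual rebalancing described in the note above is to track the gap between the largest and smallest `DataRegionNum` among running DataNodes. The sketch below is an illustration only: the function name `data_region_spread` is hypothetical, it reads the table printed by `show datanodes` from stdin, and region counts are a coarse proxy for actual load.

```bash
# Hypothetical helper: print (max - min) of DataRegionNum across DataNodes
# whose Status is "Running". Returns 1 if no running DataNode is found.
data_region_spread() {
  awk -F'|' '
    NF >= 8 {
      for (i = 2; i <= 7; i++) gsub(/ /, "", $i)   # strip column padding
      if ($3 != "Running") next                    # skips the header row too
      n = $6 + 0                                   # DataRegionNum column
      if (!seen || n < min) min = n
      if (!seen || n > max) max = n
      seen = 1
    }
    END { if (!seen) exit 1; print max - min }'
}
```

A spread that shrinks over time would be consistent with load moving toward the new DataNode.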
338+
339+
#### 2.2.2 Removing a DataNode
340+
First connect to the cluster via CLI and use `show datanodes` to confirm the NodeID of the DataNode to be removed:
341+
342+
```sql
343+
IoTDB> show datanodes
344+
+------+-------+----------+-------+-------------+---------------+
345+
|NodeID| Status|RpcAddress|RpcPort|DataRegionNum|SchemaRegionNum|
346+
+------+-------+----------+-------+-------------+---------------+
347+
| 1|Running| 0.0.0.0| 6667| 0| 0|
348+
| 2|Running| 0.0.0.0| 6668| 1| 1|
349+
| 3|Running| 0.0.0.0| 6669| 1| 0|
350+
+------+-------+----------+-------+-------------+---------------+
351+
Total line number = 3
352+
It costs 0.110s
353+
```
354+
355+
Then remove the DataNode using the following SQL command:
356+
357+
```sql
358+
REMOVE DATANODE [datanode_id];
359+
```
360+
361+
## 3. Load Balance
230362

231363
Region migration is an advanced operations and maintenance feature that carries a certain operational cost. It is recommended to read the entire document before using it. If you have any questions about the solution design, please contact the IoTDB team for technical support.
232364

233365

234-
### 2.1 Feature introduction
366+
### 3.1 Feature introduction
235367

236368
IoTDB is a distributed database, and a balanced distribution of data is important for balancing disk usage and write pressure across the cluster. A Region is the basic unit of distributed data storage in an IoTDB cluster; for the concept, see [region](../Background-knowledge/Cluster-Concept.md).
237369

@@ -242,14 +374,14 @@ Here is a schematic diagram of the region migration process :
242374

243375
![](/img/region%E8%BF%81%E7%A7%BB%E7%A4%BA%E6%84%8F%E5%9B%BE20241210.png)
244376

245-
### 2.2 Notes
377+
### 3.2 Notes
246378

247379
1. It is recommended to only use the Region Migration feature on IoTDB 1.3.3 and higher versions.
248380
2. Region migration is only supported when the consensus protocol is IoTConsensus or Ratis (set by `schema_region_consensus_protocol_class` and `data_region_consensus_protocol_class` in `iotdb-system.properties`).
249381
3. Region migration consumes system resources such as disk space and network bandwidth. It is recommended to perform the migration during periods of low business load.
250382
4. Under ideal circumstances, Region migration does not affect user-side read or write operations. In special cases, Region migration may block writes. For detailed identification and handling of such situations, please refer to the user guide.
251383

252-
### 2.3 Instructions for use
384+
### 3.3 Instructions for use
253385

254386
- **Syntax definition**:
255387
