|
1 | | -# Healthcheck |
| 1 | +# Healthcheck 💔 |
2 | 2 |
|
3 | | -The app provides a healthcheck endpoint at `GET /health` that you can use to check the status of the app. The endpoint returns a JSON response. The response of the healthcheck differs for Local and Cloud/Private modes. Additionally, you can receive health information via the `system:ping` webhook. |
4 | | - |
5 | | -In all modes, a response code of `200` indicates normal app operation. |
6 | | - |
7 | | -The health information includes: |
| 3 | +SMSGate provides healthcheck endpoints for monitoring the app. The health information includes: |
8 | 4 |
|
9 | 5 | * **releaseId**: A unique identifier for the app release. |
10 | 6 | * **version**: The app version. |
11 | 7 | * **status**: The overall app status. |
12 | 8 | * **pass**: The app is running normally. |
13 | 9 | * **warn**: The app is running with some issues. |
14 | 10 | * **fail**: The app is not running normally. |
15 | | -* **checks**: A list of health checks performed by the app. |
| 11 | +* **checks**: A list of health checks performed by the app; the set of checks depends on the server mode. |
16 | 12 | * **description**: A description of the health check. |
17 | 13 | * **observedUnit**: The unit of the observed value. |
18 | 14 | * **observedValue**: The value observed by the health check. |
19 | 15 | * **status**: The status of the health check. |
20 | 16 |
|
21 | | -## Local Mode |
| 17 | +!!! note "📈 Status Calculation" |
| 18 | + The overall health status is calculated as follows: |
22 | 19 |
|
23 | | -In Local mode, the healthcheck endpoint provides information about the device and the application. |
| 20 | + - **Default Status**: `pass` |
| 21 | + - **Status Levels**: |
| 22 | + - `pass` (0) - All checks passing |
| 23 | + - `warn` (1) - At least one warning |
| 24 | + - `fail` (2) - At least one failure |
| 25 | + - **Overall Status**: Determined by the highest severity level across all checks |
24 | 26 |
|
25 | | -Example response: |
| 27 | +## Server Health ☁️ |
26 | 28 |
|
27 | | -```json |
28 | | -{ |
29 | | - "checks": { |
30 | | - "messages:failed": { |
31 | | - "description": "Failed messages for last hour", |
32 | | - "observedUnit": "messages", |
33 | | - "observedValue": 0, |
34 | | - "status": "pass" |
35 | | - }, |
36 | | - "connection:status": { |
37 | | - "description": "Internet connection status", |
38 | | - "observedUnit": "boolean", |
39 | | - "observedValue": 1, |
40 | | - "status": "pass" |
41 | | - }, |
42 | | - "connection:transport": { |
43 | | - "description": "Network transport type", |
44 | | - "observedUnit": "flags", |
45 | | - "observedValue": 4, |
46 | | - "status": "pass" |
47 | | - }, |
48 | | - "connection:cellular": { |
49 | | - "description": "Cellular network type", |
50 | | - "observedUnit": "index", |
51 | | - "observedValue": 0, |
52 | | - "status": "pass" |
53 | | - }, |
54 | | - "battery:level": { |
55 | | - "description": "Battery level in percent", |
56 | | - "observedUnit": "percent", |
57 | | - "observedValue": 94, |
58 | | - "status": "pass" |
59 | | - }, |
60 | | - "battery:charging": { |
61 | | - "description": "Is the phone charging?", |
62 | | - "observedUnit": "flags", |
63 | | - "observedValue": 4, |
64 | | - "status": "pass" |
65 | | - } |
66 | | - }, |
67 | | - "releaseId": 1, |
68 | | - "status": "pass", |
69 | | - "version": "1.0.0" |
70 | | -} |
71 | | -``` |
72 | | - |
73 | | -### Available health checks |
74 | | - |
75 | | -* **messages:failed**: The number of failed messages for the last hour. `warn` when there is at least one failed message and `fail` when all messages during the last hour have failed. |
76 | | -* **connection:status**: The status of the internet connection. `fail` when the Internet connection is not available. |
77 | | -* **connection:transport**: The transport type of the network connection. When the device is connected to multiple networks, only a single value is provided: |
78 | | - * 0: None |
79 | | - * 1: Unknown |
80 | | - * 2: Cellular |
81 | | - * 4: WiFi |
82 | | - * 8: Ethernet |
83 | | -* **connection:cellular**: The cellular network type. Available only if `connection:transport` has flag `2: Cellular`, otherwise `0: None`. |
84 | | - * 0: None |
85 | | - * 1: Unknown |
86 | | - * 2: Mobile2G |
87 | | - * 3: Mobile3G |
88 | | - * 4: Mobile4G |
89 | | - * 5: Mobile5G |
90 | | -* **battery:level**: The battery level in percent. `warn` when less than 25% and `fail` when less than 10%. |
91 | | -* **battery:charging**: The status of charging as bit flags. For example, if the device is charging via USB, the value will be `1 + 4 = 5`. |
92 | | - * 0: Not charging |
93 | | - * 1: Charging |
94 | | - * 2: AC charger connected |
95 | | - * 4: USB charger connected |
96 | | - |
97 | | -## Cloud Mode |
98 | | - |
99 | | -The health endpoint in cloud mode provides information about the server, not devices. |
100 | | - |
101 | | -If you need to receive device health information in Cloud or Private modes, you can use the `system:ping` webhook. |
102 | | - |
103 | | -Example response: |
104 | | - |
105 | | -```json |
106 | | -{ |
107 | | - "status": "pass", |
108 | | - "version": "v1.17.0", |
109 | | - "releaseId": 932, |
110 | | - "checks": { |
111 | | - "db:ping": { |
112 | | - "description": "Failed sequential pings count", |
113 | | - "observedValue": 0, |
114 | | - "status": "pass" |
115 | | - } |
116 | | - } |
117 | | -} |
118 | | -``` |
| 29 | +=== "📱 Local Server Mode" |
119 | 30 |
|
120 | | -The only provided health check is `db:ping`. It checks the database connectivity and counts failed sequential pings. |
| 31 | + In Local mode, the healthcheck endpoint provides information about the device and the application. |
121 | 32 |
|
122 | | -## Webhooks |
| 33 | + Example response: |
123 | 34 |
|
124 | | -In any mode, you can utilize the `system:ping` webhook to receive health information about the devices. The webhook payload will be the same as the device's healthcheck response. |
| 35 | + ```json |
| 36 | + { |
| 37 | + "checks": { |
| 38 | + "messages:failed": { |
| 39 | + "description": "Failed messages for last hour", |
| 40 | + "observedUnit": "messages", |
| 41 | + "observedValue": 0, |
| 42 | + "status": "pass" |
| 43 | + }, |
| 44 | + "connection:status": { |
| 45 | + "description": "Internet connection status", |
| 46 | + "observedUnit": "boolean", |
| 47 | + "observedValue": 1, |
| 48 | + "status": "pass" |
| 49 | + }, |
| 50 | + "connection:transport": { |
| 51 | + "description": "Network transport type", |
| 52 | + "observedUnit": "flags", |
| 53 | + "observedValue": 4, |
| 54 | + "status": "pass" |
| 55 | + }, |
| 56 | + "connection:cellular": { |
| 57 | + "description": "Cellular network type", |
| 58 | + "observedUnit": "index", |
| 59 | + "observedValue": 0, |
| 60 | + "status": "pass" |
| 61 | + }, |
| 62 | + "battery:level": { |
| 63 | + "description": "Battery level in percent", |
| 64 | + "observedUnit": "percent", |
| 65 | + "observedValue": 94, |
| 66 | + "status": "pass" |
| 67 | + }, |
| 68 | + "battery:charging": { |
| 69 | + "description": "Is the phone charging?", |
| 70 | + "observedUnit": "flags", |
| 71 | + "observedValue": 4, |
| 72 | + "status": "pass" |
| 73 | + } |
| 74 | + }, |
| 75 | + "releaseId": 1, |
| 76 | + "status": "pass", |
| 77 | + "version": "1.0.0" |
| 78 | + } |
| 79 | + ``` |
| 80 | + |
| 81 | + Available health checks: |
| 82 | + |
| 83 | + - **messages:failed**: The number of failed messages for the last hour. `warn` when there is at least one failed message and `fail` when all messages during the last hour have failed. |
| 84 | + - **connection:status**: The status of the internet connection. `fail` when the Internet connection is not available. |
| 85 | + - **connection:transport**: The transport type of the network connection. When the device is connected to multiple networks, only a single value is provided: |
| 86 | + * 0: None |
| 87 | + * 1: Unknown |
| 88 | + * 2: Cellular |
| 89 | + * 4: WiFi |
| 90 | + * 8: Ethernet |
| 91 | + - **connection:cellular**: The cellular network type. Available only if `connection:transport` has flag `2: Cellular`, otherwise `0: None`. |
| 92 | + * 0: None |
| 93 | + * 1: Unknown |
| 94 | + * 2: Mobile2G |
| 95 | + * 3: Mobile3G |
| 96 | + * 4: Mobile4G |
| 97 | + * 5: Mobile5G |
| 98 | + - **battery:level**: The battery level in percent. `warn` when less than 25% and `fail` when less than 10%. |
| 99 | + - **battery:charging**: The status of charging as bit flags. For example, if the device is charging via USB, the value will be `1 + 4 = 5`. |
| 100 | + * 0: Not charging |
| 101 | + * 1: Charging |
| 102 | + * 2: AC charger connected |
| 103 | + * 4: USB charger connected |
| 104 | + |
| 105 | +=== "⚙️ Cloud and Private Server Modes" |
| 106 | + |
| 107 | + The SMSGate server provides Kubernetes-compatible health check endpoints for monitoring service health. The system implements three dedicated endpoints following Kubernetes best practices, along with a legacy endpoint for backward compatibility with existing clients. |
| 108 | + |
| 109 | + For **Kubernetes deployments**, use the following endpoints: |
| 110 | + |
| 111 | + - **🔄 Liveness Probe**: `GET /health/live` |
| 112 | + - **Purpose**: Determine if the application is running correctly |
| 113 | + - **Response Format**: |
| 114 | + ```json |
| 115 | + { |
| 116 | + "status": "pass", |
| 117 | + "version": "1.33.0", |
| 118 | + "releaseId": "1234", |
| 119 | + "checks": { |
| 120 | + "system:goroutines": { |
| 121 | + "description": "Number of goroutines", |
| 122 | + "observedUnit": "goroutines", |
| 123 | + "observedValue": 15, |
| 124 | + "status": "pass" |
| 125 | + }, |
| 126 | + "system:memory": { |
| 127 | + "description": "Memory usage", |
| 128 | + "observedUnit": "MiB", |
| 129 | + "observedValue": 45, |
| 130 | + "status": "pass" |
| 131 | + } |
| 132 | + } |
| 133 | + } |
| 134 | + ``` |
| 135 | + - **🚦 Readiness Probe**: `GET /health/ready` |
| 136 | + - **Purpose**: Determine if the application is ready to accept traffic |
| 137 | + - **Response Format**: |
| 138 | + ```json |
| 139 | + { |
| 140 | + "status": "pass", |
| 141 | + "version": "1.33.0", |
| 142 | + "releaseId": "1234", |
| 143 | + "checks": { |
| 144 | + "db:ping": { |
| 145 | + "description": "Database ping", |
| 146 | + "observedUnit": "failed pings", |
| 147 | + "observedValue": 0, |
| 148 | + "status": "pass" |
| 149 | + } |
| 150 | + } |
| 151 | + } |
| 152 | + ``` |
| 153 | + - **🚀 Startup Probe**: `GET /health/startup` |
| 154 | + - **Purpose**: Determine if the application has completed its startup sequence |
| 155 | + - **Response Format**: Same as readiness probe |
| 156 | + |
| 157 | + !!! important "Migration Note" |
| 158 | + A legacy endpoint (`GET /health`) is maintained for backward compatibility with existing clients. It uses the same logic as the readiness probe. |
| 159 | + |
| 160 | + !!! tip |
| 161 | + Use the `system:ping` webhook to monitor device health, as the server probes only monitor the application server itself. |
| 162 | + |
| 163 | +## Device Health 📱 |
| 164 | + |
| 165 | +The `system:ping` webhook provides device health information across all deployment modes. |
| 166 | + |
| 167 | +In Cloud/Private server deployments, the webhook provides device-level health information while the server probes monitor the application server itself. This separation ensures administrators can monitor both the application infrastructure and the connected devices. |
125 | 168 |
|
126 | 169 | Example payload: |
127 | 170 |
|
@@ -181,6 +224,6 @@ Example payload: |
181 | 224 |
|
182 | 225 | The webhook allows you to monitor the health of your devices in real-time, providing valuable information about message delivery, connectivity, and battery status. |
183 | 226 |
|
184 | | -## Links |
| 227 | +## See Also 📚 |
185 | 228 |
|
186 | 229 | * [Webhooks Guide](./webhooks.md) |
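
The "Status Calculation" note and the `battery:charging` bit flags added in this diff can be illustrated with a short client-side sketch. It is not taken from the SMSGate code base: the names `Status`, `overall_status`, `CHARGING_FLAGS`, and `decode_charging` are hypothetical, and the only assumption is that `checks` has the shape shown in the example responses (a mapping from check name to an object with a `status` field).

```python
from enum import IntEnum


class Status(IntEnum):
    """Severity levels as listed in the status-calculation note."""
    PASS = 0
    WARN = 1
    FAIL = 2


def overall_status(checks: dict) -> str:
    """Return the highest-severity status across all checks.

    Falls back to the documented default, `pass`, when there are no checks.
    """
    worst = Status.PASS
    for check in checks.values():
        worst = max(worst, Status[check["status"].upper()])
    return worst.name.lower()


# Bit flags for `battery:charging`, copied from the list in the diff.
CHARGING_FLAGS = {
    1: "Charging",
    2: "AC charger connected",
    4: "USB charger connected",
}


def decode_charging(value: int) -> list[str]:
    """Expand a `battery:charging` observedValue into readable labels."""
    if value == 0:
        return ["Not charging"]
    return [label for bit, label in CHARGING_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # Hypothetical sample data in the shape of the documented response.
    sample = {
        "battery:level": {"status": "warn"},
        "connection:status": {"status": "pass"},
    }
    print(overall_status(sample))  # warn
    print(decode_charging(5))      # ['Charging', 'USB charger connected']
```

Mapping the statuses onto ordered integers keeps the aggregation a plain `max`, which matches the "highest severity level across all checks" wording in the note.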