44 changes: 38 additions & 6 deletions docs/04-For Operators/05-monitoring.md
@@ -13,7 +13,7 @@ sidebar_position: 5
## Logging

Logs are being collected by
-[Promtail](https://grafana.com/docs/loki/latest/send-data/promtail/) and pushed
+[Grafana Alloy](https://grafana.com/docs/alloy/latest/) and pushed
to a [Loki](https://grafana.com/docs/loki/latest/) instance running in the
control plane. Loki is deployed in
[monolithic mode](https://grafana.com/docs/loki/latest/setup/install/helm/install-monolithic/)
@@ -22,11 +22,43 @@ configuration parameters for the control plane in the control plane's
[logging](https://github.com/metal-stack/metal-roles/blob/master/control-plane/roles/logging/README.md)
role.

-In the partitions, Promtail is deployed inside a systemd-managed Docker
-container. Configuration parameters can be found in the partition's
-[promtail](https://github.com/metal-stack/metal-roles/blob/master/partition/roles/promtail/README.md)
-role. Which hosts Promtail collects from can be configured via the
-`prometheus_promtail_targets` variable.
+In the partitions, Alloy can be deployed inside a systemd-managed Docker
+container on management servers and switches. Configuration parameters can be found in the partition's
+[alloy](https://github.com/metal-stack/metal-roles/blob/master/partition/roles/alloy/README.md)
+role.

### Partition Log Sources

Alloy is configured through snippets that define which logs are collected. The following snippets are typically used:

| Host type | Snippet | Description | Key labels |
| ---------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------- |
| Leaves, spines, exits | `journal` | Collects logs from the systemd journal; auto-discovers both volatile (`/run/log/journal`) and persistent (`/var/log/journal`) storage | `job=systemd-journal`, `unit`, `level` |
| Management servers     | `journal-file` | Collects logs from the persistent systemd journal at a configurable path; supports migrating the cursor position from Promtail        | `job=systemd-journal`, `unit`, `level` |
| Hosts without journald | `syslog` | Tails `/var/log/syslog` | `job=syslog` |
| Hosts running Docker | `docker` | Collects logs from all Docker containers via the Docker socket | `job=docker`, `container` |

All log entries carry the `host` and `partition` labels regardless of snippet, which makes it easy to filter logs in Grafana Explore by host or partition.
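
The label scheme in the table can be expressed as a small lookup, which may be handy when checking that streams arriving in Loki carry the labels an operator expects. The following is a minimal sketch, not part of the `alloy` role: the `missing_labels` helper and the `SNIPPETS` mapping are hypothetical names, and it assumes the "Key labels" listed in the table are present on every stream of that snippet.

```python
# Labels every entry is described as carrying, regardless of snippet.
COMMON_LABELS = {"host", "partition"}

# Per-snippet job value and extra label keys, transcribed from the table above.
# Assumption: the "Key labels" column is treated as required for this check.
SNIPPETS = {
    "journal": ("systemd-journal", {"unit", "level"}),
    "journal-file": ("systemd-journal", {"unit", "level"}),
    "syslog": ("syslog", set()),
    "docker": ("docker", {"container"}),
}


def missing_labels(snippet, labels):
    """Return the label keys a stream from `snippet` lacks, sorted by name.

    `labels` is a dict of label name to value, as returned for a Loki stream.
    A `job` label with an unexpected value is reported as missing as well.
    """
    job, extra = SNIPPETS[snippet]
    expected = COMMON_LABELS | extra | {"job"}
    missing = expected.difference(labels)
    if labels.get("job") not in (None, job):
        missing.add("job")
    return sorted(missing)
```

For example, a Docker stream that carries only `job`, `host`, and `partition` would be reported as lacking `container`.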

### Querying Logs in Grafana

Logs can be explored in Grafana using the **Explore** view with the Loki data source. Useful label filters:

- `{partition="<partition-id>"}` — all logs from a partition
- `{host="<hostname>"}` — all logs from a specific host
- `{job="docker", container="<name>"}` — logs from a specific Docker container
- `{job="systemd-journal", unit="<unit>.service"}` — logs from a specific systemd unit
- `{job="systemd-journal", level="error"}` — error-level journal entries across all units
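
The same label filters can also be used outside of Grafana, for instance against Loki's HTTP API. The sketch below builds a LogQL stream selector from label pairs and the URL for Loki's `/loki/api/v1/query_range` endpoint. The helper names `selector` and `query_range_url` are hypothetical, and the base URL in the usage note is a placeholder, since how Loki is exposed depends on the deployment.

```python
from urllib.parse import urlencode


def selector(**labels):
    """Build a LogQL stream selector such as {job="docker", container="x"}."""
    parts = []
    for key, value in labels.items():
        # Escape backslashes and double quotes inside the label value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'{key}="{escaped}"')
    return "{" + ", ".join(parts) + "}"


def query_range_url(base_url, query, limit=100):
    """Return the query_range URL for `query` against the Loki at `base_url`."""
    params = urlencode({"query": query, "limit": limit})
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"
```

Calling `query_range_url("http://loki.example:3100", selector(job="systemd-journal", level="error"))` yields a URL that can be passed to any HTTP client. Time range parameters such as `start` and `end` could be added to the parameter dictionary in the same way.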

:::note Migrating from Promtail

The `promtail` role is deprecated and has been replaced by the `alloy` role. Refer to the
[Migration from promtail](https://github.com/metal-stack/metal-roles/blob/master/partition/roles/alloy/README.md#migration-from-promtail)
section of the partition alloy role's README and the
[Migration from promtail](https://github.com/metal-stack/metal-roles/blob/master/control-plane/roles/logging/README.md#migration-from-promtail)
section of the control-plane logging role's README for step-by-step instructions.

:::

## Monitoring

2 changes: 1 addition & 1 deletion docs/04-For Operators/monitoring-stack.svg