Commit ac1a0b8

docs: document OTLPMetricsWriter feature
1 parent 88c657d commit ac1a0b8

3 files changed

Lines changed: 111 additions & 0 deletions

doc/06-distributed-monitoring.md

Lines changed: 1 addition & 0 deletions
@@ -2959,6 +2959,7 @@ By default, the following features provide advanced HA functionality:
29592959
* [Graphite](09-object-types.md#objecttype-graphitewriter)
29602960
* [InfluxDB](09-object-types.md#objecttype-influxdb2writer) (v1 and v2)
29612961
* [OpenTsdb](09-object-types.md#objecttype-opentsdbwriter)
2962+
* [OTLPMetrics](09-object-types.md#objecttype-otlpmetricswriter)
29622963
* [Perfdata](09-object-types.md#objecttype-perfdatawriter) (for PNP)
29632964

29642965
#### High-Availability with Checks <a id="distributed-monitoring-high-availability-checks"></a>

doc/09-object-types.md

Lines changed: 37 additions & 0 deletions
@@ -1865,6 +1865,43 @@ Configuration Attributes:
18651865
host_template | Dictionary | **Optional.** Specify additional tags to be included with host metrics. This requires a sub-dictionary named `tags`. Also specify a naming prefix by setting `metric`. More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags) and [OpenTSDB Metric Prefix](14-features.md#opentsdb-metric-prefix). Defaults to an `empty Dictionary`.
18661866
service_template | Dictionary | **Optional.** Specify additional tags to be included with service metrics. This requires a sub-dictionary named `tags`. Also specify a naming prefix by setting `metric`. More information can be found in [OpenTSDB custom tags](14-features.md#opentsdb-custom-tags) and [OpenTSDB Metric Prefix](14-features.md#opentsdb-metric-prefix). Defaults to an `empty Dictionary`.
18671867

### OTLPMetricsWriter <a id="objecttype-otlpmetricswriter"></a>

Emits metrics in [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/) format to an OpenTelemetry Collector
or any other backend that accepts OTLP data over HTTP. This configuration object is available as the
[otlpmetrics feature](14-features.md#otlpmetrics-writer). You can find more information about OpenTelemetry and OTLP
on the [OpenTelemetry website](https://opentelemetry.io/).

A basic example configuration that can be copied and pasted is shown below:

```
object OTLPMetricsWriter "otlp-metrics" {
  host = "127.0.0.1"
  port = 4318
  metrics_endpoint = "/v1/metrics"
  service_namespace = "icinga2-production"
}
```

Further configuration options are described in the table below.

| Name                     | Type       | Description                                                                                                                                  |
|--------------------------|------------|----------------------------------------------------------------------------------------------------------------------------------------------|
| host                     | String     | **Required.** OTLP collector host address. Defaults to `127.0.0.1`.                                                                          |
| port                     | Number     | **Required.** OTLP collector HTTP port. Defaults to `4318`.                                                                                  |
| metrics\_endpoint        | String     | **Required.** OTLP metrics endpoint path. Defaults to `/v1/metrics`.                                                                         |
| service\_namespace       | String     | **Required.** The namespace to associate with emitted metrics used in the `service.namespace` OTel resource attribute. Defaults to `icinga`. |
| basic\_auth              | Dictionary | **Optional.** Username and password for HTTP basic authentication.                                                                           |
| flush\_interval          | Duration   | **Optional.** How long to buffer data points before transferring to the OTLP collector. Defaults to `15s`.                                   |
| flush\_threshold         | Number     | **Optional.** How many bytes to buffer before forcing a transfer to the OTLP collector. Defaults to `32MiB`.                                 |
| enable\_ha               | Boolean    | **Optional.** Enable the high availability functionality. Has no effect in non-cluster setups. Defaults to `false`.                          |
| enable\_send\_thresholds | Boolean    | **Optional.** Whether to stream warning, critical, minimum & maximum as separate metrics to the OTLP collector. Defaults to `false`.         |
| disconnect\_timeout      | Duration   | **Optional.** Timeout to wait for any outstanding data to be flushed to the OTLP collector before disconnecting. Defaults to `10s`.          |
| enable\_tls              | Boolean    | **Optional.** Whether to use a TLS stream. Defaults to `false`.                                                                              |
| tls\_insecure\_noverify  | Boolean    | **Optional.** Disable TLS peer verification. Defaults to `false`.                                                                            |
| tls\_ca\_file            | String     | **Optional.** Path to CA certificate to validate the remote host.                                                                            |
| tls\_cert\_file          | String     | **Optional.** Path to the client certificate to present to the OTLP collector for mutual verification.                                       |
| tls\_key\_file           | String     | **Optional.** Path to the client certificate key.                                                                                            |
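The interplay of `flush_interval` and `flush_threshold` can be pictured with a small Python sketch. It only illustrates the semantics documented in the table (flush when the interval has elapsed or the buffer has reached the byte threshold) and is not the writer's actual implementation. The function name `should_flush` and its parameters are hypothetical.

```python
# Illustrative sketch only: models the documented defaults, not Icinga 2 internals.
FLUSH_INTERVAL_SECONDS = 15.0             # documented default of flush_interval (15s)
FLUSH_THRESHOLD_BYTES = 32 * 1024 * 1024  # documented default of flush_threshold (32MiB)


def should_flush(buffered_bytes, seconds_since_last_flush,
                 flush_interval=FLUSH_INTERVAL_SECONDS,
                 flush_threshold=FLUSH_THRESHOLD_BYTES):
    """Return True when buffered data should be transferred (hypothetical helper)."""
    if buffered_bytes <= 0:
        # Nothing is buffered, so there is nothing to send.
        return False
    # Either condition alone is enough to trigger a transfer.
    return (buffered_bytes >= flush_threshold
            or seconds_since_last_flush >= flush_interval)


print(should_flush(1024, 2.0))               # small buffer, interval not reached
print(should_flush(1024, 15.0))              # interval reached
print(should_flush(64 * 1024 * 1024, 0.5))   # byte threshold exceeded
```

Under these assumptions, a larger `flush_interval` trades latency for fewer HTTP requests, while `flush_threshold` caps how much data accumulates between transfers.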
18681905

18691906
### PerfdataWriter <a id="objecttype-perfdatawriter"></a>
18701907

doc/14-features.md

Lines changed: 73 additions & 0 deletions
@@ -73,6 +73,7 @@ best practice is to provide performance data.
7373

7474
This data is parsed by features sending metrics to time series databases (TSDB):
7575

76+
* [OpenTelemetry](14-features.md#otlpmetrics-writer)
7677
* [Graphite](14-features.md#graphite-carbon-cache-writer)
7778
* [InfluxDB](14-features.md#influxdb-writer)
7879
* [OpenTSDB](14-features.md#opentsdb-writer)
@@ -751,6 +752,78 @@ mechanism ensures that metrics are written even if the cluster fails.
751752
The recommended way of running OpenTSDB in this scenario is a dedicated server
752753
where you have OpenTSDB running.
753754

### OTLPMetrics Writer <a id="otlpmetrics-writer"></a>

The [OpenTelemetry Protocol (OTLP/HTTP)](https://opentelemetry.io/docs/specs/otlp/#otlphttp) metrics writer feature
allows Icinga 2 to send metrics to an OpenTelemetry Collector or any other backend that supports the OTLP HTTP protocol,
such as the [Prometheus OTLP](https://prometheus.io/docs/guides/opentelemetry/) receiver,
[Grafana Mimir](https://grafana.com/docs/mimir/latest/configure/configure-otel-collector/), or
[OpenSearch Data Prepper](https://docs.opensearch.org/latest/data-prepper/pipelines/configuration/sources/otlp-source/).
This integrates Icinga 2 metrics into OpenTelemetry-based observability stacks for analysis and visualization of your
monitoring data. OpenTelemetry provides a standardized way to collect, process, and export telemetry data, which makes
it easier to integrate with numerous
[monitoring and observability](https://opentelemetry.io/docs/collector/components/exporter/) tools.

!!! note

    This feature has been tested successfully with OpenTelemetry Collector, Prometheus OTLP receiver, OpenSearch Data
    Prepper, and Grafana Mimir. It should also work with any other backend that supports the OTLP HTTP protocol.

To enable this feature, run the following command:

```bash
icinga2 feature enable otlpmetrics
```

By default, the OTLPMetrics Writer expects the OpenTelemetry Collector or another OTLP HTTP receiver to listen at
`127.0.0.1` on port `4318`. Most third-party backends use their own ports, so you may need to adjust the
configuration accordingly. The `metrics_endpoint` can also vary by backend: OpenTelemetry Collector uses
`/v1/metrics` (the default), while the Prometheus OTLP receiver uses `/api/v1/otlp/v1/metrics`. Make sure to set the
correct `metrics_endpoint` in the configuration file.
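How the connection settings combine into a target URL can be sketched in Python. This is only an illustration of the documented options. The helper `otlp_metrics_url` is hypothetical, and port `9090` for the Prometheus example is an assumption about a typical Prometheus setup, not a value taken from this documentation.

```python
from urllib.parse import urlunsplit


def otlp_metrics_url(host="127.0.0.1", port=4318,
                     metrics_endpoint="/v1/metrics", enable_tls=False):
    """Combine the writer options into the URL a request would target (hypothetical helper)."""
    scheme = "https" if enable_tls else "http"
    # Tolerate an endpoint path written without the leading slash.
    path = metrics_endpoint if metrics_endpoint.startswith("/") else "/" + metrics_endpoint
    return urlunsplit((scheme, f"{host}:{port}", path, "", ""))


# Documented defaults, matching an OpenTelemetry Collector on localhost.
print(otlp_metrics_url())

# Prometheus OTLP receiver; port 9090 is assumed here.
print(otlp_metrics_url(port=9090, metrics_endpoint="/api/v1/otlp/v1/metrics"))
```

Comparing the printed URL with the receiver's documented endpoint is a quick way to spot a wrong port or path before enabling the feature.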

You can find more details about the configuration options in the
[OTLPMetricsWriter object type](09-object-types.md#objecttype-otlpmetricswriter) documentation.

The generated metric names follow the OpenTelemetry naming conventions and cannot be customized, so they are
identical across all Icinga 2 installations. The OTLPMetrics Writer currently sends the following metrics:

| Metric Name                     | Description                           |
|---------------------------------|---------------------------------------|
| state_check.perfdata            | Performance data metrics from checks. |
| state_check.thresholds.warning  | Warning threshold values for checks.  |
| state_check.thresholds.critical | Critical threshold values for checks. |
| state_check.thresholds.min      | Minimum threshold values for checks.  |
| state_check.thresholds.max      | Maximum threshold values for checks.  |

By default, the writer does not send any data points for the `state_check.thresholds.*` metrics. To enable
threshold metrics, set the `enable_send_thresholds` option to `true` in the OTLPMetrics Writer configuration. Once
enabled, the writer sends the threshold values for each performance data metric whenever they are available in the
produced check results.
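The effect of `enable_send_thresholds` can be sketched as follows. This Python snippet is a simplified model of the behaviour described above, not the writer's code. The dictionary keys `value`, `warn`, `crit`, `min`, and `max` are hypothetical names for the parts of one perfdata entry.

```python
# Hypothetical mapping from perfdata fields to the documented threshold metric names.
THRESHOLD_METRICS = (
    ("warn", "state_check.thresholds.warning"),
    ("crit", "state_check.thresholds.critical"),
    ("min", "state_check.thresholds.min"),
    ("max", "state_check.thresholds.max"),
)


def emitted_metrics(perfdata, enable_send_thresholds=False):
    """Return metric name -> value for one perfdata entry (illustrative only)."""
    metrics = {"state_check.perfdata": perfdata["value"]}
    if enable_send_thresholds:
        for key, metric_name in THRESHOLD_METRICS:
            # Only thresholds that are present in the check result are sent.
            if perfdata.get(key) is not None:
                metrics[metric_name] = perfdata[key]
    return metrics


load1 = {"value": 0.5, "warn": 5.0, "crit": 10.0, "min": 0.0}  # no max reported
print(emitted_metrics(load1))
print(emitted_metrics(load1, enable_send_thresholds=True))
```

In this model a missing threshold, such as `max` above, produces no data point at all rather than a placeholder value.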

The data point type for all of the above metrics is
[`gauge`](https://opentelemetry.io/docs/specs/otel/metrics/data-model/#gauge), and the perfdata labels and their units
(if available) are mapped to OpenTelemetry data point attributes. For example, a perfdata label `load1` with a value
of `0.5` and unit `%` is sent to the `state_check.perfdata` metric stream as a data point with the value `0.5` and
the attributes `label="load1"` and `unit="%"`. Additionally, relevant context such as `icinga2.host.name`,
`icinga2.service.name`, and `icinga2.command.name` is attached as resource attributes. To inspect the complete data
format and list of attributes, let the OpenTelemetry Collector log the received metrics either to the standard output
or to a JSON file in a human-readable format.
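To make the mapping more concrete, the following Python sketch builds the `load1` example as an OTLP/JSON payload. It shows the general shape of an OTLP gauge data point and is not a capture of what the writer transmits: the writer's wire encoding and its full attribute set are not specified here. The function names, the host and service values, and the timestamp are assumptions made for illustration.

```python
import json


def string_attribute(key, value):
    """Encode one string attribute in OTLP/JSON key-value form."""
    return {"key": key, "value": {"stringValue": value}}


def perfdata_to_otlp_json(host, service, command, label, value, unit,
                          time_unix_nano, service_namespace="icinga"):
    """Build an OTLP/JSON metrics payload for one perfdata value (hypothetical helper)."""
    point_attributes = [string_attribute("label", label)]
    if unit:
        point_attributes.append(string_attribute("unit", unit))
    data_point = {
        "timeUnixNano": str(time_unix_nano),  # 64-bit integers are strings in OTLP/JSON
        "asDouble": float(value),
        "attributes": point_attributes,
    }
    resource_attributes = [
        string_attribute("service.namespace", service_namespace),
        string_attribute("icinga2.host.name", host),
        string_attribute("icinga2.service.name", service),
        string_attribute("icinga2.command.name", command),
    ]
    metric = {"name": "state_check.perfdata", "gauge": {"dataPoints": [data_point]}}
    return {"resourceMetrics": [{
        "resource": {"attributes": resource_attributes},
        "scopeMetrics": [{"metrics": [metric]}],
    }]}


# Example values (host, service, command, timestamp) are made up for this sketch.
payload = perfdata_to_otlp_json("web01", "load", "load", "load1", 0.5, "%",
                                1700000000000000000)
print(json.dumps(payload, indent=2))
```

The label and unit sit on the data point, while host, service, and command sit on the resource, so a backend can group all data points of one checkable under the same resource.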

At the moment, the OTLPMetrics Writer lets you configure only a single resource attribute,
[`service.namespace`](https://opentelemetry.io/docs/specs/semconv/registry/attributes/service/#service-namespace), via
the `service_namespace` option in the OTLPMetrics Writer config. This attribute can be used to group related metrics
together in the backend. It defaults to `icinga`. You can set it to `production`, `staging`, or any other namespace
that categorizes the Icinga 2 metrics emitted to your OpenTelemetry backend.

#### OTLPMetrics in HA Cluster Zones <a id="otlpmetrics-writer-ha-cluster"></a>

This writer supports [High Availability (HA)](06-distributed-monitoring.md#distributed-monitoring-high-availability-features)
cluster zones in Icinga 2. With `enable_ha` left at its default of `false`, enabling this feature on all of your
cluster endpoints makes each OTLPMetrics Writer send metrics independently to the configured OTLP collector. To avoid
duplicate metrics being sent from multiple cluster endpoints, set the `enable_ha` option to `true` in the OTLPMetrics
Writer config on all cluster endpoints. This ensures that only one writer in the cluster is active at any given time.
The other OTLPMetrics Writer remains in standby mode, ready to take over if the active endpoint fails or becomes
unavailable for any reason.
754827

755828
### Writing Performance Data Files <a id="writing-performance-data-files"></a>
756829
