Bad naming in prometheus alerts #274

@LDiazN

In our current setup, the Prometheus server is not aware that the monitoring host is a proxy and treats it as the scrape target itself. As a result, alerts about machines behind the proxy are reported as issues with the proxy machine, which hides which host is actually affected.

For example, we usually get Slack notifications like this:

[FIRING] monitoringproxy.dev.ooni.io:9200 is not up

The only way to find out which machine was affected is to open the Prometheus dashboard and look for the target that is down. If we don't catch it in time, we lose the information about which machine was broken.

We should investigate how to add the EC2 host details to the alert
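
One possible direction, as a rough sketch only: if the hosts behind the proxy were discovered with Prometheus' EC2 service discovery, relabeling could copy the EC2 metadata into target labels before the scrape address is pointed at the proxy. The job name, the region, and the assumption that we use `ec2_sd_configs` at all are hypothetical here, and how the proxy routes a scrape to a specific host is not covered.

```yaml
scrape_configs:
  - job_name: ec2-behind-proxy          # hypothetical job name
    ec2_sd_configs:
      - region: eu-central-1            # assumed region, adjust to the real one
    relabel_configs:
      # Set `instance` explicitly from the EC2 "Name" tag so it no longer
      # defaults to the proxy address.
      - source_labels: [__meta_ec2_tag_Name]
        target_label: instance
      # Keep the EC2 instance ID as an extra label for alerts and dashboards.
      - source_labels: [__meta_ec2_instance_id]
        target_label: ec2_instance_id
      # Send the actual scrape through the proxy.
      - target_label: __address__
        replacement: monitoringproxy.dev.ooni.io:9200
```

Since Prometheus only falls back to `__address__` for `instance` when the label is still unset after relabeling, an alert that prints `{{ $labels.instance }}` would then show the EC2 name instead of the proxy address. Whether our alert template uses that label still needs to be checked.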
