TheSKBroook/Automated-Telemetry-Monitor

Automated Telemetry Monitor

A smart monitor transforming telemetry data into actions and real-time alerts!

Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About this project
  2. Architecture Overview
  3. How to use
  4. Pre-requisite, Install and Deploy
  5. Demonstration
  6. License

About this project

The Automated Telemetry Monitor is a powerful solution designed to monitor and manage telemetry data efficiently from target nodes. It integrates Prometheus, Grafana, and Alertmanager for data collection, visualization, and alerting.

Using a configuration file and Ansible playbooks, the project can run counter-actions in response to specific alerts, giving immediate control to prevent system failures or performance degradation.

The project aims to...

  • Gather metrics to support further analysis
  • Provide a visual way to analyze data
  • Fire alerts based on pre-defined rules, with the option of sending them via LINE
  • Run event-driven actions against specific alerts

The current version supports Ubuntu Linux.

(back to top)

Architecture Overview

(back to top)

How to use

💁 After deployment, all Docker containers will be running. You can monitor your targets by navigating to:

  • localhost:9090 -- Prometheus
  • localhost:9093 -- Alertmanager
  • localhost:3000 -- Grafana
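
To confirm that the three services are actually answering after deployment, a small probe like the one below can help. It is a minimal sketch, not part of this repository: the function name `check_services` is hypothetical, and it assumes the default ports listed above and the standard health endpoints (`/-/healthy` for Prometheus and Alertmanager, `/api/health` for Grafana).

```python
from urllib.request import urlopen
from urllib.error import URLError

# Health endpoints for each service; ports come from the list above.
ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "alertmanager": "http://localhost:9093/-/healthy",
    "grafana": "http://localhost:3000/api/health",
}


def check_services(endpoints=ENDPOINTS, fetch=None, timeout=3):
    """Return {service: bool}, True when the endpoint answers with HTTP 200."""

    def default_fetch(url):
        # urlopen raises URLError/HTTPError on connection failures or non-2xx codes.
        with urlopen(url, timeout=timeout) as resp:
            return resp.status

    fetch = fetch or default_fetch
    results = {}
    for name, url in endpoints.items():
        try:
            results[name] = fetch(url) == 200
        except (URLError, OSError):
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in check_services().items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

The `fetch` parameter exists only so the check can be exercised without a running stack; in normal use, call `check_services()` with no arguments.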

Adding Rules

You can customize your rules by editing metrics_excel.xlsx in the update folder or by replacing it with your own Excel file.

Important

When adding your own rules, follow the format used in the default Excel file.

  • The blue sections (columns A, B, and M through R) are required.
  • Enter Y or N in the Enable column to enable or disable a rule.
  • To create both Warning and Critical severity rules, put each severity on its own line within the cell, and do the same for the matching expressions.
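
The line-by-line pairing of severities and expressions can be sketched as follows. This is an illustration only, not the code used by update-rule.yml: the function `expand_rule_row`, the dictionary keys (`name`, `enable`, `severity`, `expr`), and the metric `cpu_usage` are all hypothetical. The output keys (`alert`, `expr`, `labels`) follow the standard Prometheus alerting rule format.

```python
def expand_rule_row(row):
    """Split one spreadsheet row into one Prometheus alerting rule per severity.

    `row` is a dict with hypothetical keys: name, enable, severity, expr.
    Multi-line `severity` and `expr` cells are paired line by line.
    """
    if row["enable"].strip().upper() != "Y":
        return []  # rows marked N (or anything else) produce no rules

    severities = [s.strip() for s in row["severity"].splitlines() if s.strip()]
    exprs = [e.strip() for e in row["expr"].splitlines() if e.strip()]
    if len(severities) != len(exprs):
        raise ValueError(
            f"{row['name']}: {len(severities)} severities but {len(exprs)} expressions"
        )

    return [
        {"alert": row["name"], "expr": expr, "labels": {"severity": sev.lower()}}
        for sev, expr in zip(severities, exprs)
    ]


if __name__ == "__main__":
    example = {
        "name": "HighCpuUsage",
        "enable": "Y",
        "severity": "Warning\nCritical",
        "expr": "cpu_usage > 80\ncpu_usage > 95",  # placeholder metric name
    }
    for rule in expand_rule_row(example):
        print(rule)
```

Raising an error when the line counts differ is a design choice in this sketch; it surfaces a mismatched cell early instead of silently dropping a rule.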

Excel Screenshot

After adding your rules, remember to update them in Prometheus by running:

cd update
ansible-playbook update-rule.yml -K

Adding Action Playbook

Add your own plays to handle_alert.yml in the update folder to define event-driven actions.
Here is a simple template to follow:

- name: NAME_OF_YOUR_PLAY
  hosts: "{{ target }}"
  tags: ALERT_NAME_Warning  # or ALERT_NAME_Critical
  tasks:
    - name: TASK_NAME
      # ... write your tasks here ..

Note

The variables listed below are included in the default request that Alertmanager sends to the webhook server.

general : status, alertname, instance, severity, description, summary, startsAt, endsAt, generatorURL, fingerprint
extra : job ( node-exporter ), groupname ( process-exporter )

Any other variables will need to be parsed from extravars in webhook.py
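
For orientation, the variables above map onto the Alertmanager webhook payload roughly as shown below. This is a minimal sketch under assumptions, not the code in webhook.py: the function `alert_to_vars` is hypothetical. The payload layout (each entry of `alerts` carrying `status`, `labels`, `annotations`, `startsAt`, `endsAt`, `generatorURL`, `fingerprint`) follows the standard Alertmanager webhook format.

```python
def alert_to_vars(alert):
    """Flatten one entry of the webhook payload's `alerts` list into plain variables."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    variables = {
        "status": alert.get("status"),
        "alertname": labels.get("alertname"),
        "instance": labels.get("instance"),
        "severity": labels.get("severity"),
        "description": annotations.get("description"),
        "summary": annotations.get("summary"),
        "startsAt": alert.get("startsAt"),
        "endsAt": alert.get("endsAt"),
        "generatorURL": alert.get("generatorURL"),
        "fingerprint": alert.get("fingerprint"),
    }
    # Exporter-specific labels are included only when present.
    for key in ("job", "groupname"):
        if key in labels:
            variables[key] = labels[key]
    return variables


if __name__ == "__main__":
    sample = {
        "status": "firing",
        "labels": {"alertname": "HighCpuUsage", "instance": "target1:9100",
                   "severity": "warning", "job": "node-exporter"},
        "annotations": {"summary": "CPU usage is high"},
        "startsAt": "2024-01-01T00:00:00Z",
    }
    print(alert_to_vars(sample))
```

Any label that is not in this list, such as a custom label added in a rule, would need its own line in a function like this one.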

(back to top)

Pre-requisite

Install Ansible on the host (version 2.10 or later)
--------- Used for configuration and deployment -----------

Install pip:

sudo apt install python3-pip

Install Ansible from apt and pip:

sudo apt install ansible
python3 -m pip install --user ansible

Install node_exporter and process_exporter on the targets (via Docker)

💁 If you want to install them in another way, check out node_exporter and process_exporter


-------- Installation Image -----------

  docker pull prom/node-exporter
  docker pull ncabatoff/process-exporter

-------- Configuration --------

mkdir config
nano config/config.yml
  process_names:
    - name: "{{.Comm}}"
      cmdline:
      - '.+'

-------- Docker Compose --------

docker-compose.yml

nano docker-compose.yml
version: '3.8'
services:
  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - 9100:9100
  process-exporter:
    image: ncabatoff/process-exporter:latest
    container_name: process_exporter
    volumes:
      - /proc:/host/proc:ro
      - ./config:/config:ro
    command:
      - '-procfs=/host/proc'
      - '-config.path=/config/config.yml'
    restart: unless-stopped
    ports:
      - 9256:9256
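
Once both containers are up, each exporter serves plain-text metrics at `/metrics` on its published port (9100 and 9256 in the compose file above). The sketch below extracts metric names from such a response, which is a quick way to check that the exporter is producing the series you expect. The helper `metric_names` and the sample text are illustrative assumptions, not part of this repository.

```python
import re

# A metric name starts the line and may contain letters, digits, "_" and ":".
_NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")

SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.6
node_cpu_seconds_total{cpu="0",mode="user"} 78.9
namedprocess_namegroup_num_procs{groupname="bash"} 3
"""


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _NAME.match(line)
        if match:
            names.add(match.group(1))
    return names


if __name__ == "__main__":
    print(sorted(metric_names(SAMPLE)))
```

To run it against a live target, fetch `http://TARGET_IP:9100/metrics` (for example with `urllib.request.urlopen`) and pass the decoded body to `metric_names`.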

(back to top)

🖥️ Install & Deploy

Flow Chart

-------- Installation -----------

Clone the project to your local machine:

git clone https://github.com/TheSKBroook/Automated-Telemetry-Monitor.git

-------- Configuration -----------

Add or edit target information in inventory.ini located in the deploy-server folder:

example inventory.ini

[local]
localhost ansible_host=10.00.00.3 ansible_user=test token=ENTER_YOUR_LINE_TOKEN

[targets]
target1 ansible_host=10.00.00.4 ansible_user=test
# You can add more targets by:
# target2 ansible_host=
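
Host variables in an INI inventory are written as `key=value` with no spaces around `=`, so a stray space is an easy mistake to make when pasting a token or an IP address. The sketch below flags such entries before you run the playbook. It is a hypothetical helper (`find_spaced_assignments` is not part of this repository) and only performs a simple whitespace check, not a full inventory parse.

```python
def find_spaced_assignments(inventory_text):
    """Return (line_number, token) pairs where a key=value pair is split by a space."""
    problems = []
    for lineno, line in enumerate(inventory_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blanks, comments, and [group] headers.
        if not stripped or stripped.startswith(("#", ";", "[")):
            continue
        for token in stripped.split():
            # "key=" or "=value" standing alone means a space sits next to "=".
            if token.endswith("=") or token.startswith("="):
                problems.append((lineno, token))
    return problems


if __name__ == "__main__":
    with open("inventory.ini") as handle:
        for lineno, token in find_spaced_assignments(handle.read()):
            print(f"line {lineno}: suspicious token {token!r}")
```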

Note

Get a LINE Notify token from LINE. More information can be found here.

-------- Deployment -----------

From the deploy_server directory, run:

ansible-playbook -i inventory.ini main.yml -K

(back to top)

Demonstration

Take a look at these screenshots to see it in action!

Prometheus

Query and explore your telemetry metrics in real time with Prometheus.


Grafana

Visualize your data with customizable dashboards in Grafana.


Alertmanager

Manage and view alert notifications efficiently with Alertmanager.


Line Notify

Receive instant alerts on your LINE app for critical events.

Warning

Unfortunately, LINE Notify will end on March 31, 2025. We will be looking for alternative solutions to replace its functionality.

(back to top)

License

This project is licensed under the MIT License. See the LICENSE file for more details.

(back to top)
