Automatic health monitoring for all HaLOS containers #98
mairas
announced in
Announcements
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
While investigating an unrelated issue, I noticed that Signal K Server hangs on Halos if I try to restart it using the Restart button in the UI. This was unexpected, and based on some investigation, Claude insisted that this is a new bug: #2589. Regardless, it highlighted the issue that Halos should have health checks enabled in all essential containers, and unhealthy containers should be auto-restarted. So, I did that.
All core and marine containers now have Docker healthchecks, paired with an autoheal watchdog that automatically restarts unresponsive services. This covers Traefik, Homarr, Authelia, Valkey, Grafana, InfluxDB, Signal K, and AvNav.
Each container is probed every 10-30 seconds using its built-in health endpoint. If a service fails three consecutive checks, autoheal restarts it automatically.
To update: Cockpit -> Packages -> Upgrade All or
sudo apt update && sudo apt upgradeBeta Was this translation helpful? Give feedback.
All reactions