[Prometheus] - Alerting management #1390

Open
opened 2026-05-05 09:30:07 +02:00 by muppeth · 1 comment
Owner

With the basics deployed and some alert rules in place, we need to make sure the following is covered:

  • Make sure alerts fire right away when critical issues occur
  • Group lower-level alerts so we are not overwhelmed with notifications
  • Set up healthy silence rules
  • Improve alert rules with good severity assignment, good labeling and grouping
  • Set up a better Alertmanager frontend (karma) to better visualize the issues

All this should give us a system that fires immediately on critical issues, stays informative about other alerts, and sends alerts either to different groups or directly to the sysadmin on duty, whichever applies.
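The routing described above could be sketched roughly as follows. This is only an illustration of the intended behaviour, not Alertmanager configuration or its API: the receiver names (`sysadmin-on-duty`, `team-digest`, `team-<name>`) and the `team` label are hypothetical, and it assumes alert rules carry a `severity` label.

```python
from collections import defaultdict

# Hypothetical receiver names, not taken from any existing configuration.
ONCALL = "sysadmin-on-duty"
DEFAULT = "team-digest"


def route(alert):
    """Pick a receiver from the alert's labels (assumed keys: severity, team)."""
    labels = alert["labels"]
    if labels.get("severity") == "critical":
        return ONCALL
    team = labels.get("team")
    return f"team-{team}" if team else DEFAULT


def plan_notifications(alerts):
    """Return (immediate, grouped).

    Critical alerts are listed one by one for immediate delivery;
    everything else is batched per (receiver, alertname).
    """
    immediate = []
    grouped = defaultdict(list)
    for alert in alerts:
        receiver = route(alert)
        if receiver == ONCALL:
            immediate.append((receiver, alert))
        else:
            key = (receiver, alert["labels"].get("alertname", "unknown"))
            grouped[key].append(alert)
    return immediate, dict(grouped)
```

In a real setup the same split would live in the Alertmanager routing tree; the sketch only shows why consistent `severity` and team labels on the rules matter for it to work.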


Critical issues can also send a message to state@chat.disroot.org and/or be displayed on uptime-kuma.

Reference
Disroot/Disroot-Project#1390