Following the Icinga Notifications beta announcement, we already published a more general post on how to get started and one going into the details of schedules. This week’s blog post is a follow-up in this series and describes incidents, escalations, and event rules in Icinga Notifications in more detail. If you haven’t seen the first two posts, you might want to have a look at them first. Otherwise, you could miss out on the big picture.
Incidents
In Icinga Notifications, an incident is a central element that keeps track of the state of an ongoing problem with one monitored entity. For example, when Icinga 2 reports to Icinga Notifications that a host has entered a problem state, a corresponding incident for that object is created inside Icinga Notifications. Let’s have a look at how an incident is shown in Icinga Web:
You can see the affected object, the current recipients of notifications for this incident, and the history of the incident. These recipients can change based on the configuration (more on that later) and manually using the “(Un)subscribe” and “(Un)manage” buttons, which allow an Icinga Web user to add themselves to the recipient list or remove themselves from it. A manual subscription affects only the current incident, so once that incident is resolved, the subscription has no effect on future incidents. The idea of the manage functionality is to allow someone to designate themselves as responsible for handling the incident. In this case, further notifications aren’t sent to other recipients unless they have explicitly subscribed to the incident.
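The subscribe and manage behavior described above can be sketched as a small model. This is only an illustration under stated assumptions, not how Icinga Notifications is implemented: the `Incident` class, the role names, and the `notification_targets` method are hypothetical, and the sketch assumes that the manager keeps receiving notifications.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    # Hypothetical model: contact name -> role.
    # Roles are "recipient" (added by configuration), "subscriber" or "manager".
    roles: dict = field(default_factory=dict)

    def subscribe(self, contact: str) -> None:
        # An Icinga Web user adds themselves to this incident only.
        self.roles[contact] = "subscriber"

    def unsubscribe(self, contact: str) -> None:
        # Remove the user from the recipient list of this incident.
        self.roles.pop(contact, None)

    def manage(self, contact: str) -> None:
        # The user designates themselves as responsible for the incident.
        self.roles[contact] = "manager"

    def notification_targets(self) -> set:
        managed = any(role == "manager" for role in self.roles.values())
        if not managed:
            return set(self.roles)
        # Once managed, plain recipients are skipped; explicit subscribers
        # (and, by assumption, the manager) are still notified.
        return {c for c, r in self.roles.items() if r in ("manager", "subscriber")}


incident = Incident(roles={"alice": "recipient", "bob": "recipient"})
incident.subscribe("carol")
incident.manage("bob")
print(sorted(incident.notification_targets()))
```

Because the roles live on the incident object itself, a new incident for the same host would start with an empty role map, which mirrors the point that manual subscriptions do not carry over.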
Escalations
Put simply, escalating an incident means notifying new recipients about it. For example, if an incident reaches a certain criticality or remains unresolved for longer than a certain time, it may be necessary to inform additional recipients about it. Both can be configured as an escalation in an event rule, so let’s have a look at them next.
Event Rules
Event rules are viewed and edited using the following three-column view in Icinga Web:
On the left, you can select which objects this rule applies to. The filter options depend on the source. In the case of Icinga 2, it’s currently possible to filter based on host groups (hostgroup/My HostGroup) and service groups (servicegroup/My ServiceGroup). Please note that the exact syntax might change in the future to allow for a more intuitive configuration.
Next, the view splits into multiple rows where each row defines an escalation. The middle column defines the trigger condition for the escalation. There can be one escalation without any condition, which triggers immediately when an incident is created. This allows defining the initial recipients. Additional escalations can be defined to trigger based on the incident reaching a specific age, criticality, or combinations thereof.
The last column defines the recipients and optionally the communication channel used to reach them. If no channel is set here explicitly, the default set for the individual contact is used. Recipients can be individual contacts, contact groups, or schedules. The first two should be rather self-explanatory. Schedules allow routing notifications to different recipients at different times, for example to model on-call rotations. For schedules, there already is another in-depth blog post.
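To make the three columns more concrete, here is a minimal sketch of an event rule as a data structure. All names (`EventRule`, `Escalation`, `Severity`, `min_age`, `min_severity`) are hypothetical and chosen for illustration only. The severity scale is simplified, and the sketch assumes that a combined condition requires both the age and the severity threshold to be met.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    # Simplified, hypothetical scale; higher means more critical.
    WARNING = 1
    CRITICAL = 2


@dataclass(frozen=True)
class Escalation:
    name: str
    recipients: frozenset  # contacts, contact groups or schedules
    min_age: Optional[timedelta] = None
    min_severity: Optional[Severity] = None

    def is_triggered(self, age: timedelta, severity: Severity) -> bool:
        # An escalation without any condition triggers immediately.
        if self.min_age is not None and age < self.min_age:
            return False
        if self.min_severity is not None and severity < self.min_severity:
            return False
        return True


@dataclass(frozen=True)
class EventRule:
    object_filter: str  # left column, e.g. "hostgroup/My HostGroup"
    escalations: tuple  # one entry per row

    def triggered(self, age: timedelta, severity: Severity) -> list:
        return [e for e in self.escalations if e.is_triggered(age, severity)]


rule = EventRule(
    object_filter="hostgroup/My HostGroup",
    escalations=(
        Escalation("initial", frozenset({"alice"})),
        Escalation("after 30m", frozenset({"on-call schedule"}),
                   min_age=timedelta(minutes=30)),
        Escalation("long critical", frozenset({"team leads"}),
                   min_age=timedelta(hours=1), min_severity=Severity.CRITICAL),
    ),
)

# An incident that is 45 minutes old and at warning level.
names = [e.name for e in rule.triggered(timedelta(minutes=45), Severity.WARNING)]
print(names)
```

The communication channel is left out here to keep the sketch short. In a fuller model, it could be an optional field next to each recipient that falls back to the contact's default.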
How all of it is connected
Every incident keeps track of some obvious information like the affected object, the current severity, and when the incident was created. Besides that, incidents also keep track of which rules have matched on the affected object. This implies that the rule will remain active for the incident even if the object is changed in a way that it no longer matches the object filter in that rule.
Additionally, incidents store a set of the current recipients for notifications. As mentioned before, this set can be changed by users through the manage and subscribe functionality in Icinga Web, as well as by escalations configured in event rules. Triggering an escalation basically takes the recipients given in the event rule configuration for that escalation and adds them to the recipient set of the corresponding incident. Note that each escalation is triggered at most once per incident. This means that once the escalation condition has been satisfied, its recipients stay recipients for the remaining duration of the incident.
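The once-per-incident behavior can be illustrated with a short sketch. It is a simplified model under assumptions, not the actual implementation: `IncidentState` and its `trigger` method are hypothetical names, and escalations are identified by a plain string.

```python
from dataclasses import dataclass, field


@dataclass
class IncidentState:
    # Hypothetical model of what an incident remembers.
    recipients: set = field(default_factory=set)
    triggered: set = field(default_factory=set)  # escalations that already fired

    def trigger(self, escalation: str, escalation_recipients: set) -> set:
        """Fire an escalation and return the recipients that were newly added."""
        if escalation in self.triggered:
            # Each escalation fires at most once per incident.
            return set()
        self.triggered.add(escalation)
        added = set(escalation_recipients) - self.recipients
        self.recipients |= added
        return added


state = IncidentState()
print(state.trigger("initial", {"alice"}))
print(state.trigger("after 30m", {"alice", "bob"}))
# Evaluating the same escalation again changes nothing.
print(state.trigger("after 30m", {"alice", "bob"}))
print(sorted(state.recipients))
```

Recipients are only ever added in this model, never removed by an escalation, which matches the statement that they stay recipients until the incident ends.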