Icinga 2 Insights With Event Streams

Apr 2, 2025

There are many ways to interact with the data that Icinga 2 collects, processes, and produces. The most common is probably Icinga Web, which displays checks in all the colors of a traffic light. Icinga 2 also comes with several metrics or performance data writers.

But that is not all. Icinga 2 has open interfaces to integrate all kinds of third-party tools if one is not afraid to write a little glue code. While there are several ways to integrate other applications into Icinga 2, today’s focus is on the Icinga 2 API Event Streams.

Event Streams

The Icinga 2 API is a REST-like HTTP API. Most requests receive a single JSON response and are done. Event Streams, however, are long-lived HTTP sessions where Icinga 2 keeps sending new information.

These events can be check results, state changes, downtimes, and more. The event types of interest are specified when establishing the connection.

Let’s start with a motivational example.

$ curl \
  -k -s -S -u "example:$pass" \
  -H 'Accept: application/json' \
  -X POST \
  'https://localhost:5665/v1/events' \
  -d '{ "queue": "demo", "types": [ "CheckResult" ] }'
{"acknowledgement":false,"check_result":{"active":true,"check_source":"icinga.example.com","command":["/usr/local/libexec/nagios/check_ping","-H","2001:db8::100","-c","5000,100%","-w","3000,80%"],"execution_end":1742832806.051293,"execution_start":1742832801.994177,"exit_status":0,"output":"PING OK - Packet loss = 0%, RTA = 10.37 ms","performance_data":["rta=10.369000ms;3000.000000;5000.000000;0.000000","pl=0%;80;100;0;"],"previous_hard_state":0,"schedule_end":1742832806.051459,"schedule_start":1742832801.99,"scheduling_source":"icinga.example.com","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"alpha.example.com","timestamp":1742832806.052978,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"icinga.example.com","command":["/usr/local/libexec/nagios/check_ping","-6","-H","200a:db8::200","-c","200,15%","-w","100,5%"],"execution_end":1742832807.716859,"execution_start":1742832803.675325,"exit_status":0,"output":"PING OK - Packet loss = 0%, RTA = 0.28 ms","performance_data":["rta=0.284000ms;100.000000;200.000000;0.000000","pl=0%;5;15;0;"],"previous_hard_state":0,"schedule_end":1742832807.716977,"schedule_start":1742832803.67,"scheduling_source":"icinga.example.com","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"beta.example.com","service":"ping6","timestamp":1742832807.718469,"type":"CheckResult"}
{"acknowledgement":false,"check_result":{"active":true,"check_source":"icinga.example.com","command":["/usr/local/libexec/nagios/check_curl","-S","-H","foo.example.com","-I","200a:db8::42","-M","172800","-f","warning"],"execution_end":1742832809.889771,"execution_start":1742832809.635585,"exit_status":0,"output":"HTTP OK: HTTP/1.1 200 OK - 73762 bytes in 0.220 second response time ","performance_data":["time=0.220277s;;;0.000000;10.000000","size=73762B;;;0;"],"previous_hard_state":0,"schedule_end":1742832809.889911,"schedule_start":1742832809.62,"scheduling_source":"icinga.example.com","state":0,"ttl":0,"type":"CheckResult","vars_after":{"attempt":1,"reachable":true,"state":0,"state_type":1},"vars_before":{"attempt":1,"reachable":true,"state":0,"state_type":1}},"downtime_depth":0,"host":"gamma.example.com","service":"https foo.example.com","timestamp":1742832809.895418,"type":"CheckResult"}

Maybe that was too much JSON to be motivating. But please bear with me.

Starting with the request, a JSON payload was sent to the /v1/events endpoint of the Icinga 2 API. The queue field is just a name for this request; try to make it unique. More importantly, the types field is a filter for the kinds of events we are interested in. In this case, we are requesting CheckResults.

Looking at the response, there are three large JSON objects of type CheckResult. Each appeared immediately after Icinga 2 finished processing it. Unlike most API requests, which are pull-based, Event Streams push changes to us. So we have a live view of each check in our monitoring system.
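For anything beyond a quick look, the same request can be made from a small program. The following is a minimal sketch in Python using only the standard library, not an official client. The function names open_event_stream, iter_events, and main are made up for this example, and the URL, user, and queue name are simply carried over from the curl call above. Like curl -k, it skips certificate verification, which is only acceptable for a local demo.

```python
import base64
import json
import os
import ssl
import urllib.request


def open_event_stream(url, user, password, queue, types):
    """Send the POST request to /v1/events and return the open response."""
    body = json.dumps({"queue": queue, "types": types}).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    # Same effect as curl -k: no certificate verification (demo only).
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return urllib.request.urlopen(request, context=context)


def iter_events(lines):
    """Yield one decoded JSON object per non-empty line of the stream."""
    for raw in lines:
        raw = raw.strip()
        if raw:
            yield json.loads(raw)


def main():
    # Assumption: the password is provided in the ICINGA_PASS variable.
    response = open_event_stream(
        "https://localhost:5665/v1/events",
        "example",
        os.environ["ICINGA_PASS"],
        "demo",
        ["CheckResult"],
    )
    with response:
        # The response object can be iterated line by line as data arrives.
        for event in iter_events(response):
            print(event["type"], event["host"], event.get("service"))
```

Calling main() prints one line per incoming event until the connection is closed. Keeping iter_events separate from the HTTP code makes it easy to try the parsing on recorded output first.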

As already mentioned, the Event Streams are not limited to CheckResults, but can also show other events. However, this post will keep its focus on the CheckResults.

Extracting Information

For a better understanding, let's inspect the first response more closely. jq is a useful tool for pretty-printing each JSON response.

{
  "acknowledgement": false,
  "check_result": {
    "active": true,
    "check_source": "icinga.example.com",
    "command": [
      "/usr/local/libexec/nagios/check_ping",
      "-H",
      "2001:db8::100",
      "-c",
      "5000,100%",
      "-w",
      "3000,80%"
    ],
    "execution_end": 1742832806.051293,
    "execution_start": 1742832801.994177,
    "exit_status": 0,
    "output": "PING OK - Packet loss = 0%, RTA = 10.37 ms",
    "performance_data": [
      "rta=10.369000ms;3000.000000;5000.000000;0.000000",
      "pl=0%;80;100;0;"
    ],
    "previous_hard_state": 0,
    "schedule_end": 1742832806.051459,
    "schedule_start": 1742832801.99,
    "scheduling_source": "icinga.example.com",
    "state": 0,
    "ttl": 0,
    "type": "CheckResult",
    "vars_after": {
      "attempt": 1,
      "reachable": true,
      "state": 0,
      "state_type": 1
    },
    "vars_before": {
      "attempt": 1,
      "reachable": true,
      "state": 0,
      "state_type": 1
    }
  },
  "downtime_depth": 0,
  "host": "alpha.example.com",
  "timestamp": 1742832806.052978,
  "type": "CheckResult"
}

The outer object is similar for each Event Stream response, with the type field indicating which fields are present. The fields of each type are described in the documentation. Since the request was filtered for CheckResults, having the type set to CheckResult is no big surprise.

As noted in the CheckResult documentation, the host field holds the host name and the service field holds a service name if this is a service check. This pattern of an optional service field in the case of a service check is common for the Icinga 2 API. In our example there is no such field, only a host field. So we are dealing with a host check.
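In code, this distinction comes down to checking whether the service key exists. A small sketch, where describe_target is a hypothetical helper name and the host!service form follows the way Icinga 2 names service objects:

```python
def describe_target(event):
    """Return (kind, name) for an Event Stream object."""
    host = event["host"]
    service = event.get("service")
    if service is None:
        # No service field: this event belongs to a host check.
        return ("host", host)
    return ("service", f"{host}!{service}")
```

For the first response from above, this yields ("host", "alpha.example.com").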

The check_result object contains the actual CheckResult value. It holds the output, exit_status, and state, which is the numeric representation of the host or service state: 0, 1, 2, 3 for OK, WARNING, CRITICAL, UNKNOWN in the case of a service, or 0, 1 for UP, DOWN in the case of a host.
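Translating the numeric state back into a name is a simple lookup. This sketch follows the mapping described above and decides between the host and service tables based on the presence of the service field; state_name and the two dictionaries are names chosen for this example.

```python
SERVICE_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
HOST_STATES = {0: "UP", 1: "DOWN"}


def state_name(event):
    """Map the numeric state of a CheckResult event to a readable name."""
    state = int(event["check_result"]["state"])
    names = SERVICE_STATES if "service" in event else HOST_STATES
    # Fall back to the raw number for values outside the known tables.
    return names.get(state, f"state {state}")
```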

Additionally, if the check produces performance data, it will be present in the performance_data array. Each entry contains one performance metric data point, either as an Icinga PerfdataValue object or as a good old performance data metric string. Since our example runs check_ping from the Monitoring Plugins, its output is a text-based string that we can parse ourselves. In particular, there are two metrics, one for round trip average (rta) and one for packet loss (pl).
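Parsing such a string by hand is not hard. The sketch below assumes the usual plugin layout of label=value[unit];warn;crit;min;max and is deliberately simple: thresholds are kept as raw strings because they may be ranges, values in exponent notation are not recognized, and entries that are already objects are returned unchanged. The function name parse_perfdata is made up for this example.

```python
import re

# A number directly followed by an optional unit, e.g. "10.369000ms" or "0%".
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def _to_float(text):
    """Convert a min/max field to float, or None if empty or not a number."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_perfdata(entry):
    """Parse one performance_data entry into a dict."""
    if not isinstance(entry, str):
        # Assumption: object-style entries are passed through untouched.
        return entry
    label, _, rest = entry.rpartition("=")
    fields = rest.split(";")
    fields += [""] * (5 - len(fields))  # missing trailing fields become empty
    match = VALUE_RE.match(fields[0])
    return {
        "label": label.strip("'"),
        "value": float(match.group(1)) if match else None,
        "unit": match.group(2) if match else fields[0],
        "warn": fields[1] or None,
        "crit": fields[2] or None,
        "min": _to_float(fields[3]),
        "max": _to_float(fields[4]),
    }
```

Applied to the two entries from the example, this gives a value of 10.369 with unit ms for rta and a value of 0.0 with unit % for pl.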

Integrate Other Tools

This level of insight into the internals of Icinga 2 makes it easier to integrate with other tools. For example, the new Icinga Notifications project listens to Event Streams when Icinga 2 is used as a notification source.

Event Streams may also allow new performance data writers to be developed outside Icinga 2. Access to CheckResults, which contain performance data for each check execution, should be sufficient. As a proof of concept, a Prometheus Remote Writer has been developed, as this has been requested recently.

In a nutshell, such an Event Stream-powered performance data writer is a pipeline that consumes CheckResults, transforms data to create performance data metrics, and finally sends them to another database. This sample Prometheus Remote Writer does just that, with a bit of caching before bulk insertion of metrics.
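To illustrate the caching step, here is a minimal sketch of a buffer that collects samples and hands them over in batches. It is not taken from the proof of concept; the class name BulkWriter, its parameters, and the send callback are assumptions made for this example. What a sample looks like and how it is actually transmitted is left to the caller.

```python
import time


class BulkWriter:
    """Buffer samples and pass them to `send` in batches."""

    def __init__(self, send, max_batch=500, max_age=5.0, clock=time.monotonic):
        self._send = send            # callable taking a list of samples
        self._max_batch = max_batch  # flush once this many samples are buffered
        self._max_age = max_age      # or once the oldest sample is this old (seconds)
        self._clock = clock
        self._buffer = []
        self._oldest = None

    def add(self, sample):
        """Queue one sample and flush if the buffer is full or too old."""
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(sample)
        too_many = len(self._buffer) >= self._max_batch
        too_old = self._clock() - self._oldest >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Send everything that is currently buffered, if anything."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)
```

Note that the age is only checked when a new sample arrives, which is fine as long as CheckResults keep coming in. A real writer would also call flush() on shutdown and deal with failed sends.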

Since I was already familiar with the Icinga Notifications implementation, such a program could be written in little time, especially when reusing existing code, which is possible since Icinga is Free Software.

Prometheus Metrics

Once the data arrives in Prometheus, it can either be used for plotting or for time-based queries. I explored the latter in a previous blog post, creating custom alerts based on linear trend predictions.

In general, there are areas where metrics are more useful and areas where checks are more useful. For example, a test-based check works best for comparing domain-specific information within the check. However, if many metrics are generated on the fly, it makes sense to collect and evaluate them later.

Getting Icinga 2 performance data into Prometheus may be useful, or it may just be a gimmick. If such a gimmick is what you need, feel free to play with this proof of concept. Otherwise, having such an API in Icinga 2 opens up integrations.
