Icinga 2: Host state calculation from all services

A wide variety of questions gets answered in the community support channels. Sometimes we just hack together fancy solutions directly in the Icinga 2 DSL. Some of these examples are collected in the documentation, others are posted on the community channels. And some are only shown in hands-on workshops at customers, still waiting to be shared with the world 🙂
This time the question came up over at monitoring-portal.org: a host object collects a bunch of passive services and should calculate its overall state and output from the worst state of all referenced services.
Sounds easy. You could go for a business process check returning the calculated value. Or you stick with the Icinga 2 configuration language and put several of its features together.

For a small test environment, I’ve generated 5 services using the random check (replace that with your real world scenario).
 

for (j in range(5)) {
  object Service "host-servicestatus-" + j {
    check_command = "random"
    check_interval = 30s
    retry_interval = 15s
    host_name = "host-servicestatus"
  }
}

The host object called “host-servicestatus” just uses the “dummy” check command provided by the ITL. This check command expects two custom attributes: “dummy_state” and “dummy_text”.
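As a minimal sketch of how these two custom attributes are normally used, a host could simply set them to static values. The host name “host-static-dummy” and the output text are made up for illustration.

```
// Hypothetical host with a fixed result: the "dummy" check command
// reports the state and text given in the two custom attributes.
object Host "host-static-dummy" {
  check_command = "dummy"
  vars.dummy_state = 0                    // state (exit code) to report
  vars.dummy_text = "Everything is fine"  // plugin output to report
}
```

The rest of this post replaces those static values with functions.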
Now for the fun part – implement two lambda functions for these custom attributes using the available methods.

  vars.dummy_state = {{ ... }}
  vars.dummy_text = {{ ... }}

We want to calculate the worst state for all services on this specific host. Therefore we’ll use a temporary variable to save and update the worst state.

    var worst_state = 0

First we iterate over all service objects using the object accessor function “get_objects” with the “Service” type. Then we compare each service’s “host_name” attribute to the name of the host in the local scope (our host) and skip every service that does not match.

    for (s in get_objects(Service)) {
      if (s.host_name != host.name) {
        continue; //skip all services not referencing this host object
      }

The loop-local variable “s” gives us access to all attributes of the current service object. Check whether its state is greater than 0 (not OK) and greater than the previously collected worst state. If so, store it in the local variable “worst_state”.

      if (s.state > 0 && s.state > worst_state) {
        worst_state = s.state
      }
    }

After the loop is finished, just return the “worst_state” variable for this function.

    return worst_state

To generate an additional output text with all service names and their states, we use the same loop and condition as above, except that the temporary variable is now an array of strings:

    var output = []

Inside the loop we’ll add the current service name and its state as a string element to the “output” array.

      output.add(s.name + ": " + s.state)

Once the loop is finished, join the array elements with the separator “, ”, prepend the summary prefix and return the final output string.

    return "Service summary: " + output.join(", ")

We could also concatenate the string as is but then we would need to think about the last loop run not adding the “,” character. The array join method just simplifies that step.
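For comparison, here is a rough sketch of what the body of the “dummy_text” function could look like without the array. It is not part of the original solution; it handles the separator by adding it in front of every element except the first.

```
    // Illustration only: building the summary by plain string concatenation.
    var output = ""
    for (s in get_objects(Service)) {
      if (s.host_name != host.name) {
        continue // skip all services not referencing this host object
      }
      if (output != "") {
        output += ", " // separator only between elements
      }
      output += s.name + ": " + s.state
    }
    return "Service summary: " + output
```

The extra condition inside the loop is exactly the bookkeeping that the array join method takes care of.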
The final solution works like a charm 🙂 If you say “hey, I am not a coder”: it helps to know JavaScript, Python or something similar, of course. After all, it is a pretty neat solution for helping a community member 🙂
 

object Host "host-servicestatus" {
  check_command = "dummy"
  vars.dummy_state = {{
    var worst_state = 0
    for (s in get_objects(Service)) {
      if (s.host_name != host.name) {
        continue; //skip all services not referencing this host object
      }
      if (s.state > 0 && s.state > worst_state) {
        worst_state = s.state
      }
    }
    return worst_state
  }}
  vars.dummy_text = {{
    var output = []
    for (s in get_objects(Service)) {
      if (s.host_name != host.name) {
        continue; //skip all services not referencing this host object
      }
      output.add(s.name + ": " + s.state)
    }
    return "Service summary: " + output.join(", ")
  }}
}
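The output above prints the numeric state only. As a possible variation (an assumption on my side, not something the original question asked for), the “dummy_text” function could translate the numbers into the usual service state names by using the state as an array index:

```
  vars.dummy_text = {{
    // Service states by number: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
    var state_names = [ "OK", "WARNING", "CRITICAL", "UNKNOWN" ]
    var output = []
    for (s in get_objects(Service)) {
      if (s.host_name != host.name) {
        continue // skip all services not referencing this host object
      }
      output.add(s.name + ": " + state_names[s.state])
    }
    return "Service summary: " + output.join(", ")
  }}
```

This block would replace the “vars.dummy_text” definition inside the host object.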

Icinga 2 v2.3.0 released

You may have heard it already – 2.3 adds lots of new features, for example object attribute accessors at runtime accompanied by functions, loops, conditionals and much more. Bringing you Icinga 2 v2.3.0 also means: 660 Git commits since 2.2.0, 94 features & 127 bug fixes.
While upgrading your Icinga 2 installation, you can test-drive the new language features in the new live console online on icinga.org. Grab a coffee, check the additional feature details below, switch to the Changelog and, once your upgrade has finished, get to work with the all new shiny Icinga 2 v2.3.0. The online documentation is currently undergoing changes; things to note: live search and a removable table of contents tab.

Conditional statements

This was frequently asked in the past: “How can I inherit values from the host to the service, and leave it to a default value if not set?” Consider it done with if/else if/else conditions inside the Icinga 2 configuration language. The example shows a fallback to the host’s FQDN if its address6 attribute has not been set – cool, isn’t it?
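A minimal sketch of such a fallback could look like the following. It assumes that the “ping6” check command reads its target from the “ping_address” custom attribute, that the host object name is a resolvable FQDN, and it uses a made-up custom attribute “ping6_enabled” for the assign rule.

```
apply Service "ping6" {
  check_command = "ping6"

  if (host.address6) {
    // The host has an explicit IPv6 address, use it.
    vars.ping_address = host.address6
  } else {
    // Fall back to the host name, assumed here to be a resolvable FQDN.
    vars.ping_address = host.name
  }

  assign where host.vars.ping6_enabled // hypothetical custom attribute
}
```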
 

Functions – what for?

You can now define your own functions, including the return keyword. That also covers locally scoped variables declared with the var keyword and anonymous lambda functions. We’ve thought about functions and their use cases for Icinga 2. One thing we came up with is a Boolean return value for set_if inside command arguments: not only a macro string value, but nearly any condition. The same applies to command argument values. The short way of assigning return values is putting them in double curly brackets {{ ... }}.
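To make this a bit more concrete, here is a small sketch with a named function and a set_if function. The function “worst_of”, the command name “my-http” and the custom attribute “http_ssl” are invented for this example.

```
// A named function with a local variable and an explicit return value.
function worst_of(a, b) {
  var result = a
  if (b > a) {
    result = b
  }
  return result
}

object CheckCommand "my-http" { // hypothetical command name
  import "plugin-check-command"
  command = [ PluginDir + "/check_http" ]
  arguments = {
    "-H" = "$address$"
    "-S" = {
      // A function returning a Boolean instead of a macro string.
      // "http_ssl" is a hypothetical host custom attribute.
      set_if = {{ host.vars.http_ssl == true }}
    }
  }
}
```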
 

Loops, loops, loops, …

You’ve seen for loops already inside the fancy apply for rules introduced with 2.2. Using while and for loops, including the break and continue keywords, has been tested experimentally in Icinga Web 2’s Vagrant box for quite a while. In real-life scenarios you would use them in combination with functions and if/else conditions, iterating over arrays and dictionaries defined in custom attributes.
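The following sketch shows both loop types on made-up data. The dictionary “vhosts” stands in for something you would normally keep in a custom attribute; the snippet is meant for playing around in the console rather than as production configuration.

```
// Hypothetical dictionary as it could be stored in a custom attribute.
var vhosts = {
  "www" = 80
  "intranet" = 8080
  "legacy" = 0
}

// 'for' over key/value pairs, skipping entries with 'continue'
for (name => port in vhosts) {
  if (port == 0) {
    continue
  }
  log("vhost " + name + " listens on port " + port)
}

// 'while' with an early exit using 'break'
var i = 0
while (true) {
  if (i >= 3) {
    break
  }
  i += 1
}
```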
 

Object attribute accessors for clustered checks

Digging up the old problem with so-called “on-demand macros” in Icinga 1.x and the related migration issues, we’ve tackled this one differently: instead of obfuscating the macro parser once again, you can now access objects and their run-time attributes directly. Most commonly that is get_host(NodeName).state for the old-fashioned check_cluster plugin. And many, many more objects and attributes …
That syntax could be used for cluster checks and business processes inside Icinga 2 – we’ll tackle the dummy check problem sooner or later, promise!
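Until then, a cluster-style check can be approximated with the dummy check command. The sketch below is only one possible approach: the host names, the “cluster_members” custom attribute and the policy (problem only if every member is down) are assumptions for illustration.

```
object Host "web-cluster" { // hypothetical cluster host
  check_command = "dummy"
  vars.cluster_members = [ "web1", "web2", "web3" ] // hypothetical hosts

  vars.dummy_state = {{
    var down = 0
    for (name in host.vars.cluster_members) {
      var member = get_host(name)
      // Host state 0 means UP; unknown members are counted as down.
      if (!member || member.state != 0) {
        down += 1
      }
    }
    // Assumed policy: report a problem only if all members are down.
    if (down == host.vars.cluster_members.len()) {
      return 2
    }
    return 0
  }}
}
```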
 

Time dependent thresholds

Right before the feature freeze one of our colleagues approached us with a request for time-dependent check threshold values. We already had if-conditions and object accessor functions, so we added the is_inside run-time attribute to the time period objects returned by the get_time_period() function. That way you can set thresholds depending on the current time of day.
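A sketch of such a threshold, assuming a TimePeriod object named “backup” and a host called “backup-server” exist (both names are made up), could look like this:

```
object Service "ping4-backup-aware" {
  host_name = "backup-server" // hypothetical host
  check_command = "ping4"

  // Warning threshold for the round trip time, chosen at check time.
  vars.ping_wrta = {{
    var tp = get_time_period("backup") // hypothetical TimePeriod object
    if (tp && tp.is_inside) {
      return 500 // relaxed threshold while the backup window is active
    }
    return 100
  }}
}
```

The threshold values are arbitrary; the point is that the function is evaluated every time the check runs.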

Type methods

Defined an array, but need it in a sorted manner? Remove a dictionary item inherited from a template? Split a string into parts? Not an issue anymore.
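The three questions above roughly translate into the following calls. The variable names and values are made up, and the snippet is intended for trying things out in the console.

```
var ports = [ 8080, 80, 443 ]
var sorted_ports = ports.sort()           // sorted copy of the array

var settings = { "timeout" = 10, "retries" = 3 }
settings.remove("retries")                // drop a single dictionary key

var parts = "www.example.org".split(".")  // array with the string parts
```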

console CLI command

Test all the language features inside the Icinga 2 console. Install rlwrap to keep history and line continuation, and test-drive your new configuration before putting it into your files.

Misc features

From OpenTSDB support to ignoring soft states in dependencies. Additional ITL plugin check commands (interfacetable, IPMI, webinject, vmware_esx, local ‘nscp client’ commands for the Windows agent). Livestatus header support, improved performance and additional bygroup tables. Improved cluster stability and scalability using fewer threads for socket I/O and SNI TLS support. ‘icinga2 troubleshoot’ cli command for better community support … check the Changelog below and in the documentation for more details.
PS: I’ve uploaded the configuration samples made for this blog post into the Vagrant boxes.
 

2.3.0 Changelog

  • Improved configuration validation
    • Unnecessary escapes are no longer permitted (e.g. \')
    • Dashes are no longer permitted in identifier names (as their semantics are ambiguous)
    • Unused values are detected (e.g. { "-M" })
    • Validation for time ranges has been improved
    • Additional validation rules for some object types (Notification and User)
  • New language features
    • Implement a separate type for Boolean values
    • Support for user-defined functions
    • Support for conditional statements (if/else)
    • Support for ‘for’ and ‘while’ loops
    • Support for local variables using the ‘var’ keyword
    • New operators: % (modulo), ^ (xor), - (unary minus) and + (unary plus)
    • Implemented prototype-based methods for most built-in types (e.g. [ 3, 2 ].sort())
    • Explicit access to local and global variables using the ‘locals’ and ‘globals’ keywords
    • Changed the order in which filters are evaluated for apply rules with ‘for’
    • Make type objects accessible as global variables
    • Support for using functions in custom attributes
    • Access objects and their runtime attributes in functions (e.g. get_host(NodeName).state)
  • ITL improvements
    • Additional check commands were added to the ITL
    • Additional arguments for existing check commands
  • CLI improvements
    • Add the ‘icinga2 console’ CLI command which can be used to test expressions
    • Add the ‘icinga2 troubleshoot’ CLI command for collecting troubleshooting information
    • Performance improvements for the ‘icinga2 node update-config’ CLI command
    • Implement argument auto-completion for short options (e.g. daemon -c)
    • ‘node setup’ and ‘node wizard’ create backups for existing certificate files
  • Add ignore_soft_states option for Dependency object configuration
  • Fewer threads are used for socket I/O
  • Flapping detection for hosts and services is disabled by default
  • Added support for OpenTSDB
  • New Livestatus tables: hostsbygroup, servicesbygroup, servicesbyhostgroup
  • Include GDB backtrace in crash reports
  • Various documentation improvements
  • Solved a number of issues where cluster instances would not reconnect after intermittent connection problems
  • A lot of other, minor changes
  • DB IDO schema upgrade to 1.13.0 required!

Find the detailed Changelog in the “What’s new” section in the documentation!

Icinga 2: Using functions in custom attributes

Starting with version 2.2 Icinga supports arrays and dictionaries in custom attributes. In combination with apply this is incredibly powerful, e.g. to define HTTP vhosts for a host and to set up individual services for each of these vhosts.
Version 2.3 – which we’re planning to release on the 10th of March – will introduce support for using functions in custom attributes:

object CheckCommand "random-text" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_dummy", "0", "$text$" ]
  vars.text = {{ Math.random() * 100 }}
}

The two curly braces are used to define a function. Icinga runs this function every time it needs the value for the custom attribute “text”. In this example this results in a new random value each time this check command is executed.
However, using functions we’re not limited to calculating simple values. Users can use if/else to accomplish more complex things:

vars.text = {{
  if (host.address == "127.0.0.1") {
    log("This is a check for localhost.")
  }
  return "Test"
}}

We can also access arbitrary attributes of other hosts and services:
 

vars.text = {{
  "The state for 'other-host' is: " + get_host("other-host").state
}}
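If the referenced host might not exist, get_host() returns null and accessing its state would fail. A slightly more defensive variant, given here as a sketch rather than as the recommended pattern, checks for that first:

```
vars.text = {{
  var other = get_host("other-host")
  if (!other) {
    // get_host() returned null: no such host object
    return "Host 'other-host' does not exist"
  }
  return "The state for 'other-host' is: " + other.state
}}
```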