Icinga 2 v2.4.9 bugfix release

We’ve just released another minor version which fixes a few more bugs:

What’s New in Version 2.4.9

Bugfixes

  • Bug 11801 (Perfdata): Error: Function call ‘rename’ for file ‘/var/spool/icinga2/tmp/service-perfdata’ failed with error code 2, ‘No such file or directory’
  • Bug 11804 (Configuration): Segfault when trying to start 2.4.8
  • Bug 11807 (Compat): Command Pipe thread 100% CPU Usage

Icinga 2 v2.4.8 bugfix release

We’ve just released another new minor version for Icinga 2. Highlights of this version include the ability to limit the maximum number of concurrent checks that are run by Icinga, along with a rather large number of bugfixes.
Packages for 2.4.8 should be available shortly. Here’s the list of all changes for this version:

What’s New in Version 2.4.8

Changes

* Bugfixes
* Support for limiting the maximum number of concurrent checks (new configuration option)
* HA-aware features now wait for connected cluster nodes in the same zone (e.g. DB IDO)
* The ‘icinga’ check now alerts on failed reloads
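
The new limit for concurrent checks is configured on the checker feature. Here is a minimal sketch, assuming the attribute is called concurrent_checks on the CheckerComponent object (usually defined in features-available/checker.conf). The value 256 is purely illustrative, so please check the documentation for your version before relying on it:

object CheckerComponent "checker" {
  // Assumption: illustrative limit, not a recommended value
  concurrent_checks = 256
}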

Feature

  • Feature 8137 (Checker): Maximum concurrent service checks
  • Feature 9236 (Perfdata): PerfdataWriter: Better failure handling for file renames across file systems
  • Feature 9997 (libmethods): “icinga” check should have state WARNING when the last reload failed
  • Feature 10581 (ITL): Provide icingacli in the ITL
  • Feature 11556 (libbase): Add support for subjectAltName in SSL certificates
  • Feature 11651 (CLI): Implement SNI support for the CLI commands
  • Feature 11720 (ITL): ‘disk’ CheckCommand: Exclude ‘cgroup’ and ‘tracefs’ by default
  • Feature 11748 (Cluster): Remove unused cluster commands
  • Feature 11765 (Cluster): Only activate HARunOnce objects once there’s a cluster connection
  • Feature 11768 (Documentation): Add the category to the generated changelog

Bugfixes

  • Bug 9989 (Configuration): Service apply without name possible
  • Bug 10426 (libicinga): Icinga crashes with a segfault on receiving a lot of check results for nonexisting hosts/services
  • Bug 10717 (Configuration): Comments and downtimes of deleted checkable objects are not deleted
  • Bug 11046 (Cluster): Icinga2 agent gets stuck after disconnect and won’t relay messages
  • Bug 11112 (Compat): Empty author/text attribute for comment/downtimes external commands causing crash
  • Bug 11147 (libicinga): “day -X” time specifications are parsed incorrectly
  • Bug 11158 (libicinga): Crash with empty ScheduledDowntime ‘ranges’ attribute
  • Bug 11374 (API): Icinga2 API: deleting service with cascade=1 does not delete dependant notification
  • Bug 11390 (Compat): Command pipe overloaded: Can’t send external Icinga command to the local command file
  • Bug 11396 (API): inconsistent API /v1/objects/* response for PUT requests
  • Bug 11589 (libicinga): notification sent out during flexible downtime
  • Bug 11645 (Documentation): Incorrect chapter headings for Object#to_string and Object#type
  • Bug 11646 (Configuration): Wrong log severity causes segfault
  • Bug 11686 (API): Icinga Crash with the workflow Create_Host-> Downtime for the Host -> Delete Downtime -> Remove Host
  • Bug 11711 (libicinga): Expired downtimes are not removed
  • Bug 11714 (libbase): Crash in UnameHelper
  • Bug 11742 (Documentation): Missing documentation for event commands w/ execution bridge
  • Bug 11757 (API): API: Missing error handling for invalid JSON request body
  • Bug 11767 (DB IDO): Ensure that program status updates are immediately updated in DB IDO
  • Bug 11779 (API): Incorrect variable names for joined fields in filters

Avoiding Common Pitfalls with Apply Rules

When building apply rules for your Icinga 2 configuration there are a few common pitfalls you should avoid:
1. Using apply when you’re really just creating a single object
Rule-based configs are great at simplifying your config. However, there are times when you really just want to create a single object. One common anti-pattern I’ve come across is this:

apply Service "ntp" {
  ...
  assign where host.name == "ntp1.example.org"
}

Now, obviously this will work as intended, but there are two significant problems: writing a filter rule for a single host is unnecessarily complicated, and there is a significant performance penalty because this rule has to be evaluated for all of your hosts.
A much simpler way to achieve this is to just use a simple object declaration:

object Service "ntp" {
  host_name = "ntp1.example.org"
  ...
}

2. Using too many assign where rules
Apply rules are intended to be used to make your config more general by putting your hosts into certain classes (“all ntp servers”, “all database servers”, etc.) and then assigning services and notifications to each member of a certain class. However, for some reason people sometimes do this instead:

apply Service "web" {
  ...
  assign where host.name == "web1.example.org"
  assign where host.name == "web2.example.org"
  assign where host.name == "web3.example.org"
  ...
  assign where host.name == "web73.example.org"
}

The obvious problem here is that this is a maintenance nightmare, and as we’ve already learned, “assign where” rules aren’t exactly free in terms of performance.
Unlike in our first example, the solution isn’t to unroll this filter by creating an “object” definition for each of the hosts. Instead, you should use some of the great filtering capabilities that come with Icinga 2. Here’s a short list of just some of the filters that are available:
1. CIDR matching (using the cidr_match function)
2. Regular expressions (using the regex function)
3. Wildcard matches (using the match function)
4. and last but not least: custom variables
In this particular example I’m going to use wildcard matching and custom variables:

object Host "web1.example.org" { }
object Host "web2.example.org" { }
object Host "web3.example.org" { vars.no_web_check = true }
apply Service "web" {
  ...
  assign where match("web*.example.org", host.name)
  ignore where host.vars.no_web_check
}

This assumes that all of your “web” hosts should have a “web” service by default. Using “ignore where” we can make sure that certain hosts don’t get the service.
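The other two functions from the list above follow the same pattern. Here is a rough sketch of how they might look; the service names, the subnet and the host naming scheme are made up for illustration:

// Hypothetical: every host in this subnet gets the service
apply Service "ssh" {
  ...
  assign where cidr_match("192.168.2.0/24", host.address)
}
// Hypothetical: only hosts named web<number>.example.org get the service
apply Service "web-status" {
  ...
  assign where regex("^web[0-9]+[.]example[.]org$", host.name)
}

Like match, both functions take the pattern first and the value to test second.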
3. Reusing “assign where” filters
This is pretty much the opposite problem compared to our previous example. Instead of using the same filter expression dozens of times in the same apply rule this is about unnecessarily repeating the filter in multiple apply rules:

apply Service "mysql" {
  ...
  // All db hosts except those in the dev subnet
  assign where match("db*.example.org", host.name) && !cidr_match("172.16.23.0/24", host.address)
}
apply Service "postgresql" {
  ...
  // All db hosts except those in the dev subnet
  assign where match("db*.example.org", host.name) && !cidr_match("172.16.23.0/24", host.address)
}
apply Service "mssql" {
  ...
  // All db hosts except those in the dev subnet
  assign where match("db*.example.org", host.name) && !cidr_match("172.16.23.0/24", host.address)
}

Code reuse is a common best practice when it comes to writing software. This also applies to Icinga 2 and makes your config much more maintainable and pleasant to work with.
Here’s how you can re-use your filter expression in multiple apply rules:

globals.is_prod_database_host = function(host) {
  // All db hosts except those in the dev subnet
  return match("db*.example.org", host.name) && !cidr_match("172.16.23.0/24", host.address)
}
apply Service "mysql" {
  ...
  assign where is_prod_database_host(host)
}
apply Service "postgresql" {
  ...
  assign where is_prod_database_host(host)
}
apply Service "mssql" {
  ...
  assign where is_prod_database_host(host)
}

By using descriptive function names you also gain the advantage of making your code… er, config more readable.

Icinga 2 Script Debugger

We’ve made an effort to make error messages as user-friendly as possible, but there are still cases where additional information is necessary to diagnose a problem. If you’ve ever spent time figuring out where a configuration error comes from, you’re going to love this new feature in Icinga 2.4.
Icinga 2.4 introduces a script debugger which can be used to inspect the state of scripts. For my first example I’m going to use the script debugger to figure out the problem with the following config:

object Host "web1" {
  import "generic-host"
  address = "192.168.2.36"
  vars.my_vhosts = {
    "www.icinga.org" = "y"
  }
}
apply Service "vhost " for (vhost_name => vhost_config in host.vars.my_vhosts) {
  import "generic-service"
  check_command = "http"
  vars.http_timeout = 30
  vars += vhost_config
}

Let’s have a look at a typical debugger session:

The script debugger’s prompt can be used to print the value of variables and other expressions. As you can see, the vhost_config variable should be a dictionary, but in the host definition we have incorrectly set it to a string.
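One way to fix the config is to make each entry of my_vhosts a dictionary. This is only a sketch: it assumes the service uses the “http” CheckCommand from the ITL, and the http_vhost attribute is my own choice for illustration, not part of the original example:

object Host "web1" {
  import "generic-host"
  address = "192.168.2.36"
  vars.my_vhosts = {
    "www.icinga.org" = {
      // vhost_config is now a dictionary, so "vars += vhost_config" can merge it
      http_vhost = "www.icinga.org"
    }
  }
}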
My next example shows how to use breakpoints to inspect variables in a custom attribute function:

object Service "test" {
  import "generic-service"
  host_name = "web1"
  check_command = "dummy"
  check_interval = 15s
  vars.dummy_text = {{
    var text = "Hello from " + host.name
    debugger
    return text
  }}
}


I hope you’ve enjoyed this quick introduction to the script debugger. More detailed information for this feature can be found in the documentation.

Understanding commands in Icinga 2

Icinga 2 command definitions can seem daunting at first. This blog post provides a quick introduction to some of the concepts you need to be familiar with when writing your own command definitions.
In their most basic form command definitions need a command line:

object CheckCommand "my_http" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_http" ]
}

The “plugin-check-command” template tells Icinga how to execute commands, i.e. by executing an external plugin. There are a few other “*-check-command” templates but for virtually all of your own check commands you’ll need to use “plugin-check-command”.
The check_http plugin needs at least one more argument to work:

$ /opt/local/libexec/nagios/check_http -I 127.0.0.1
HTTP OK: HTTP/1.1 200 OK - 342 bytes in 0.001 second response time |time=0.001344s;;;0.000000 size=342B;;;0

We can add this argument to our check command like this:

object CheckCommand "my_http" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_http" ]
  arguments = {
    "-I" = {
      value = "$my_http_address$"
      description = "IP address or name."
      required = true
    }
  }
  vars.my_http_address = "$address$"
}

The ‘required’ option tells Icinga to verify that the user specified a value for this argument.
We’re prefixing our custom attributes (my_http_address) with the name of the CheckCommand. This allows us to override specific custom attributes for HTTP checks on a per-host or per-service basis. If all commands had the same custom attribute names (e.g. ‘timeout’) this wouldn’t be possible:

object Host "test" {
  ...
  // This affects all services on this host which use the my_http command
  vars.my_http_address = "127.0.0.1"
}
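
The same attribute can also be set on a single service instead of the whole host. Here is a small sketch; the service name and the address are hypothetical:

object Service "intranet-http" {
  host_name = "test"
  check_command = "my_http"
  // Only this service uses a different address. Other my_http
  // services on the host keep the value set on the host.
  vars.my_http_address = "192.168.2.36"
}

This works because custom attributes set on the service take precedence over those set on the host, which in turn take precedence over the defaults in the command.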

In our next step we’re going to add a few optional arguments. The check_http plugin lets us specify the ‘Host’ header and the URL that should be used. Adding optional arguments is rather simple:

    "-H" = {
      value = "$my_http_vhost$"
      description = "Host name argument for servers using host headers"
    }
    "-u" = {
      value = "$my_http_url$"
      description = "URL to GET or POST (default: /)"
    }

When Icinga encounters a command argument which uses an unresolvable macro (for example, because the user didn’t set a value for vars.my_http_vhost in their command, service or host) the entire argument is omitted.
Icinga can also add arguments only when certain conditions are met. In the next example I’m adding a new option ‘--sni’ which is only added when the custom attribute my_http_sni is set to true:

    "--sni" = {
      description = "Enable SSL/TLS hostname extension support (SNI)"
      set_if = "$my_http_sni$"
    }

Note that the ‘--sni’ option does not take an argument. Therefore we don’t need the ‘value’ attribute for this argument.
When the ‘my_http_sni’ custom attribute isn’t set at all Icinga defaults to not adding the argument.
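Putting the pieces from this post together, the complete command definition would look roughly like this. The service at the end is a hypothetical example of how the optional arguments might be used; its name and values are not from a real setup:

object CheckCommand "my_http" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_http" ]
  arguments = {
    "-I" = {
      value = "$my_http_address$"
      description = "IP address or name."
      required = true
    }
    "-H" = {
      value = "$my_http_vhost$"
      description = "Host name argument for servers using host headers"
    }
    "-u" = {
      value = "$my_http_url$"
      description = "URL to GET or POST (default: /)"
    }
    "--sni" = {
      description = "Enable SSL/TLS hostname extension support (SNI)"
      set_if = "$my_http_sni$"
    }
  }
  vars.my_http_address = "$address$"
}
// Hypothetical service which sets all of the optional attributes
object Service "icinga-website" {
  host_name = "test"
  check_command = "my_http"
  vars.my_http_vhost = "www.icinga.org"
  vars.my_http_url = "/"
  vars.my_http_sni = true
}

For this service Icinga passes -I, -H, -u and --sni to check_http. A service which sets none of the optional attributes only gets -I.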
There are a few more advanced topics for command arguments which aren’t covered in this blog post:

  • Ordering arguments
  • Using arrays for custom variables (with repeat_key/skip_key)
  • Using functions for set_if/value
  • Specifying an alternative ‘key’
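
To give a rough idea of what two of these look like, here is a sketch of an argument which takes an array and has a defined position. The custom attribute my_http_headers is hypothetical and the values are only for illustration:

    "-k" = {
      // Assumption: my_http_headers is an array of header strings set by the user
      value = "$my_http_headers$"
      description = "Additional HTTP headers"
      // Repeat "-k" for every element of the array
      repeat_key = true
      // Sort this argument after arguments with a lower order value
      order = 1
    }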

I might write another blog post at a later point in time which deals with those features. In the meantime these things are explained in the documentation.