These days many setups have a lot of redundancy, and you may not want to send notifications during the night just because one of several HTTP servers has a problem. This blog post shows how to set up a single service whose state combines the states of multiple other services.
Preparation
Before we begin, we need a few services whose states we can combine. For this example we simply create 20 hosts and apply a service to them. Note the http_cluster variable: it is used later to match these services to the combined service.
for (id in range(20)) {
  object Host "http-host-" + id {
    check_command = "dummy"
    vars.dummy_state = 0
    vars.check_http = true

    // Needed for our combined service
    vars.http_cluster = "http-cluster-1"
  }
}

apply Service "http" {
  check_command = "random"
  assign where host.vars.check_http
}
We will also need a dummy host to assign our combined service to.
object Host "combined-host" {
  check_command = "dummy"
  vars.dummy_state = 0
}
Functions
Before we begin writing our service, we will also need to create some helper functions. stateToString() can convert our state integers (0-3) into the correct name of the state (OK, WARNING, CRITICAL and UNKNOWN).
function stateToString(state) {
  if (state == 0) {
    return "OK"
  } else if (state == 1) {
    return "WARNING"
  } else if (state == 2) {
    return "CRITICAL"
  } else if (state == 3) {
    return "UNKNOWN"
  }
}
Our second function, getServiceStatesByHttpCluster(), collects every service that belongs to our HTTP cluster (matched via the host's vars.http_cluster) and returns the total number of services together with the service names grouped by their state.
function getServiceStatesByHttpCluster(cluster) {
  // Prepare a dictionary for counting every service in our cluster and sorting them by state
  var services = {
    count = 0
    serviceStates = {
      "0" = [],
      "1" = [],
      "2" = [],
      "3" = [],
    }
  }

  // Iterate over every service object
  for (var service in (get_objects(Service))) {
    // Check if the http_cluster of the service's host matches our cluster
    if (service.host.vars.http_cluster == cluster) {
      // Increase our service count by one
      services.count += 1

      // Get the service's current state
      var state = service.last_check_result.state

      // Add the full service name ("host!service") to the corresponding array in services.serviceStates
      services.serviceStates[state].add(service.host.name + "!" + service.name)
    }
  }

  // Return our "services" dictionary
  return services
}
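One edge case worth keeping in mind: a service that has not been checked yet has no check result, so reading last_check_result.state inside the loop may fail for it. The following is a minimal sketch of a guard, assuming you want such services counted as UNKNOWN. The helper name getStateOrUnknown is hypothetical and not part of the original example.

```
// Hypothetical helper (not part of the original example):
// returns the state of the last check result, or 3 (UNKNOWN)
// if the service has not produced a check result yet.
function getStateOrUnknown(service) {
  var checkResult = service.last_check_result

  if (checkResult == null) {
    return 3
  }

  return checkResult.state
}
```

Inside the loop, var state = getStateOrUnknown(service) would then take the place of the direct attribute access.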
Service Object
With our helper functions and the objects to calculate our state from in place, we can finally create our service. Here we’re using the internal check command dummy and its variables dummy_state and dummy_text. This allows us to assign functions that are evaluated on every execution of our check.
object Service "combined-http" {
  check_command = "dummy"
  check_interval = 1m
  retry_interval = 30s
  host_name = "combined-host"

  // The http cluster we want to have combined states from (we've also set this variable on our hosts)
  vars.http_cluster = "http-cluster-1"

  // The minimum ratio of services that have to be in state OK (0.5 means at least 50% need to be OK)
  vars.ok_min_ratio = 0.5

  // Store our current service object in a variable to use it in function scope below
  var service = this

  // Functions stored in the variables dummy_state and dummy_text are evaluated on every execution of the check.
  vars.dummy_state = function() use (service) {
    // Get our services dictionary by calling our previously defined function
    var services = getServiceStatesByHttpCluster(service.vars.http_cluster)

    // Calculate the ratio of services with state OK compared to the total amount of services
    var ratio = services.serviceStates[0].len() / services.count

    // If the ratio is less than what we defined as our minimum, return CRITICAL as state, OK otherwise
    if (ratio < service.vars.ok_min_ratio) {
      return 2
    } else {
      return 0
    }
  }

  vars.dummy_text = function() use (service) {
    // Get our services dictionary by calling our previously defined function
    var services = getServiceStatesByHttpCluster(service.vars.http_cluster)

    // Calculate the ratio of services with state OK compared to the total amount of services
    var ratio = services.serviceStates[0].len() / services.count

    // Define an empty string variable which will later contain our status output
    var text = ""

    // If the ratio is less than what we defined as our minimum, add "CRITICAL: " to our output, "OK: " otherwise
    if (ratio < service.vars.ok_min_ratio) {
      text = "CRITICAL: "
    } else {
      text = "OK: "
    }

    // Add the amount of services in state OK and the total amount of services to our output (e.g. "5/20")
    text += services.serviceStates[0].len() + "/" + services.count + " OK\n"

    // Iterate over all state types (range(4) yields 0 through 3) and print the services with those states
    for (state in range(4)) {
      // Check if we even have services with this state
      if (services.serviceStates[state].len() > 0) {
        // Add the state name to our output
        text += stateToString(state) + ":\n"

        // Iterate over all the services with this state and output their name
        for (serviceName in services.serviceStates[state]) {
          text += serviceName + "\n"
        }

        text += "\n"
      }
    }

    // Return the final output
    return text
  }
}
Result
And finally, we have a service that combines the states of multiple services into one, and whose behavior can be tuned by changing the minimum ratio of services in state OK.
This is a basic example of how to combine multiple service states into one service, and it shows what’s possible with the Icinga config language. It can be extended by adding custom filters and conditions for when the service should be in a specific state.
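As one possible extension, the sketch below adds a WARNING level between OK and CRITICAL. The function name ratioToState and the custom variable warn_min_ratio are assumptions made for illustration; they do not appear in the example above.

```
// Hypothetical helper: map the ratio of OK services to a state using two thresholds.
// okMinRatio   - at or above this ratio the result is OK (0)
// warnMinRatio - at or above this ratio, but below okMinRatio, the result is WARNING (1)
// Anything lower is CRITICAL (2).
function ratioToState(ratio, okMinRatio, warnMinRatio) {
  if (ratio >= okMinRatio) {
    return 0
  } else if (ratio >= warnMinRatio) {
    return 1
  } else {
    return 2
  }
}
```

With vars.ok_min_ratio = 0.8 and vars.warn_min_ratio = 0.5 set on the combined service, the function stored in vars.dummy_state could end with return ratioToState(ratio, service.vars.ok_min_ratio, service.vars.warn_min_ratio). If 12 of 20 services are OK, the ratio is 0.6 and the combined service would be WARNING.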
The approach shown here is ideal for users who configure Icinga 2 through config files. If that’s something you generally don’t want to do, take a look at the Icinga Web 2 Business Process Module. It provides similar options, plus a dashboard and a graphical interface to help you configure your dependencies.