Endpoint Monitoring with Icinga

Jul 30, 2025

Monitoring with Icinga primarily focuses on servers and infrastructure. But there are also the people operating these systems from their workstations and laptops. If a server can be accessed from a machine with an outdated operating system, the patch level of the server becomes irrelevant.

In computer security, the allegory of a chain and its weakest link is often used to describe the security of a system. This weakest link may be a person using a machine for various tasks, from systems administration to web browsing. Staying with this example, web browsers have become very complex pieces of technology, with security releases every other week. Ignoring these updates, or even running an end-of-life operating system, leaves known vulnerabilities open for attackers to exploit and gain access to internal infrastructure.

Stating the obvious, the more access a person has, the greater the risk. Therefore, a secure system is just as important for the CEO as it is for an intern. But especially for the CEO.

And this is where endpoint monitoring becomes relevant. Since definitions of the term conflict, I am narrowing it here to the monitoring of end-user workstations.

Endpoint monitoring is also often a requirement of an Information Security Management System (ISMS). There are multiple sections in ISO 27001 which can be interpreted in this regard, e.g., ISO 27001:2022 Annex A Control 5.7, Annex A Control 8.1, or Annex A Control 8.16.

Differences from Server Monitoring

Compared to monitoring a server, there are certain differences to be considered.

From a technical standpoint, there is little difference. A check command can be executed on either a server or a workstation. For example, check_apt reports missing updates on a Debian system regardless of whether it runs on a laptop or a server.

However, one distinction is reachability. While a server is usually online 24/7, an employee’s laptop may only be online from 9 to 5 and may not even always be connected to the Internet. Therefore, the monitoring must be configured to tolerate these expected periods of unavailability.

Another dimension is privacy, particularly if employees are permitted to use their devices outside of work. A server may process personal data, but for a workstation, even the reachability of the machine alone can be considered relevant under some data protection laws, since it can reveal when a person is working. Thus, a Data Protection Impact Assessment (DPIA) should be performed for each check.

Workstations as Icinga Agents

An obvious starting point would be to configure the workstation as an Icinga 2 Host. After installing and configuring the Icinga 2 agent on this machine, it can connect to an Icinga 2 master to receive its monitoring configuration.

However, as mentioned above, the employee’s machine is most likely not available all the time. This is where TimePeriods step in. Under the simplified assumption that everyone works exactly from 9 to 5, the following TimePeriod can be defined.

object TimePeriod "workhours" {
  display_name = "Icinga 2 8x5 TimePeriod"
  ranges = {
    "monday"    = "09:00-17:00"
    "tuesday"   = "09:00-17:00"
    "wednesday" = "09:00-17:00"
    "thursday"  = "09:00-17:00"
    "friday"    = "09:00-17:00"
  }
}

This can be used in different places. For example, a Notification object has a period attribute, restricting notifications to this time period. The Icinga 2 master would still register many failed check attempts, but would not send notifications for them outside of work hours.
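
As a minimal sketch of the Notification variant, an apply rule could reference the TimePeriod by name. The custom variable host.vars.workstation is a hypothetical marker for workstation hosts, and the mail-host-notification command and the icingaadmin user are assumed to exist as in Icinga 2's sample configuration.

```icinga2
apply Notification "workstation-mail" to Host {
  // Assumption: NotificationCommand and User as shipped in the sample config.
  command = "mail-host-notification"
  users = [ "icingaadmin" ]

  // Only send notifications within the "workhours" TimePeriod defined above.
  period = "workhours"

  // Hypothetical custom variable marking workstation hosts.
  assign where host.vars.workstation == true
}
```

Checks still run around the clock with this rule, so the state history keeps showing the machine as unreachable at night. Only the notifications are suppressed.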

Another approach would be to set the check_period attribute for the Host and Service objects. In this case, Icinga 2 would not perform any checks outside of the defined TimePeriod.
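
The check_period variant might look like the following sketch. The host name laptop-bob, its address from the documentation range, and the vars.workstation marker are placeholders for illustration, not taken from a real setup.

```icinga2
object Host "laptop-bob" {
  check_command = "hostalive"
  address = "192.0.2.10"  // placeholder address

  // No active host checks outside of the "workhours" TimePeriod.
  check_period = "workhours"

  // Hypothetical marker used by the apply rule below.
  vars.workstation = true
}

apply Service "ping4" {
  check_command = "ping4"

  // The same restriction has to be set on each Service.
  check_period = "workhours"

  assign where host.vars.workstation == true
}
```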

However, this would still enforce strict times and would not allow flexible working hours. For more leeway, there are flexible downtimes, which only begin once a problem occurs within a configured time window. One such downtime could start around the end of the workday and last through the night.
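
A flexible downtime can be scheduled recurrently with a ScheduledDowntime object. The following is only a sketch: the 16:00-20:00 trigger window, the 16 hour duration, the author name and the vars.workstation marker are assumptions chosen for illustration.

```icinga2
apply ScheduledDowntime "overnight" to Host {
  author = "icingaadmin"
  comment = "Workstation is expected to be offline overnight"

  // Flexible instead of fixed: the downtime is only triggered if the host
  // enters a problem state within one of the ranges below. Once triggered,
  // it lasts for the given duration.
  fixed = false
  duration = 16h

  ranges = {
    "monday"    = "16:00-20:00"
    "tuesday"   = "16:00-20:00"
    "wednesday" = "16:00-20:00"
    "thursday"  = "16:00-20:00"
    "friday"    = "16:00-20:00"
  }

  // Hypothetical custom variable marking workstation hosts.
  assign where host.vars.workstation == true
}
```

With these values, a laptop shut down at 18:30 would be in downtime until 10:30 the next morning, while a laptop that stays up all evening never triggers the downtime at all.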

Another possibility would be to define an EventCommand for the workstation’s Host object to automatically set an acknowledgement if the machine goes down. Since acknowledgements are automatically removed when the Host or Service recovers, this would allow even more flexibility. A more advanced way would be to let the workstation itself set an acknowledgement while shutting down, using the acknowledge-problem action of the Icinga 2 API.
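
The EventCommand variant could be wired up roughly as follows. The script ack-workstation-down.sh is hypothetical and not shown here. It is assumed to call the acknowledge-problem API action for the given host, and only when the reported state is DOWN, since event commands are also executed on recovery.

```icinga2
object EventCommand "ack-workstation-down" {
  // Hypothetical script that would have to be written and deployed separately.
  command = [ ConfigDir + "/scripts/ack-workstation-down.sh" ]

  // Runtime macros are passed on so the script can decide what to do.
  arguments = {
    "--host" = "$host.name$"
    "--state" = "$host.state$"
    "--state-type" = "$host.state_type$"
  }
}

template Host "workstation-auto-ack" {
  // Import this template into each workstation Host object.
  event_command = "ack-workstation-down"
}
```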

Of course, these options can be combined. Set the Host and Service check_period to a TimePeriod as shown above and then either add a scheduled downtime or trigger an acknowledgement. The multitude of possible options highlights Icinga 2’s flexibility as a platform. There are usually multiple ways to address corner cases such as this one.

Passive Checks from the Workstation

But it is not even necessary to install an Icinga 2 agent on each workstation. Especially if only a few checks need to be executed on a daily basis, the agent might come off as excessive.

In this case, external passive check results can be used, triggered by a simple, transparent shell script on each employee’s machine. This reverses the usual order: the monitored machine delivers its check results to the Icinga 2 API with a single HTTPS request per check.

On the Icinga 2 master, an ApiUser is required first. This user needs permission to process check results for its own Host and that Host’s Services, which receive the passive check results. The Services use the dummy check command and have its result overridden by every passive check result that arrives. If nothing is received for, let’s say, seven days, the dummy check runs and the Service goes UNKNOWN.

For example, create an ApiUser for Alice, with her machine as a Host and Services for updates and backups. The logic behind these checks stays on Alice’s machine; for Icinga 2, they are just named dummy checks.

object ApiUser "passive-alice" {
  password = "insecure"
  permissions = [
    {
      permission = "actions/process-check-result"
      filter = {{ host.name == "laptop-alice" }}
    }
  ]
}

object Host "laptop-alice" {
  check_command = "dummy"
  check_interval = 1d

  vars += {
    "dummy_state" = 0
    "dummy_text" = "Workstation, receiving passive check results only"

    "passive_checks" = [
      "backups",
      "updates",
    ]
  }
}

apply Service for (name in host.vars.passive_checks) {
  name = name

  check_command = "dummy"
  check_interval = 7d
  max_check_attempts = 1

  vars.dummy_state = 3
  vars.dummy_text = {{
    var service = get_service(macro("$host.name$"), macro("$service.name$"))
    var lastCheck = DateTime(service.last_check).to_string()

    return "No check results received. Last result time: " + lastCheck
  }}
}

The following script can be installed on Alice’s machine and triggered by either a cron job or a systemd timer.

#!/bin/sh

# This script should run on POSIX sh, but requires curl and jq as dependencies.

# Icinga 2 API URL.
ICINGA2_API="https://example.com:5665/"
# Path to the Icinga 2 API's CA certificate to verify the connection. Other curl
# options are also possible, e.g., --insecure if you think this is a good idea.
ICINGA2_API_CURL_OPTS="--cacert ca.crt"

# The defined ApiUser and its password. If using client certificates, add them
# to ICINGA2_API_CURL_OPTS accordingly.
ICINGA2_API_USER="passive-alice"
ICINGA2_API_PASS="insecure"

# Host object for Service check results.
ICINGA2_HOST="laptop-alice"

set -eu

# process executes the given check and sends its output to the Icinga 2 API.
#
# Usage: process SERVICE_NAME CHECK_CMD [CHECK_CMD_ARGS...]
process() {
  service="$1"
  shift

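  echo "$0 processing $ICINGA2_HOST!$service: $*"

  # Run the check with errexit disabled to capture a non-zero exit status.
  set +e
  plugin_output="$("$@")"
  exit_status="$?"
  set -e

  # Split the output at the first '|'. The -s flag makes cut skip lines without
  # a '|', so checks without performance data yield an empty string.
  performance_data="$(printf '%s\n' "$plugin_output" | cut -s -d '|' -f 2-)"
  plugin_output="$(printf '%s\n' "$plugin_output" | cut -d '|' -f 1)"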

  payload_filter="host.name==\"$ICINGA2_HOST\" && service.name==\"$service\""
  payload="$(jq -n \
    --arg filter "$payload_filter" \
    --arg exit_status "$exit_status" \
    --arg plugin_output "$plugin_output" \
    --arg performance_data "$performance_data" \
    '{
      "type": "Service",
      "filter": $filter,
      "exit_status": $exit_status,
      "plugin_output": $plugin_output,
      "performance_data": $performance_data,
      "pretty": true
     }')"

  curl \
    -s -S \
    -u "${ICINGA2_API_USER}:${ICINGA2_API_PASS}" \
    -H 'Accept: application/json' -X POST $ICINGA2_API_CURL_OPTS \
    "${ICINGA2_API}v1/actions/process-check-result" \
    -d "$payload"
}

process "backups" /usr/lib64/nagios/plugins/borgbackup
process "updates" /usr/lib/nagios/plugins/check_apt

Assuming that this script runs daily, fresh backup and update check results are always being submitted. As the Services on the Icinga 2 Host only state what they are for and not how they are performed, Alice could choose which tool to use – as long as it complies with corporate policy.

In the code above, the borgbackup check from Linuxfabrik and check_apt from the Monitoring Plugins are used, covering BorgBackup backups and updates on a Debian system. Of course, other checks for the same purpose are possible as well. Consider the following two examples, one for ZFS snapshots on Alpine Linux and one for restic on OpenBSD.

# For ZFS snapshots on Alpine Linux
process "backups" /usr/lib/nagios/plugins/check_zfs_snapshot \
  -w 345600 -c 604800 # 4 days, 7 days
process "updates" /usr/lib/nagios/plugins/check_apk

# For restic on OpenBSD
process "backups" /usr/local/libexec/nagios/check_xs restic \
  -restic-bin /usr/local/sbin/restic-fullbackup -restic-args s3-office \
  -snapshot-age-crit 168h -snapshot-age-warn 96h # 7 days, 4 days
process "updates" /usr/local/libexec/nagios/check_xs openbsd_updates

To scale this for multiple employees, reuse the Icinga 2 DSL configuration from above by putting a for loop around the ApiUser and the Host. In this case, using client certificates would simplify things a lot, since the password can be replaced by a non-confidential client_cn.

Outlook

Including workstations in Icinga extends the monitoring coverage and reveals potential problems before they become real ones. Since this type of monitoring involves both humans and machines, technological and interpersonal adaptations are necessary. On a closing note, try to address updates and security issues as a team effort, not as a way of scapegoating.
