Replay Log in Distributed Icinga Environments

Dec 16, 2021

The replay log is an essential part of any distributed Icinga monitoring environment made up of master, satellite and agent nodes.

The replay log is a built-in mechanism that ensures nodes in a distributed setup keep the same history (e.g. check results, notifications and downtimes) when nodes are temporarily disconnected and then reconnect. The log_duration attribute of an Endpoint object tells the node how long to keep replay logs while that endpoint is disconnected; it defaults to one day.
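As an illustration, log_duration is set per Endpoint object. The sketch below spells out the default explicitly for a hypothetical endpoint called satellite-node (the name is an assumption made for this example). The value can be given in seconds or with a duration suffix such as 1d.

```
# Hypothetical endpoint, shown only to illustrate the attribute
object Endpoint "satellite-node" {
  # Keep replay logs for one day while this endpoint is disconnected
  # (this matches the default, so the line could also be omitted)
  log_duration = 1d
}
```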

Note that master or satellite nodes with many connected agents have to process a large volume of replay logs after the nodes are reloaded, for example following a configuration synchronization or deployment. During a reload, nodes temporarily lose their connections, which triggers the replay log mechanism. Once the nodes reconnect, the replay logs are processed almost simultaneously, which can cause heavy load.

However, this can be prevented by following the best practice for configuring agent nodes: run checks on the agent through command_endpoint, and set log_duration to 0 in the agent endpoint definitions, which disables the replay log for those endpoints. Note that this must be done on the master nodes, where the agent endpoints are configured, e.g.:

# /etc/icinga2/zones.conf of the master nodes

object Endpoint "master-node-a" {
}

object Endpoint "master-node-b" {
}

object Zone "master" {
  endpoints = [ "master-node-a", "master-node-b" ]
}

object Endpoint "agent-node" {
  # Disable replay log
  log_duration = 0
}

object Zone "agent-node" {
  endpoints = [ "agent-node" ]
  parent = "master"
}

object Zone "global-templates" {
  global = true
}

object Zone "director-global" {
  global = true
}
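The zone configuration above covers only the log_duration half of the recommendation. For the command_endpoint half, a minimal sketch could look like the following. The file path, the placeholder address, the custom variable agent_endpoint and the choice of the disk check are all assumptions made for illustration; the host name is assumed to match the name of the agent's Endpoint object.

```
# e.g. /etc/icinga2/zones.d/master/agent-node.conf (path is an assumption)

object Host "agent-node" {
  check_command = "hostalive"
  address = "192.0.2.10"  # placeholder address

  # Custom variable (hypothetical name) holding the Endpoint name;
  # "name" resolves to the object name, here "agent-node"
  vars.agent_endpoint = name
}

apply Service "disk" {
  check_command = "disk"

  # Scheduled by the master, executed on the agent endpoint
  command_endpoint = host.vars.agent_endpoint

  # Apply only to hosts that define an agent endpoint
  assign where host.vars.agent_endpoint
}
```

Keeping the endpoint name in a host custom variable lets a single apply rule cover every agent host, rather than hard-coding command_endpoint per service.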

If you use Icinga Director to manage zones and agents, no action is needed: log_duration = 0 is set automatically for agents that are command endpoints.
