Analyse Icinga 2 problems using the console & API

Lately we’ve been investigating on a problem with the check scheduler. This resulted in check results being late and wasn’t easy to tackle – whether it’ll be the check scheduler, cluster messages or anything else. We’ve been analysing customer partner environments quite in deep and learned a lot ourselves which we like to share with you here.
One thing you can normally do is to grep the debug log and analyse the problem in deep. It is also possible to query the Icinga 2 API fetching interesting object attributes such as “last_check”, “next_check” or even “last_check_result”.
But what if you want to calculate things for better analysis e.g. fetch the number of all services in a HA cluster node where the check results are late? (more…)