We came across a faulty patch in 2.4.2 that attempted to fix the retry interval in the first NOT-OK soft state. It turned out to cause trouble inside HA clusters as well as with passive check results. The incorrect patch has been reverted; a proper fix is currently being investigated but is not yet included in this stable bugfix release.
This release also fixes problems with volatile, flapping and recovery notifications. We’ve tackled a problem with multiple directory levels on Windows in combination with the cluster config sync. Last but not least, the http and postgres check commands provided by the ITL got several new options.
When upgrading a cluster environment, please always ensure that your master and/or satellites are updated first, and then all clients. While we try to ensure that minor versions can coexist, some changes, for example in 2.4.2, required an update on both the master and the clients. Your best bet is to keep the versions the same on all nodes.
One thing we’ve noticed on the community support channels and in our bug tracker: Icinga 2 depends on the time being in sync, like any other distributed application. If you haven’t done so already, please enable NTP and monitor possible time differences using the check_ntp_time check plugin. Such differences may cause issues with check updates, and they should not be allowed to harm your monitoring environment.
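As a rough sketch of what such monitoring could look like, the apply rule below uses the ntp_time check command from the ITL, which wraps the check_ntp_time plugin. The NTP server name, the thresholds, the "generic-service" template and the assign condition are assumptions for illustration only and need to be adapted to your environment.

```
// Minimal sketch: compare each Linux host's clock against an NTP server.
// Assumes the ITL plugin commands are included and that a template named
// "generic-service" exists (as in the sample configuration).
apply Service "ntp_time" {
  import "generic-service"

  check_command = "ntp_time"

  // Hypothetical values: choose your own NTP server and thresholds.
  vars.ntp_address = "pool.ntp.org"
  vars.ntp_warning = 0.5   // offset in seconds that results in WARNING
  vars.ntp_critical = 1    // offset in seconds that results in CRITICAL

  // Assumed custom attribute; use whatever identifies your hosts.
  assign where host.vars.os == "Linux"
}
```

Running the configuration validation with ‘icinga2 daemon -C’ before reloading should show whether the rule matches the intended hosts.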
Updated packages for all distributions should be available soon. We’ve also updated the package repositories to support SLES 11 SP4 and SLES 12 SP1.
What’s New in Version 2.4.4
Feature
- Feature 10358: ITL: Allow to enforce specific SSL versions using the http check command
- Feature 11205: Add “query” option to check_postgres command.
Bugfixes
- Bug 9642: Flapping notifications are sent for hosts/services which are in a downtime
- Bug 9969: Problem notifications while Flapping is active
- Bug 10225: Host notification type is PROBLEM but should be RECOVERY
- Bug 10231: MkDirP not working on Windows
- Bug 10766: DB IDO: User notification type filters are incorrect
- Bug 10770: Status code 200 even if an object could not be deleted.
- Bug 10795: http check’s URI is really just Path
- Bug 10976: Explain how to join hosts/services for /v1/objects/comments
- Bug 11107: ITL: Missing documentation for nwc_health “mode” parameter
- Bug 11159: Common name in node wizard isn’t case sensitive
- Bug 11208: CMake does not find MySQL libraries on Windows
- Bug 11209: Wrong log message for trusted cert in node setup command
- Bug 11240: DEL_DOWNTIME_BY_HOST_NAME does not accept optional arguments
- Bug 11248: Active checks are executed even though passive results are submitted
- Bug 11257: Incorrect check interval when passive check results are used
- Bug 11273: Services status updated multiple times within check_interval even though no retry was triggered
- Bug 11289: epoll_ctl might cause oops on Ubuntu trusty
- Bug 11320: Volatile transitions from HARD NOT-OK->NOT-OK do not trigger notifications
- Bug 11328: Typo in API docs
- Bug 11331: Update build requirements for SLES 11 SP4
- Bug 11349: ‘icinga2 feature list’ fails when all features are disabled
- Bug 11350: Docs: Add API examples for creating services and check commands
- Bug 11352: Segmentation fault during ‘icinga2 daemon -C’
- Bug 11369: Chocolatey package is missing uninstall function
- Bug 11385: Update development docs to use ‘thread apply all bt full’