Icinga 2 API: Updates

Hi all,
at this point in development it’s about time to write something about the upcoming API for Icinga 2 v2.4. We’ve been busy in the past months designing, refining and planning the development of such an API. In order to give you an insight into what’s going on and what else to expect, please lean back and grab a coffee or two.
Hint: Follow Icinga on Twitter for faster updates 🙂 And make sure to join Icinga Camp Portland where we have talks and demos ready for you 🙂
 

Design

You might have seen it already and wondered why the cluster functionality contains the ApiListener configuration object including x509 connection handling. Generally speaking, the cluster API is an internal core interface, nothing we’d like to expose to users or programmatic scripts.
We’ve also been discussing whether to expose the existing JSON-RPC interface to users. While JSON-RPC is still cool, it would have been tremendously hard to provide client libraries and examples. In the end it would be yet another proprietary API protocol, and we certainly want something easy but flexible for our Icinga 2 API. Looking at existing APIs and recommendations made by community members (thanks Michael Medin for believing in that), we decided to go for a REST API after some mockups and use-case analysis.
In order to define our own url schema we’ve looked into other APIs such as DigitalOcean, Foreman, etc. and created concepts and to-dos accordingly.
 

Purpose

The main purpose of the Icinga 2 API is to provide a single, unified interface for querying status, managing configuration and triggering actions at runtime.

There’s a variety of existing tools and interfaces for which the API shall act as a replacement:

  • send_nsca: pass a check result to Icinga 2 via the actions interface
  • Livestatus: status queries and sending commands
  • External command pipe: send commands (without quirky local permission problems and/or SELinux)
  • SNMP traps: handlers can create/modify objects at runtime and send check results
  • Perfdata/OCSP commands: receive check results directly as an event stream
  • Inventory/auto-discovery: external applications create/modify objects at runtime (PuppetDB/Foreman, CMDB, AWS, etc.)

Target audience:

  • (web) applications fetching data and providing their own filters and restrictions
  • admins with root permissions querying the API on their own
  • scripts which pull/push data automatically (including command restrictions)

 

Main Requirements

  • RESTful url schema
  • Basic API framework including an HTTP server
  • ApiUser config object for authentication: Basic Auth or x509 client certificate name (a default will be created upon installation – see the sketch after this list)
  • Authorization and simple permissions (e.g. restrict a user to specific commands such as acknowledgements only)
  • HTTP handler to interpret and process requests (GET, POST, PUT, DELETE)
  • Url schema versioning, JSON as output, dashes in urls (no underscores)
  • Url parameters including object filters and column limiting
  • Dependency tracking for object deletion (services depend on hosts, etc.)
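
What the ApiUser object mentioned above could look like – a minimal sketch assuming the syntax stays close to existing Icinga 2 configuration objects (the permission string is illustrative, not a confirmed name):

object ApiUser "root" {
  password = "icinga"
  // restrict this user to acknowledging problems only (hypothetical permission name)
  permissions = [ "actions/acknowledge-problem" ]
}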

 

Configuration Management

The main idea behind it is to allow external applications to create configuration packages and stages based on configuration files and directory trees. This replaces any additional SSH connection and whatnot used to dump configuration files to Icinga 2 directly. When you push a new configuration stage to a package, Icinga 2 validates the configuration asynchronously and populates a status log which can be fetched in a separate request.
Example: Create the config package “puppet”:

$ curl -k -s -u root:icinga -X POST https://localhost:5665/v1/config/packages/puppet | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "package": "puppet",
            "status": "Created package."
        }
    ]
}

Push a new config stage to the package, adding a single config file (this one has an error in it for better demo cases):

$ curl -k -s -u root:icinga -X POST -d '{ "files": { "conf.d/test.conf": "object Host \"cfg-mgmt\" { chec_command = \"dummy\" }" } }' https://localhost:5665/v1/config/stages/puppet | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "package": "puppet",
            "stage": "imagine-1441133065-1",
            "status": "Created stage."
        }
    ]
}

If the configuration fails, the old active stage will remain active. If everything is successful, the new config stage is activated and live. Older stages will still be available in order to have some sort of revision system in place.
List all config packages, their active stage and other stages. That way you may iterate over all of them programmatically, e.g. for older revisions.

$ curl -k -s -u root:icinga -X GET https://localhost:5665/v1/config/packages | python -m json.tool
{
    "results": [
        {
            "active-stage": "",
            "name": "aws",
            "stages": []
        },
        {
            "active-stage": "",
            "name": "puppet",
            "stages": [
                "imagine-1441133065-1"
            ]
        }
    ]
}

Since we don’t have an active stage for “puppet” yet, there must have been an error. Fetch the “startup.log” file and check the config validation errors:

$ curl -k -s -u root:icinga -X GET https://localhost:5665/v1/config/files/puppet/imagine-1441133065-1/startup.log
...
critical/config: Error: Attribute 'chec_command' does not exist.
Location:
/var/lib/icinga2/api/packages/puppet/imagine-1441133065-1/conf.d/test.conf(1): object Host "cfg-mgmt" { chec_command = "dummy" }
                                                                                                       ^^^^^^^^^^^^^^^^^^^^^^
critical/config: 1 error

Apart from populating just the local configuration, the config file management interface also supports “zones.d” trees, which are automatically taken into account for the well-known cluster config sync.
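A hedged sketch mirroring the stage upload from above, this time shipping a file inside a “zones.d” tree (the “satellite” zone name is illustrative):

$ curl -k -s -u root:icinga -X POST -d '{ "files": { "zones.d/satellite/hosts.conf": "object Host \"remote-host\" { check_command = \"hostalive\" }" } }' https://localhost:5665/v1/config/stages/puppet | python -m json.tool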
This API feature is mainly required for the upcoming Icinga Web 2 Config Tool for Icinga 2.
 

Create Objects at Runtime

Objects can be created by sending a PUT request including all required object attributes. Icinga 2 will validate all objects and return detailed errors on failure.
Objects created by the API are persisted on disk. In the next development sprint we’ll also finish the cluster synchronization – new objects will automatically be synced amongst authorized cluster nodes, no manual configuration required.
Example: Create host “google.com” with object attributes. The required “check_command” attribute is hidden in the imported “generic-host” template.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X PUT \
-d '{ "templates": [ "generic-host" ], "attrs": { "address": "8.8.8.8", "vars.os" : "Linux" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Object was created."
        }
    ]
}

Creating new objects will trigger apply-rule evaluation automatically – host.address and host.vars.os will result in “ping4” and “ssh” services.
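For illustration, an apply rule in the static configuration which would generate the “ssh” service mentioned above – a sketch along the lines of the default conf.d examples (the template name is an assumption):

apply Service "ssh" {
  import "generic-service"
  check_command = "ssh"
  // assign the service to every host that has an address set and runs Linux
  assign where host.address && host.vars.os == "Linux"
}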
If the configuration validation fails, the new object will not be created and the response body contains a detailed error message. The following example omits the required check_command attribute.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X PUT \
-d '{ "attrs": { "address": "8.8.8.8", "vars.os" : "Linux" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 500.0,
            "errors": [
                "Error: Validation failed for object 'google.com' of type 'Host'; Attribute 'check_command': Attribute must not be empty."
            ],
            "status": "Object could not be created."
        }
    ]
}

 

Modify Objects at Runtime

In case you want to modify attributes at runtime, we’ve implemented a cool internal event handler system notifying external interfaces (DB IDO, cluster, etc.) on changes. You are not limited to specific attributes as known from Icinga 1.x – (nearly) everything can be modified. Changing a host’s address at runtime is not an issue, for example. All modified attributes are persisted on disk and will survive a restart. These modified attributes will result in object versions (to be implemented) throughout the cluster synchronization.
Example for existing object google.com:

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X POST \
-d '{ "attrs": { "address": "8.8.4.4", "vars.os" : "Windows" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Attributes updated.",
            "type": "Host"
        }
    ]
}

One thing to note – there’s also support for indexers, e.g. “vars.os” instead of declaring “vars” as a JSON dictionary; see the sketch below.
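A quick sketch contrasting both notations in the request body (either should yield the same result):

# indexer shorthand
-d '{ "attrs": { "vars.os": "Windows" } }'

# equivalent nested dictionary
-d '{ "attrs": { "vars": { "os": "Windows" } } }'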
Take a different example: Lower the “retry_interval” for all hosts in a Not-UP state:

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts?filter=host.state!=0' -X POST -d '{ "attrs": { "retry_interval": 30 } }' | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "host-oob",
            "status": "Attributes updated.",
            "type": "Host"
        },
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Attributes updated.",
            "type": "Host"
        }
    ]
}

 

Delete Objects at Runtime

In case of deleting objects, it’s a bit trickier: what happens if you delete a host object which has several services depending on it? In the past, the host would have been deleted and the services would remain in an inconsistent state. The solution sounds simple – track the object dependencies and only delete such dependency chains if the user explicitly says so (cascading delete). If not, the DELETE request will return an error. You may also only delete objects created by the API – that’s for safety reasons, preventing unwanted mixes of static configuration, config management and runtime config changes.
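For contrast, a plain DELETE without the cascade parameter should be rejected as long as services depend on the host (request sketch only – the exact error response may differ):

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' -X DELETE | python -m json.tool

With the cascade parameter set, the host and its dependent services are removed: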

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com?cascade=1' -X DELETE | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Object was deleted.",
            "type": "Host"
        }
    ]
}

Note: Apply Rules must be statically configured or passed through the config management API. Newly created objects will automatically trigger apply rule evaluation (e.g. host with address automatically gets the “ping4” check assigned if that apply rule is in place).
 

Status Queries

While Livestatus and DB IDO do not expose all object attributes, the Icinga 2 API allows you to fetch all object types and their runtime configuration and state attributes. Apart from accessing a single object you may also use the same filter expressions known from apply rules to fetch a filtered list of objects.
You can select specific attributes by adding them as url parameters using ?attrs=…. Multiple attributes must be added one by one, e.g. ?attrs=host.address&attrs=host.name.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com?attrs=host.name&attrs=host.address' -X GET | python -m json.tool
{
    "results": [
        {
            "attrs": {
                "host.address": "8.8.8.8",
                "host.name": "google.com"
            }
        }
    ]
}
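
The same filter expressions also work for object lists – a sketch fetching all services currently in a problem state, mirroring the filter syntax used earlier (output omitted):

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/services?filter=service.state!=0' -X GET | python -m json.tool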

 
Another cool thing – the check results also contain the executed command. That’s pretty helpful for testing your configuration. Or check the group membership of a host, modify the attributes at runtime, and retrieve their status again.
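For example, fetching the last check result of a single service should reveal the executed command – a sketch assuming the “host!service” naming scheme from the actions examples below and the last_check_result attribute name:

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/services/google.com!ssh?attrs=service.last_check_result' -X GET | python -m json.tool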
Hint: If you want to view JSON in your browser, look for extensions such as JSONView for Chrome.
Finishing this task is scheduled for the next few weeks; some details are still missing.
 

Actions

Actions provide well-known runtime commands for scheduling downtimes, acknowledging problems, adding comments, etc. By using the same filter expressions as found in the config language, you have lots of possibilities to trigger actions. Furthermore, all passed attributes are easily identified by their name. Forget about Icinga 1.x or Nagios using “SCHEDULE_HOST_DOWNTIME;host1;1110741500;1110748700;1;0;7200;foo;comment”!
Example: Reschedule check for host “google.com” using a filter.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/reschedule-check?type=Host&filter=host.name==%22google.com%22' -X POST | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Successfully rescheduled check for google.com."
        }
    ]
}

Example: Acknowledge all service problems at once.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/acknowledge-problem?type=Service&filter=service.state!=0' -d '{ "author": "michi", "comment": "Mega outage. Will take care." }' -X POST | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Successfully acknowledged problem for host-oob!service-oob"
        },
...
        {
            "code": 200.0,
            "status": "Successfully acknowledged problem for google.com!ssh"
        }
    ]
}

 
One more: Schedule a downtime for all hosts having the custom attribute “vars.os” set to “Linux”, e.g. for a general Puppet run rebooting the boxes on kernel updates.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/schedule-downtime?type=Host&filter=host.vars.os==%22Linux%22' -d '{ "author" : "michi", "comment": "Maintenance.", "start_time": 1441136260, "end_time": 1441137260, "duration": 1000 }' -X POST | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "downtime_id": "imagine-1441136548-1",
            "legacy_id": 11.0,
            "status": "Successfully scheduled downtime with id 11 for object google.com."
        },
...
        {
            "code": 200.0,
            "downtime_id": "imagine-1441136548-12",
            "legacy_id": 22.0,
            "status": "Successfully scheduled downtime with id 22 for object imagine.Speedport_W_921V_1_36_0009."
        }
    ]
}

Event Streams

Clients can register to listen on event streams and filter the events, e.g. to only receive not-ok states. The following example is from our concept phase to give you an idea:
Request:

$ curl -k -s -u root:icinga -X POST 'https://localhost:5665/v1/events?queue=michi&types=CheckResult&filter=event.check_result.exit_status==2'
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421319.7226390839,"type":"CheckResult"}
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421324.7226390839,"type":"CheckResult"}
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421329.7226390839,"type":"CheckResult"}

Note: This is not implemented yet. Development sprint is scheduled for CW42.
Btw – OCHP and OCSP commands should be fairly replaceable by event stream clients forwarding all events to your umbrella monitoring system.
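Once implemented, replacing an OCSP command could be as simple as a shell loop forwarding each event line – a rough sketch against a hypothetical umbrella endpoint:

$ curl -k -s -u root:icinga -X POST 'https://localhost:5665/v1/events?queue=forwarder&types=CheckResult' |
  while read -r event; do
    # forward each JSON event line to the umbrella system (hypothetical endpoint)
    curl -k -s -X POST -d "$event" 'https://umbrella.example.com/ingest'
  done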
 

Reflection

List all url endpoints (objects, types, attributes) including details. Take an example: the Icinga 2 types follow a hierarchical order: Host inherits from Checkable, which inherits from CustomVarObject, which inherits from ConfigObject, which inherits from Object. Using that information including all the object attributes you’ll get:

  • all object attributes
  • all object type prototypes (e.g. Object#clone)
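
A hedged sketch of what querying the reflection interface for the Host type could look like (the /v1/types endpoint name is an assumption based on the description above):

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/types/Host' -X GET | python -m json.tool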

 

(HTTP) Clients

Ok, there’s curl and alternatives on the shell. We’ll also work on the icinga2 console providing an HTTP client to directly connect to the Icinga 2 API.
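The invocation might look roughly like this once it ships – an assumption based on the existing icinga2 console command, not final syntax:

$ icinga2 console --connect 'https://root:icinga@localhost:5665/'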
But there’s yet another cool thing: Icinga Studio. It connects to the Icinga 2 API and provides a type hierarchy including all objects and their runtime configuration and state. It is built with wxWidgets, making it cross-platform (Linux, Windows, Mac OS X). We’ll prepare packages for it in the next weeks as well (only where wxWidgets is available). For now it helps with debugging and testing; at some later point we may consider lifting its read-only state to allow runtime modifications 🙂
 

Future

We’ve discussed, designed, re-evaluated and (pair) programmed quite a lot in the past weeks. Our goal is to have 2.4 ready right before OSMC later this year in November where you’ll get the whole package.
We’ll have the latest and greatest Icinga 2 API snapshot with us at Icinga Camp Portland right after PuppetConf – join us for live demos, talks, feedback & some G&T of course 🙂
In case you’re an addon developer, or want to start playing, our documentation is not complete yet, but will be frequently updated in the next weeks. Our Vagrant boxes use the latest and greatest snapshot packages too! 🙂
Cheers from the Icinga 2 Core development team,
Michael, Gunnar & Jean-Marcel

Icinga vs Nagios – a developer's comparison

It’s been nearly 2.5 years (906 days) full of enhancements and refreshing development. Many things happen(ed) in the background which are not visible to everyone. Especially when it comes to comparing Icinga with its predecessor Nagios, it’s always hard to show Icinga in its best light and avoid bias at the same time.
As you might be well aware, I lead the development of Icinga Core and its related sub-projects. So, I am very active in all spaces, seeking new ideas but also patches from the community – in many worlds, not only Icinga, but Nagios, Opsview, OP5, Shinken and many other projects with similar origins in Nagios.
Once in a while people ask for a different kind of comparison. Not a fancy feature comparison designed for managers, but one which takes lost patches from the Nagios developer lists, from Nagios Portal and other Nagios community sources into account and tells them exactly how Icinga is different – from a core developer’s point of view.
You may also be interested to hear another side to the story – patches actually developed in the Icinga space, which have been backported to Nagios. Because we want to give something back to the Nagios community, work side-by-side and share knowledge.
And to stay fair – you can view yet another table which lists work done by Nagios developers that we have ported into Icinga. On a personal note – it’s always a pleasure reworking patches from Andreas Ericsson. Learned a lot in the past year on actual core development 🙂
Last but not least, we are trying to add more noticeable configs and also configure options to make life easier for packagers. You are welcome to slip into that part of the project too – packagers (Ubuntu and Fedora especially) are always needed!
So here it is: the bug and feature comparison table is divided into the following sections:

  • Core
  • Classic UI
  • *DOUtils
  • Docs
  • Configure Options and Configs
  • Backported to Nagios
  • Ported from Nagios and variants

URLs to both the worlds of Nagios and Icinga have been added where available, so you may take a deeper look into the details…

Revisited: Icinga Classic UI

You might be wondering why Icinga has two web GUIs available for installation:

  • Icinga Classic UI (in icinga tarball)
  • Icinga Web (in icinga-web tarball)

Icinga Classic UI comprises the Icinga CGIs using the old data storage format, based on HTML and CGI, while the new Icinga Web introduces a shiny web 2.0 framework-based web interface using Icinga IDOUtils as its data source. The overall question would be – why focus on two GUIs?
The answer is rather simple – many members of the core team (and others) still have their existing setups: large environments using the Classic UI, where introducing something new isn’t always possible. Even more, tastes differ. And of course, Icinga Classic UI still provides a local fallback if (remote) Icinga Web causes trouble. Last but not least, there are Icinga addons using the CGIs, and we love to help developers using alternative methods to HTML parsers. This is why we are actively pushing development resources into Icinga Classic UI whilst working on Icinga Web, sharing fresh ideas amongst each other.
You might have followed the overall history of the Classic UI enhancements we already introduced: support for the display_name attribute, multiple command sending for hosts/services, CSV export for all CGIs, address6 / IPv6 support, and many more.
For Icinga 1.4 we had a bunch of long awaited and also newly introduced features on our roadmap:

  • Searching the Icinga logfile through the web interface, introducing new filters and rewritten code, enabling filtering on historical data / reporting for future development (#516)
  • Store cmd.cgi submissions in a log – initially implemented; if enabled via cgi.cfg it will record who sent which command (#1161)
  • Enforce a required comment for actions taken in cmd.cgi – can be used in combination with the new cmd.cgi logging (#610)
  • Add a config option to set the start of the week (Sunday/Monday) – for trends and reports (#1269)
  • Display host/service dependencies in the host/service details in extinfo.cgi – might come in handy (#1300)
  • Allow searching for host display_name normally and via regexp – completing display_name support (#1393)
  • Add JSON output to the CGIs – will come in handy for addon devs, like the CSV export (#1217)
  • Replace top.html with an alternative CGI-driven view – thanks to Matthew Brooks, Icinga 1.4 will get a new top frame showing status information like Icinga Web does (#1406)

We hope you like the latest changes – stay tuned for Icinga 1.4 on 11.5.2011 including Icinga Classic UI 🙂
Update 6.5.2011: Matthew just provided an enhanced version of the status header (image below). It shows the counts of unacknowledged active/passive and acknowledged states, including a hover title. Even more, the background color will change. Let us know what you think 🙂
 



Icinga moves from GIT to CVS

We finally made our decision to step away from GIT and get back to the original SCM used by Nagios for a long time – CVS. It will enhance our current development capabilities and allow us to develop even faster, at the speed of light.
The first import is still in progress; you can follow it on the SourceForge cvsweb. Stay tuned …
 
Update 2.4.2011 02:25: We are failing to import GIT into CVS, so we’ll stay on GIT ;-))

Icinga 1.3.1 released

Thanks to everyone using 1.3.0 and pushing feedback onto our development tracker! While working on the upcoming 1.4 branch, we decided to backport all recent bugfixes into the 1.3 tree and release Icinga 1.3.1 🙂
Core & IDOUtils

  • fix flexible downtimes on service hard state changes not being triggered
  • fix display_name to survive reconfiguration and be used instead of host_name in the Classic UI
  • fix RDBMS reconnect after connection errors, add more hints for Oracle to ido2db.cfg

 
Classic UI

  • fix CSV export link to make it XSS safe (IE)
  • fix XSS vulnerability in statusmap.cgi
  • fix tooltips in status.cgi not showing messages with carriage returns
  • fix refresh on non-refreshable CGIs, don’t show pause/continue
  • fix segfaults if no default_user_name= is given in cgi.cfg

 
Web & API

  • fix Oracle full SID support in oci8
  • fix missing commands: remove acknowledgements
  • add missing principal creation on user import via external auth
  • fix schema updates not copied to icinga-web installation
  • fix icinga-web spec file does not perform %pre functions
  • fix wrong filter in openproblems, fixed statusmap in portal
  • fix crashes when switching tabs quickly

 
Download Icinga 1.3.1 from SourceForge – updated packages should be available soon for various distributions. Make sure you upgrade your IDOUtils DB!
Please report any bugs/feature requests to our development tracker and make sure you check out the Icinga Community Wiki for advanced topics =)