Bernd’s Interview on Icinga at Floss Weekly

Last Wednesday Bernd gave an interview on Icinga to Randal L. Schwartz from Floss Weekly, a podcast about free and open source software. Randal has been the show’s lead host since 2010, interviewing influential open source experts from all over the world every week. In the interview, Bernd explained the main differences between Nagios and Icinga, the advantages Icinga 2 offers over both, and its API. Randal and his co-host Guillermo Amaral asked questions about monitoring with Icinga 2 in general, integration with other tools and future prospects.
If you want to know more about it, check out the interview!

Icinga 2 API: Updates

Hi all,

at this point in development it is finally possible to write something about the upcoming API for Icinga 2 v2.4. We’ve been busy over the past months designing, refining and planning the development of such an API. In order to give you an insight into what’s going on and what else to expect, please lean back and grab a coffee or two.

Hint: Follow Icinga on Twitter for faster updates :-) And make sure to join Icinga Camp Portland where we have talks and demos ready for you :)

 

Design

You might have seen it already and wondered why the cluster functionality contains the ApiListener configuration object including x509 connection handling. Generally speaking, the cluster API is an internal core interface, nothing we’d like to expose to users or programmatic scripts.

We’ve also been discussing whether to use the existing JSON-RPC interface and expose that to users. While JSON-RPC is still cool, it would have been tremendously hard to add client libraries and examples. In the end it would be yet another proprietary API protocol, and we certainly want something easy but flexible for our Icinga 2 API. Looking at existing APIs and recommendations made by community members (thanks Michael Medin for believing in that) we decided to go for a REST API after some mockups and use-case analysis.

In order to define our own url schema we’ve looked into other APIs such as DigitalOcean, Foreman, etc. and created concepts and to-dos for our very own schema.

 

Purpose

The main purpose of the Icinga 2 API is to act as a replacement for a variety of existing tools and interfaces:

  • send_nsca: pass a checkresult to Icinga 2 via actions interface
  • Livestatus: status queries and sending commands
  • External command pipe: Send commands (without quirky local permission problems and/or SELinux)
  • SNMP Traps: handlers can create/modify objects at runtime and send check results
  • Perfdata/OCSP commands: receive check results directly as an event stream
  • Inventory/Auto-Discovery: external applications create/modify objects at runtime (PuppetDB/Foreman, CMDB, AWS, etc)

Target audience:

  • (web) applications fetching data and providing their own filters and restrictions
  • admins with root permissions querying the api on their own
  • scripts which pull/push data automatically (including command restrictions)

 

Main Requirements

  • RESTful url schema
  • Basic API framework including an HTTP server
  • ApiUser config object for authentication: Basic Auth or x509 client certificate name (default will be created upon installation)
  • Authorization and simple permissions (e.g. restrict a user to specific commands, such as acknowledgements only)
  • HTTP handler to interpret and process requests (GET, POST, PUT, DELETE)
  • Url schema versioning, JSON as output, dashes in urls (no underscores)
  • Url parameters including object filters and column limiting
  • Dependency tracking for object deletion (services depend on hosts, etc.)
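To illustrate the url schema requirements above, here is a small sketch in Python. The helper is hypothetical (not part of Icinga 2): it shows the conventions of a versioned path prefix, dashes instead of underscores, and filters passed as url parameters.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical helper illustrating the url conventions listed above:
# a /v1 version prefix, dashes in urls (no underscores), and filter
# expressions passed as percent-encoded url parameters.
def build_url(base, path, **params):
    # convert underscores to dashes per the url schema convention
    path = "/v1/" + path.replace("_", "-")
    url = urljoin(base, path)
    if params:
        url += "?" + urlencode(params)
    return url

url = build_url("https://localhost:5665", "actions/reschedule_check",
                filter='host.name=="google.com"')
print(url)
# -> https://localhost:5665/v1/actions/reschedule-check?filter=host.name%3D%3D%22google.com%22
```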

 

Configuration Management

The main idea behind it is to allow external applications to create configuration packages and stages based on configuration files and directory trees. This replaces any additional SSH connection and whatnot to dump configuration files to Icinga 2 directly. In case you’re pushing a new configuration stage to a package, Icinga 2 will validate the configuration asynchronously and populate a status log which can be fetched in a separate request.

Example: Create the config package “puppet”:

$ curl -k -s -u root:icinga -X POST https://localhost:5665/v1/config/packages/puppet | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "package": "puppet",
            "status": "Created package."
        }
    ]
}

Add a new config file to the stage (this one has an error in it for better demo cases):

$ curl -k -s -u root:icinga -X POST -d '{ "files": { "conf.d/test.conf": "object Host \"cfg-mgmt\" { chec_command = \"dummy\" }" } }' https://localhost:5665/v1/config/stages/puppet | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "package": "puppet",
            "stage": "imagine-1441133065-1",
            "status": "Created stage."
        }
    ]
}

If the configuration fails, the old active stage will remain active. If everything is successful, the new config stage is activated and live. Older stages will still be available in order to have some sort of revision system in place.

List all config packages, their active stage and other stages. That way you can iterate over all of them programmatically, e.g. for older revisions.

$ curl -k -s -u root:icinga -X GET https://localhost:5665/v1/config/packages | python -m json.tool
{
    "results": [
        {
            "active-stage": "",
            "name": "aws",
            "stages": []
        },
        {
            "active-stage": "",
            "name": "puppet",
            "stages": [
                "imagine-1441133065-1"
            ]
        }
    ]
}
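Iterating over that response programmatically could look like the following sketch, which uses the sample response from above to spot packages that have stages but no active one (a hint that validation failed):

```python
import json

# Sample response from GET /v1/config/packages (taken from above).
response = json.loads("""
{
    "results": [
        { "active-stage": "", "name": "aws", "stages": [] },
        { "active-stage": "", "name": "puppet",
          "stages": [ "imagine-1441133065-1" ] }
    ]
}
""")

# A package that has stages but no active stage likely failed validation.
failed = [p["name"] for p in response["results"]
          if p["stages"] and not p["active-stage"]]
print(failed)  # -> ['puppet']
```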

Since we don’t have an active stage for “puppet” yet, there must have been an error. Fetch the “startup.log” file and check it for config validation errors:

$ curl -k -s -u root:icinga -X GET https://localhost:5665/v1/config/files/puppet/imagine-1441133065-1/startup.log
...

critical/config: Error: Attribute 'chec_command' does not exist.
Location:
/var/lib/icinga2/api/packages/puppet/imagine-1441133065-1/conf.d/test.conf(1): object Host "cfg-mgmt" { chec_command = "dummy" }
                                                                                                       ^^^^^^^^^^^^^^^^^^^^^^

critical/config: 1 error

Apart from populating just the local configuration, the config file management interface also supports “zones.d” trees which will be taken into account for the well-known cluster config sync automatically.

This API feature is mainly required for the upcoming Icinga Web 2 Config Tool for Icinga 2.
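The whole workflow above (create a package, push a stage, fetch the status log) can be sketched as plain request descriptions. This is an illustrative Python snippet, not an official client: it only builds (method, url, body) tuples, leaving the actual HTTP transport out so the example stays self-contained.

```python
import json

# Hedged sketch of the config management workflow described above.
BASE = "https://localhost:5665/v1/config"

def create_package(name):
    return ("POST", f"{BASE}/packages/{name}", None)

def upload_stage(package, files):
    # files maps relative paths (e.g. "conf.d/test.conf") to file content
    return ("POST", f"{BASE}/stages/{package}", json.dumps({"files": files}))

def fetch_startup_log(package, stage):
    return ("GET", f"{BASE}/files/{package}/{stage}/startup.log", None)

steps = [
    create_package("puppet"),
    upload_stage("puppet", {"conf.d/test.conf":
                            'object Host "cfg-mgmt" { check_command = "dummy" }'}),
    fetch_startup_log("puppet", "imagine-1441133065-1"),
]
```

Each tuple could then be sent with curl or any HTTP library, authenticating as an ApiUser.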

 

Create Objects at Runtime

Objects can be created by sending a PUT request including all required object attributes. Icinga 2 will validate all objects and return detailed errors on failure.

Objects created by the API are persisted on disk. In the next development sprint we’ll also finish the cluster synchronization – new objects will automatically be synced amongst authorized cluster nodes, no manual configuration required.

Example: Create host “google.com” with object attributes. The required “check_command” attribute is hidden in the imported “generic-host” template.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X PUT \
-d '{ "templates": [ "generic-host" ], "attrs": { "address": "8.8.8.8", "vars.os" : "Linux" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Object was created."
        }
    ]
}

Creating new objects will trigger apply-rule evaluation automatically – host.address and host.vars.os will result in “ping4” and “ssh” services.

If the configuration validation fails, the new object will not be created and the response body contains a detailed error message. The following example omits the required check_command attribute.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X PUT \
-d '{ "attrs": { "address": "8.8.8.8", "vars.os" : "Linux" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 500.0,
            "errors": [
                "Error: Validation failed for object 'google.com' of type 'Host'; Attribute 'check_command': Attribute must not be empty."
            ],
            "status": "Object could not be created."
        }
    ]
}

 

Modify Objects at Runtime

In case you want to modify attributes at runtime, we’ve implemented a cool internal event handler system notifying external interfaces on changes (DB IDO, cluster, etc). You are not limited to specific attributes as known from Icinga 1.x, but can change (nearly) everything. Changing the host’s address at runtime is not an issue, for example. All modified attributes are persisted on disk and will survive a restart. These modified attributes will result in object versions (to be implemented) throughout the cluster synchronization.

Example for existing object google.com:

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com' \
-X POST \
-d '{ "attrs": { "address": "8.8.4.4", "vars.os" : "Windows" } }' \
| python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Attributes updated.",
            "type": "Host"
        }
    ]
}

One thing to note – there’s also support for indexers, e.g. “vars.os” as a key instead of declaring “vars” as a nested JSON dictionary.
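The indexer shorthand is essentially a flattened form of a nested dictionary. The following helper is purely illustrative (not part of Icinga 2) and shows the equivalence:

```python
# Illustrative helper: flatten a nested "vars" dictionary into the
# dotted indexer form, so { "vars": { "os": "Windows" } } becomes
# { "vars.os": "Windows" }.
def flatten(d, prefix=""):
    flat = {}
    for key, value in d.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, dotted + "."))
        else:
            flat[dotted] = value
    return flat

attrs = flatten({"address": "8.8.4.4", "vars": {"os": "Windows"}})
print(attrs)  # -> {'address': '8.8.4.4', 'vars.os': 'Windows'}
```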

Take a different example: Lower the “retry_interval” for all hosts in a Not-UP state:

curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts?filter=host.state!=0' -X POST -d '{ "attrs": { "retry_interval": 30 } }' | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "host-oob",
            "status": "Attributes updated.",
            "type": "Host"
        },
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Attributes updated.",
            "type": "Host"
        }
    ]
}

 

Delete Objects at Runtime

Deleting objects is a bit trickier: what happens if you delete a host object that has several services depending on it? In the past, the host would have been deleted and the services would have remained in an inconsistent state. The solution to that sounds simple – track the object dependencies and only delete such dependency chains if the user explicitly says so (cascading delete). If not, the DELETE request will return an error. You may also only delete objects created by the API – that’s for safety reasons, preventing unwanted mixes of static configuration, config management and runtime config changes.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com?cascade=1' -X DELETE | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "name": "google.com",
            "status": "Object was deleted.",
            "type": "Host"
        }
    ]
}

Note: Apply Rules must be statically configured or passed through the config management API. Newly created objects will automatically trigger apply rule evaluation (e.g. host with address automatically gets the “ping4” check assigned if that apply rule is in place).

 

Status Queries

While Livestatus and DB IDO do not expose all object attributes, the Icinga 2 API allows you to fetch all object types and their runtime configuration and state attributes. Apart from accessing a single object you may also use the same filter expressions known from apply rules to fetch a filtered list of objects.

You can select specific attributes by adding them as url parameters using ?attrs=…. Multiple attributes must be added one by one, e.g. ?attrs=host.address&attrs=host.name.

$ curl -u root:icinga -k -s 'https://localhost:5665/v1/objects/hosts/google.com?attrs=host.name&attrs=host.address' -X GET | python -m json.tool
{
    "results": [
        {
            "attrs": {
                "host.address": "8.8.8.8",
                "host.name": "google.com"
            }
        }
    ]
}
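The repeated ?attrs=… parameters used above can be generated from a list with standard url-encoding tools. A minimal Python sketch:

```python
from urllib.parse import urlencode

# Build the repeated ?attrs=... parameters from a list. doseq=True
# emits one key=value pair per list element, matching the API's
# "add attributes one by one" convention.
query = urlencode({"attrs": ["host.name", "host.address"]}, doseq=True)
print(query)  # -> attrs=host.name&attrs=host.address
```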

 

Another cool thing – the check results also contain the executed command. That’s pretty helpful for testing your configuration. Or check the group membership of a host, modify its attributes at runtime, and retrieve the status again.

(Screenshots: host status queries and a service check result)

Hint: If you want to view JSON in your browser, look for extensions such as JSONView for Chrome.

Finishing this task is scheduled for the coming weeks; some details are still missing.

 

Actions

Actions provide the well-known runtime commands for scheduling downtimes, acknowledging problems, adding comments, etc. By using the same filter expressions as found in the config language, you have lots of possibilities for triggering actions. Furthermore, all passed attributes are easily identified by their name. Forget about Icinga 1.x or Nagios using “SCHEDULE_HOST_DOWNTIME;host1;1110741500;1110748700;1;0;7200;foo;comment”!
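To see why named attributes matter, here is a small illustrative sketch mapping the positional legacy command string to named fields (field order as in the classic SCHEDULE_HOST_DOWNTIME external command):

```python
# Contrast sketch: the positional Icinga 1.x / Nagios command string
# vs. the named attributes the Icinga 2 API expects.
legacy = "SCHEDULE_HOST_DOWNTIME;host1;1110741500;1110748700;1;0;7200;foo;comment"
fields = ["host_name", "start_time", "end_time", "fixed",
          "trigger_id", "duration", "author", "comment"]
name, *values = legacy.split(";")
payload = dict(zip(fields, values))
print(payload["author"])  # -> foo
```

With the API, you simply send such a JSON payload instead of memorizing the field order.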

Example: Reschedule check for host “google.com” using a filter.

curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/reschedule-check?type=Host&filter=host.name==%22google.com%22' -X POST  | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Successfully rescheduled check for google.com."
        }
    ]
}

Example: Acknowledge all service problems at once.

curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/acknowledge-problem?type=Service&filter=service.state!=0' -d '{ "author": "michi", "comment": "Mega outage. Will take care." }' -X POST  | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "status": "Successfully acknowledged problem for host-oob!service-oob"
        },
...
        {
            "code": 200.0,
            "status": "Successfully acknowledged problem for google.com!ssh"
        }
    ]
}

 

One more: Schedule a downtime for all hosts having the custom attribute “vars.os” set to “Linux”, e.g. for a general Puppet run rebooting the boxes on kernel updates.

curl -u root:icinga -k -s 'https://localhost:5665/v1/actions/schedule-downtime?type=Host&filter=host.vars.os==%22Linux%22' -d '{ "author" : "michi", "comment": "Maintenance.", "start_time": 1441136260, "end_time": 1441137260, "duration": 1000 }' -X POST | python -m json.tool
{
    "results": [
        {
            "code": 200.0,
            "downtime_id": "imagine-1441136548-1",
            "legacy_id": 11.0,
            "status": "Successfully scheduled downtime with id 11 for object google.com."
        },

...

        {
            "code": 200.0,
            "downtime_id": "imagine-1441136548-12",
            "legacy_id": 22.0,
            "status": "Successfully scheduled downtime with id 22 for object imagine.Speedport_W_921V_1_36_0009."
        }
    ]
}

Event Streams

Register clients listening on event streams and filter those events, e.g. to only receive not-OK states. The following example is from our concept phase, to give you an idea:

Request:

$ curl -k -s -u root:icinga -X POST 'https://localhost:5665/v1/events?queue=michi&types=CheckResult&filter=event.check_result.exit_status==2'

{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421319.7226390839,"type":"CheckResult"}
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421324.7226390839,"type":"CheckResult"}
{"check_result":{ ... },"host":"www.icinga.org","service":"ping4","timestamp":1445421329.7226390839,"type":"CheckResult"}
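A consumer of such a stream would read one JSON-encoded event per line. This is a hedged concept sketch only (the stream format above is from the concept phase); it feeds in a sample line instead of a live HTTP connection so the example is self-contained:

```python
import json

# Concept sketch of an event stream consumer: each response line is
# one JSON-encoded event.
sample_stream = [
    '{"check_result":{},"host":"www.icinga.org","service":"ping4",'
    '"timestamp":1445421319.72,"type":"CheckResult"}',
]

def handle_events(lines):
    for line in lines:
        event = json.loads(line)
        if event["type"] == "CheckResult":
            yield (event["host"], event.get("service"), event["timestamp"])

events = list(handle_events(sample_stream))
print(events)  # -> [('www.icinga.org', 'ping4', 1445421319.72)]
```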

Note: This is not implemented yet. Development sprint is scheduled for CW42.

Btw – ochp and ocsp commands should be fairly easy to replace with event stream clients forwarding all events to your umbrella monitoring system.

 

Reflection

List all url endpoints (objects, types, attributes) including details. Take an example: the Icinga 2 types follow a hierarchical order: Host inherits from Checkable, which inherits from CustomVarObject, which inherits from ConfigObject, which inherits from Object. Using that information, including all the object attributes, you get:

  • all object attributes
  • all object type prototypes (e.g. Object#clone)

(Screenshots: reflection details for the Host, Dictionary and ApiUser types)

 

(HTTP) Clients

Ok, there’s curl and its alternatives on the shell. We’ll also work on the icinga2 console, providing an HTTP client to directly connect to the Icinga 2 API.

But there’s yet another cool thing: Icinga Studio. It connects to the Icinga 2 API and provides a type hierarchy including all objects and their runtime configuration and state. It is built with wxWidgets, making it cross-platform (Linux, Windows, Mac OS X). We’ll prepare packages in the next weeks for that as well (only where wxWidgets is available). For now it helps with debugging and testing; at some later point we may consider changing its read-only state to allow runtime modifications :-)

(Screenshots: Icinga Studio)

 

Future

We’ve discussed, designed, re-evaluated and (pair) programmed quite a lot in the past weeks. Our goal is to have 2.4 ready right before OSMC later this year in November where you’ll get the whole package.

We’ll have the latest and greatest Icinga 2 API snapshot with us at Icinga Camp Portland right after PuppetConf – join us for live demos, talks, feedback & some G&T of course :)

In case you’re an addon developer, or want to start playing with the API: our documentation is not complete yet, but it will be updated frequently in the next weeks. Our Vagrant boxes use the latest and greatest snapshot packages too! :-)

Cheers from the Icinga 2 Core development team,

Michael, Gunnar & Jean-Marcel

Farewell Icinga API

In the days leading up to the v1.5 release, we bid our Icinga API goodbye and usher in a new API and Web concept.

You may ask yourself, what was this API anyway? Indeed, if you weren’t developing or adapting extensions for the new web interface, you wouldn’t have had much contact with this important project component. When Icinga was conceived, one of the main missions was to facilitate the development of addons and plugins. The API provided a set of commonly used request operations, removing the need to write SQL queries and generally a lot of excess code.

All was well until we decided to offer some extra database flexibility. When we added support for Oracle and PostgreSQL on top of MySQL, we also gave our Icinga API team some extra work. With each change, bug fix or new feature, Marius, Michael L and I had to edit the queries for each database back-end separately. This process was not only complicated and error-prone, but also a sign that we needed a more flexible architecture.

As of Icinga 1.5, the external Icinga API will be replaced by an internal database layer, Doctrine, merged into Icinga Web. Much like before, queries will run through this layer between the database (IDOUtils) and the web interface. However, with Doctrine we can use several database back-ends, and querying the database is now much easier. In contrast to raw SQL, its object relational mapper (ORM) uses the Doctrine Query Language, so we now have the flexibility minus the code duplication.

Icinga's new architecture - goodbye Icinga API, hello Doctrine

That being said, queries from the old API still work, thanks to the ‘legacy layer’ which transforms old API queries into the new ORM type. In this way, we maintain compatibility with addons designed for older Icinga versions. The REST API is also still there as part of Icinga Web, extending our Doctrine layer with HTTP for addons that require only certain bits of monitoring info.

With the departure of a standalone API, the average Icinga user will barely notice a change, apart from the fact that the configuration has now been moved to the databases.xml. Best of all, every module developer can now easily access the Icinga database without much code overhead– so addon developers get hacking and let us know how you go!

For more information see our Wiki:
Development Guide for Icinga Web
Icinga Database Essentials
Icinga Web REST API

Icinga development visualized by Gource Revisited

Last year, I discovered Gource for visualizing Icinga’s Git repositories. One of my fellow Twitter followers (@crsp) was so passionate about new Icinga Gource episodes – so here they are … including a short Icinga Mobile intro :-)

From last year’s end till now – it’s huge what has happened there… Stay tuned for upcoming releases and enjoy the show! =)

Icinga Core

Icinga Doc

Icinga API

Icinga Web

Icinga Mobile

Icinga 1.2 unified stable released!

Live from OSMC in Nuremberg – we did it! Icinga 1.2 takes open source monitoring to the next level!

Bringing you the latest Icinga release – unified version 1.2 for core, classic ui, idoutils, docs, api and web :-)

The new Icinga Web features the long-awaited PNP4Nagios integration (have a look in etc/contrib/) and previously missing comments integrated into the host and service views. Next to that, we’ve added PENDING states to reflect checks not yet executed, while the live search for objects performs excellently. Having trouble with upgrades overwriting your configuration? That is now history: place your custom configuration into site XML files, which won’t get overwritten! We have also added new translations and a fresh icon set for your pleasure.

Icinga Classic UI is not dead – we’ve added more great features and enhancements: multiple delete for downtimes and comments is now available. Ever wanted CSV output on all CGIs instead of parsing HTML output? We just did it – watch out for “Export to CSV” within the Classic UI! Thanks to Jochen Bern, the config command expander is now integrated and will help you “translate” check_command and command_line into the shell command Icinga core will attempt to execute. Comments are now tooltips on status.cgi – this nice enhancement has also been added to the beautified cmd.cgi. The menu has been reworked a bit, adding unhandled problems again. On top of that, there are several fixes for Solaris segfaults, and PHP has been dropped completely as a dependency for the Classic UI.

Icinga Core no longer skips host checks if service checks are disabled. Furthermore, scheduled downtime notifications are no longer sent on restart/reload. An event handler override has been added in order to support mod_gearman. The core dumps on Solaris with gcc3 are all resolved! Meanwhile, Icinga IDOUtils had a significant forking problem when the database connection was not available. Debugged and fixed. Hooray!

Last, but not least – the Icinga documentation in English and German features a lot of changes driven by ongoing development, as well as several new sections such as the Icinga Web introduction, the Icinga Web REST API and complete docs on external commands for Icinga Core.

A new reporting package for Jasper has been in the works – we’ve added docs and a first package while shipping Icinga 1.2!

We hope you enjoy Icinga 1.2 and love it as we do =)

Feedback, feature requests or bug reports much appreciated!