Performance Metrics Collection from an OpenShift Cluster

This presentation will demonstrate how to collect and aggregate performance metrics from an OpenShift cluster using Prometheus and Alertmanager. Prometheus is a monitoring system that collects and stores metrics from various sources, such as nodes, pods, containers, and services. Alertmanager is the Prometheus component that handles alerts raised by predefined rules and sends them to different receivers, such as email, Slack, or a webhook.

The presentation will show how to configure Prometheus and Alertmanager to collect and group metrics according to the functional areas of OpenShift, such as cluster health, resource utilization, network traffic, and application performance.

The presentation will also show how to integrate an event processing engine with Alertmanager to receive and process the alerts, and how to map them into Icinga, a monitoring tool that provides incident history reporting and notifications. It will close with an overview of the benefits and challenges of using Prometheus and Alertmanager for performance metrics collection and incident tracking in Icinga2.

Date

June 6 | 15:30 – 16:00

Location

Mainstage

Session Type

Presentation (30 min.)

Patrick Zambelli

Team Leader Technical Consulting IT Systems & Service Management Solutions, Würth Phoenix

In 2008, I had my first contact with Nagios. After deploying the daemon in various projects, I started to learn Icinga2, a monitoring tool that I now use in multiple projects inside and outside the Würth Group. As the use cases became more complex, the search for concepts beyond the polling-based approach became relevant. I gained interesting experience using the TICK stack with Telegraf and InfluxDB for the collection of performance metrics. Event-based monitoring represented another challenge, and I led the growth of the open-source project Tornado at Würth. With the advent of modern container-based distributions, we need to bring all these technologies together to meet the requirements of this fast-evolving world.

Michail Schabatin

IT Professional, RTL2

Michail Schabatin is an IT professional with a focus on Linux and open-source technologies. Currently employed at FA Ready Computer GmbH within RTL2 Fernsehen GmbH & Co. KG, he serves as Second Level Data Centre Support. His primary responsibilities encompass monitoring operations at RTL2, managing Linux servers, virtualization, and containerization, and spearheading the modeling and renewal of business processes. Michail also provides comprehensive customer support, helping to ensure smooth operations and good performance within the IT infrastructure.

Get your Ticket

Register today for Icinga Summit 2024 and save your seat for our major Icinga event of the year!