Next Generation Monitoring: Tier 2 Experience

Fay, R, Bland, J and Jones, S
(2017) Next Generation Monitoring: Tier 2 Experience.

Abstract

Monitoring IT infrastructure is essential for maximizing availability and minimizing disruption by detecting failures and issues as they develop. The HEP group at Liverpool has recently updated its monitoring infrastructure with the goal of increasing coverage, improving visualization capabilities, and streamlining configuration and maintenance. Here we present a summary of Liverpool's experience, the monitoring infrastructure, and the tools used to build it. In brief, system checks are configured in Puppet using Hiera and managed by Sensu, which replaces Nagios. Centralized logging is managed with Elasticsearch, together with Logstash and Filebeat. Kibana provides an interface for interactive analysis, including visualization and dashboards. Metric collection is also configured in Puppet, managed by collectd and stored in Graphite, with Grafana providing a visualization and dashboard tool. The Uchiwa dashboard for Sensu provides a web interface for viewing infrastructure status. Alert capabilities are provided via external handlers. A custom alert handler is in development to provide an easily configurable, extensible and maintainable alert facility.
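
To illustrate the external-handler mechanism mentioned in the abstract, the following is a minimal sketch of a pipe-style alert handler in Python. It is not the authors' handler, which the abstract only describes as being in development. It assumes the handler receives a single event as JSON on standard input, with the client name under client.name and the check result under check.name, check.status and check.output, as in Sensu Core 1.x event data to the best of our understanding. The function names format_alert and handle are hypothetical.

```python
import io
import json
import sys

# Check exit statuses follow the Nagios plugin convention, which Sensu reuses.
SEVERITY = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def format_alert(event):
    """Build a one-line alert message from an event dictionary.

    The field layout (client.name, check.name, check.status, check.output)
    is an assumption about the event format, not taken from the paper.
    """
    client = event.get("client", {}).get("name", "unknown-client")
    check = event.get("check", {})
    name = check.get("name", "unknown-check")
    status = check.get("status", 3)
    output = str(check.get("output", "")).strip()
    severity = SEVERITY.get(status, "UNKNOWN")
    return f"[{severity}] {client}/{name}: {output}"


def handle(stream):
    """Read one JSON event from a text stream and return the alert line."""
    event = json.load(stream)
    return format_alert(event)


# Example with an in-memory stream standing in for standard input.
sample = io.StringIO(json.dumps({
    "client": {"name": "node01"},
    "check": {"name": "check_disk", "status": 2, "output": "Disk 95% full\n"},
}))
alert_line = handle(sample)
```

Installed as a handler script, the same code would end with print(handle(sys.stdin)) and pass the resulting line on to whatever delivery channel is configured, such as email or chat. Keeping the formatting in a separate function from the input handling is one way to make such a handler easier to extend and test.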

Item Type: Conference or Workshop Item (Unspecified)
Depositing User: Symplectic Admin
Date Deposited: 22 Feb 2019 14:08
Last Modified: 19 Jan 2023 01:02
DOI: 10.1088/1742-6596/898/9/092035
Open Access URL: https://doi.org/10.1088/1742-6596/898/9/092035
URI: https://livrepository.liverpool.ac.uk/id/eprint/3033272