Automating monitoring/alerting as part of ansible

Eric_C · June 25, 2016, 12:14am

I’m in the process of a complete overhaul / rewrite of our deployment / provisioning systems using ansible for a
distributed, microservices-based, highly available architecture living on the AWS cloud.

In parallel I’ve been investigating improving how our company does monitoring. Currently, our DevOps team manages an
Icinga deploy by manually updating configuration files whenever hosts are added/removed, new services are added, etc.

This has become fairly unwieldy. The new (and not yet complete) ansible project is up to 23 roles and 48 playbooks;
multiplied by horizontal scaling, staging and production, autoscaling, etc., it has become nearly impossible to remain
proactive. It is not uncommon for services to remain unmonitored until they have gone down at least once in production.

Seeing as how our ansible project knows how to configure every piece of software in our stack from the ground up, I see
potential in using ansible to automate the configuration of monitoring and alerting software as part of
deployment.
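
To make the idea concrete, here is a rough sketch of the kind of generation I have in mind, written in plain Python rather than as an ansible role. Everything in it is an assumption for illustration: the host names, the role names, the role-to-check mapping, and the `generic-service` template are made up, and the output is ordinary Nagios-style object syntax (which icinga 1.x also reads), not anything sensu-specific.

```python
from typing import Dict, List

# Hypothetical mapping from a role to the check commands it needs.
# In a real setup this could live alongside each role's variables.
ROLE_CHECKS: Dict[str, List[str]] = {
    "nginx": ["check_http"],
    "postgres": ["check_pgsql"],
}


def render_service(host: str, role: str, command: str) -> str:
    """Render one Nagios-style service object for a host/check pair."""
    return (
        "define service {\n"
        # 'generic-service' is an assumed template name defined elsewhere.
        "    use                  generic-service\n"
        f"    host_name            {host}\n"
        f"    service_description  {role}:{command}\n"
        f"    check_command        {command}\n"
        "}\n"
    )


def render_checks(host_roles: Dict[str, List[str]],
                  role_checks: Dict[str, List[str]]) -> str:
    """Render service objects for every host, based on the roles applied to it.

    Roles with no entry in role_checks simply produce no checks.
    """
    blocks = []
    for host in sorted(host_roles):
        for role in host_roles[host]:
            for command in role_checks.get(role, []):
                blocks.append(render_service(host, role, command))
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical inventory: which roles were applied to which hosts.
    inventory = {"web01": ["nginx"], "db01": ["postgres"]}
    print(render_checks(inventory, ROLE_CHECKS))
```

The point is only that the host-to-role information the deployment already has is enough to derive the monitoring config, so a new host or service would pick up its checks at deploy time instead of after its first outage.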

For a few days I’ve been poking around at integrating automated alerting and monitoring with sensu (compatible with our
current icinga/nagios checks, aims for easy CM automation) and haven’t found an obvious, clean way to do it (whether
I’m using sensu, icinga, nagios, etc, it shouldn’t make too much of a difference).

Good idea / bad idea? It seems logical to me, but maybe ansible isn’t best suited for this? Thoughts? Anyone tried
something like this at a similar scale (~75-100+ services spread across hundreds to a thousand hosts)?
