I use Ansible to create and configure many cloud resources (VMs, VPCs, security groups, volumes…). Unlike Terraform, Ansible does not offer an easy way to delete the resources it has created. So I started writing a “callback plugin” that generates a Playbook that can destroy the resources created by another Playbook. This plugin is available on GitHub and on Galaxy (see links below). But I’m wondering: couldn’t this “rollback” capability be included directly in the modules, like “amazon.aws.*” or “google.cloud.*”, etc.?
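To make the idea concrete, here is a minimal sketch of how a rollback Playbook could be derived from the tasks of a creation Playbook. It is not the plugin's actual code: the `recorded_tasks` format and the `build_rollback_tasks` helper are hypothetical, and it assumes every module accepts `state: absent` with the same arguments it was created with, which will not hold for all modules.

```python
from copy import deepcopy


def build_rollback_tasks(recorded_tasks):
    """Return tasks that undo `recorded_tasks`, most recent first.

    `recorded_tasks` is a hypothetical list of dicts with the keys
    "name", "module" and "args", as a callback plugin might collect them.
    """
    rollback = []
    for task in reversed(recorded_tasks):
        args = deepcopy(task["args"])
        # Only creation tasks are rolled back; a missing state is treated
        # as "present" (an assumption that fits many cloud modules).
        if args.get("state", "present") != "present":
            continue
        args["state"] = "absent"
        rollback.append({"name": f"Rollback: {task['name']}", task["module"]: args})
    return rollback


recorded = [
    {
        "name": "Create security group",
        "module": "amazon.aws.ec2_security_group",
        "args": {"name": "demo-sg", "description": "demo", "state": "present"},
    },
    {
        "name": "Create instance",
        "module": "amazon.aws.ec2_instance",
        "args": {"name": "demo-vm", "security_group": "demo-sg"},
    },
]

# Playbook structure, ready to be dumped as YAML by whatever writer is used.
rollback_play = [
    {"hosts": "localhost", "gather_facts": False, "tasks": build_rollback_tasks(recorded)}
]
```

Reversing the task list means the instance is removed before the security group it uses. A real plugin would probably also need per-module knowledge of which arguments identify a resource.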
The problem is that, after you’ve successfully created a security group, the EC2 module that fails afterwards doesn’t know anything about the previously created resources (the security group in this case) that need to be removed/rolled back.
I like the idea of the rollback plugin and I will try it out.
But I guess it’s only suitable for isolated projects (afaiu, that’s the way the Terraform people like to work).
In a larger environment with shared resources, it could accidentally remove resources that are still needed.
But I get your point, and I tried to find an audience for that idea a long time ago.
At work, I started building roles that do exactly that, e.g. deploy an EC2 instance and also remove it.
So what it does is
So an instance can be deployed with a small set of required variables, such as the instance name and subnet name. Besides the required ones, there are optional ones, like instance type, additional EBS volumes, etc.
From my POV, we need high-quality micro roles that glue that functionality together.
But I’m not sure whether such “micro roles” should be developed and shipped with, e.g., community.aws, or whether it needs a new engineering culture on the Ansible user side.
Yes, you’re right: I want to mimic Terraform’s behaviour, where each project is autonomous and, as a consequence, created resources are not shared among multiple projects. This missing feature is annoying and is the reason why people learn Ansible AND Terraform (or OpenTofu…).
For example, I wrote Playbooks to create K8s clusters. Each cluster has its own resources and when I want to delete a cluster, I need to be sure all the resources are deleted.
I thought that automatically generating the “rollback/cleaning” Playbooks would be an elegant solution (though certainly not a perfect one).
The problem is that, after you’ve successfully created a security group, the EC2 module that fails afterwards doesn’t know anything about the previously created resources (the security group in this case) that need to be removed/rolled back.
And that’s the issue: dependencies and the order in which resources are removed.
That’s why the order in the destroy block of my role is a little different.
In that simple case, you must first delete the EC2 instance, and only afterwards can you remove the security group.
Another option is to detach the security group from all its resources first and delete the EC2 instance afterwards. But that makes the step of removing the security group more complex.
Another thing a callback plugin could do is create a state file (like Terraform/OpenTofu). You just need to collect the resource IDs and, e.g., the (tag) names, to get a human-friendly name.
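As a rough illustration of the state file idea, the sketch below appends one entry per created resource to a JSON file. The file layout, the `record_resource` helper and the field names are assumptions made for this example, not an existing Ansible or Terraform format.

```python
import json
from pathlib import Path


def record_resource(state_file, module, resource_id, name):
    """Append one created resource to a JSON state file and return the state.

    The layout ({"resources": [...]}) is a hypothetical format chosen for
    this sketch; entries are kept in creation order.
    """
    path = Path(state_file)
    if path.exists():
        state = json.loads(path.read_text())
    else:
        state = {"resources": []}
    # Store the cloud ID for deletion and the (tag) name for humans.
    state["resources"].append({"module": module, "id": resource_id, "name": name})
    path.write_text(json.dumps(state, indent=2))
    return state
```

A callback plugin could call something like this from its success hook with the ID found in the task result, for example `record_resource("state.json", "amazon.aws.ec2_security_group", "sg-0123", "demo-sg")`, where the ID value is made up.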
But the dependency/order issue exists here as well. It needs some kind of algorithm/dependency detector …
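One possible shape for such a dependency detector, assuming the dependencies between resources are already known, is a topological sort: compute a valid creation order and reverse it for deletion. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the `deletion_order` helper and the resource names are hypothetical, and discovering the dependencies themselves is left open.

```python
from graphlib import TopologicalSorter


def deletion_order(depends_on):
    """Return resource names in an order that is safe for deletion.

    `depends_on` maps each resource to the set of resources it needs,
    e.g. an instance needs its security group. Dependencies come first
    in the creation order, so the reversed list deletes dependents first.
    """
    creation_order = list(TopologicalSorter(depends_on).static_order())
    return list(reversed(creation_order))


deps = {
    "demo-vm": {"demo-sg", "demo-volume"},  # the instance uses both
    "demo-sg": set(),
    "demo-volume": set(),
}
order = deletion_order(deps)
```

Here the instance ends up first in `order`, followed by the security group and the volume in either order. `TopologicalSorter` raises `graphlib.CycleError` if the dependencies are circular, which is the case where detaching resources first would be needed.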
What I find most annoying is that as soon as something doesn’t work properly, they simply switch to another tool—or even worse—develop a completely new tool instead of improving the existing tool (Ansible in this case)…