We are using Ansible to provision new servers as well as to deploy new code to those servers. We run the same playbook whether we are provisioning a new server or deploying code (for a deploy we just run it against all servers).
The whole process when deploying new code takes around 10 minutes, which my coworkers think is really slow. (Understandably so: on each node it pulls down code from git and installs all the dependencies every time, as well as Composer, php-fpm, supervisor, virtual hosts, etc.)
So that leaves me with 3 options, I think:
1. Keep using Ansible as is, so we're sure all servers are always fully provisioned when we deploy.
2. Split up our Ansible playbooks, so there's no need to make sure PHP is installed on every deploy (rough sketch of what I mean below).
3. Use another service like DeployHQ for deploys and only use Ansible for provisioning.
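For option 2 I'm picturing something roughly like this (file and role names are just for illustration, not our actual setup):

# provision.yml - everything needed to build a server from scratch
- hosts: web
  become: yes
  roles:
    - php
    - nginx
    - supervisor

# deploy.yml - only the steps a code release actually needs
- hosts: web
  become: yes
  roles:
    - app_deploy    # git checkout, composer install, service reloads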
What are your opinions on this? How are you using Ansible in your organisation?
Have you profiled your Ansible runs (https://github.com/jlafon/ansible-profile) to find out where the largest bottlenecks are? I would start there. Once you know where your process is slowest and why, you can attack the problem more effectively. You might be able to speed things up by baking some of your Ansible tasks into a base image (a Docker image or AMI, for instance), especially the ones that don't change between projects or deploys. Or perhaps your installation and configuration tasks could use some optimization.
For your 3 options, from a "correctness" standpoint you should stick with option 1. From a realist, get-stuff-out-the-door standpoint, you can probably make some assumptions where appropriate, but you should still exercise your full deployment process regularly to check for bugs or other problems.
I'd enable a profiler first (on Ansible 2.x just add
[defaults]
callback_whitelist = profile_tasks
to ansible.cfg in the playbook directory).
With a bit of work you can really drive down play times. A few tips (obviously they may not all apply to you; there are rough examples after the list):
* use RPMs/debs over tarballs for installs, batch them up into a single task, and _never_ use state=latest
* for git checkouts, use an explicit tag/branch/commit
* if you have to use tarball downloads, check that things like get_url are not downloading files every run - a few well-placed creates: clauses will speed things up no end
* make sure you have ControlPersist enabled if you're running several tasks on a single host
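Rough examples of what I mean, with package names, URLs and paths purely illustrative (not anyone's real playbook):

# batch packages into one task and pin state=present, never latest
- name: install runtime packages
  apt:
    name:
      - php-fpm
      - supervisor
      - git
    state: present

# explicit version for git checkouts, not the tip of a moving branch
- name: check out application code
  git:
    repo: https://example.com/app.git
    dest: /srv/app
    version: v1.4.2

# get_url with force left off only downloads when dest doesn't exist yet,
# and creates: stops the unpack re-running once the directory is there
- name: fetch tarball only if we don't already have it
  get_url:
    url: https://example.com/tool-1.0.tar.gz
    dest: /usr/local/src/tool-1.0.tar.gz

- name: unpack it once
  unarchive:
    src: /usr/local/src/tool-1.0.tar.gz
    dest: /usr/local/src
    remote_src: yes
    creates: /usr/local/src/tool-1.0

For ControlPersist, something like this in ansible.cfg keeps the SSH connection open between tasks:

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s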
At my current gig we're adding automation to a fleet of "hand-crafted, artisanal servers", so a single site.yml isn't really feasible because there are so many inconsistencies between environments.
We have a (growing) site.yml that is safe to run 24/7, but a fair few one-off-job.yml files that do smaller chunks of components. It's surprising to me how easy it's been for config to drift when there isn't one playbook to rule them all.
I'm thinking of adding some Rundeck jobs to run our site.yml play with --check over the course of the day, just as an early warning system; we'll see how that goes.
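The job itself would just be a dry run along these lines (inventory and playbook names being whatever yours are):

ansible-playbook -i production site.yml --check --diff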
All that said, smaller 'one-off' playbooks can be really useful for things like deploys. We have a distributed subsystem that requires parts to be restarted in a specific order across many servers, and having rationalised the inventories and group vars for the 'big playbook' pays off in those situations too.
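Not our actual playbook, but the ordered-restart pattern is roughly this (group and service names made up): plays run in the order they appear, and serial: 1 restarts one host at a time:

- hosts: coordinators
  serial: 1
  tasks:
    - name: restart coordinator
      service:
        name: my-coordinator
        state: restarted

- hosts: workers
  tasks:
    - name: restart worker
      service:
        name: my-worker
        state: restarted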
For software we get from aptitude (we run Ubuntu), we run the apt tasks every time.
For other things, where we do more "manual" installs, we might check whether a directory exists, for example, and only proceed with the actual install using Ansible's when: keyword.
We could definitely add more checks like that, and it would speed up our playbooks. Are you doing that?
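To make the kind of check I mean concrete (the path and install script are made up):

- name: check whether the tool is already installed
  stat:
    path: /opt/our-tool
  register: our_tool_dir

- name: run the manual install steps only when it's missing
  command: /usr/local/src/install-our-tool.sh
  when: not our_tool_dir.stat.exists

(For a plain command like that, a creates: arg on the task would give the same skip without the extra stat step.)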
The ansible-profile repo has been merged into Ansible as of v2. Using callback_whitelist = profile_tasks (which Dick mentioned) in your ansible.cfg is the quickest way to get the same result, and it is part of the maintained codebase.