I am working on a scenario where the first playbook executes commands on a remote host to create a vagrant host and spin up multiple VMs. Vagrant can trigger its own Ansible provisioning runs, but they are only single-host aware and run when that host is provisioned. That does not work in this case, as I need all VMs running BEFORE the deployment playbook can be triggered. The added wrinkle is that the VMs are not accessible at this time from outside the vagrant host. If they were, I could simply import the vagrant host list into the controller inventory and refresh.
Right now I am looking at the possibility of using ansible.builtin.shell to trigger a new ansible-playbook run on the vagrant host to handle the vagrant VM application configuration. While this works, it is not exactly Ansible-clean. Suggestions on approaches?
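For context, a rough sketch of the kind of shell trigger I mean is below. The inventory group, project path and playbook name are placeholders, and the inventory path shown is the one Vagrant's Ansible provisioner usually generates (assuming a provisioner block exists in the Vagrantfile):

- name: Trigger the VM configuration playbook from the vagrant host
  hosts: vagrant_host          # placeholder group for the remote vagrant host
  tasks:
    - name: Run the second-stage playbook against the Vagrant-generated inventory
      ansible.builtin.shell: >
        ansible-playbook
        -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory
        deploy_vms.yml
      args:
        chdir: /home/vagrant/project    # hypothetical Vagrantfile directory
      changed_when: true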
# Only execute once the Ansible provisioner,
# when all the machines are up and ready.
So you would spin up all your Vagrant boxes from your control node, wait for that to complete, template out a static inventory of your Vagrant boxes then run your subsequent Vagrant Ansible provisioner automation?
Already tried it and it does not work, which is why I explicitly referenced that behaviour as not working in this scenario. While Vagrant can run playbooks at provisioning time, it does not really provide a way to control when the provisioning runs. All 3 hosts need to be up before the first host can be provisioned, since it requires the IPs of the later hosts. The second option does not work either, as the remote control node does not have access to the VMs, as mentioned. Which is what led to the need to trigger a second playbook; otherwise I could just load the vagrant-generated inventory with the add_host module.
I could do some ugly sequencing of "vagrant up --provision" from a playbook to control the Ansible provisioning sequence of the VMs, but I am trying to avoid ugly shell commands as much as I can. If I use a shell command I could also just trigger an ansible-playbook run that way, but it feels wrong.
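For illustration, the kind of sequencing I am trying to avoid would be something along these lines (the VM names and the Vagrantfile directory are placeholders):

- name: Ugly per-VM sequencing of vagrant up
  hosts: vagrant_host          # placeholder group for the remote vagrant host
  tasks:
    - name: Bring up and provision each VM in a fixed order
      ansible.builtin.command: vagrant up --provision {{ item }}
      args:
        chdir: /home/vagrant/project    # hypothetical Vagrantfile directory
      loop:
        - master
        - worker1
        - worker2
      changed_when: true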
It sounds like a Vagrant issue rather than an Ansible issue. Or possibly a niche Vagrant provider problem.
Can you share a sample Vagrantfile that’s not behaving as it should and details of the target OS of the Vagrant host, and the virtualisation provider you’re using?
Vagrant is behaving fine, so it is not a vagrant-specific problem. It is a task-ordering problem. I need the vagrant hosts fully installed first because I have to collect data from all 3 at once before deploying the software, and during software deployment I have to install the master first, collect keys, and then install the slaves. The Vagrant provisioner does not provide this kind of control, as it assumes each provisioned VM is self-contained. A more typical solution would be for Ansible to remote directly into the VMs after deployment from the remote controller, but that is not an available option. Only the vagrant host will have access to the vagrant VMs, and really only as the vagrant user. The last limitation is not hard to deal with, as vagrant provides everything an Ansible job would need if run from the vagrant host.
That is why I need to trigger an Ansible playbook on the vagrant host, since it cannot run from the initial Ansible controller. Yes, it is a bit of an odd edge case, as the vagrant provisioner would normally be plenty.
I could make it all work with a stack of command or shell tasks, but that is messy and hard to keep idempotent. Which is why I am checking for a more Ansible-native approach to trigger a playbook on a remote host, with that host acting as the new controller. It also saves inventory mess, I think.
# Only execute once the Ansible provisioner,
# when all the machines are up and ready.
Is equivalent to:
# Provision all Vagrant boxes in the multi-machine setup.
# Only once all the machines are up and ready, run the Ansible provisioner
If that’s not what’s happening, that’s likely a Vagrant configuration or provisioner misbehaviour?
That's why I'm saying this isn't necessarily an Ansible thing. Per that wording, the boxes should all spin up before any Vagrant Ansible provisioner runs; you're saying that's not the case. That sounds like either your Vagrantfile is wrong, or your Vagrant VM provisioner or something else isn't working as expected.
I'm spinning this up as a test, but if you already have a test case/reproducer, or can provide more info on your Vagrant setup, that would help people help you. If there's an obvious error in your Vagrantfile it could be a simple fix rather than an edge case.
Definitely an edge case. It is not an issue in my file, at least as written based on my understanding of the process, but possibly an issue in my understanding of how Vagrant executes Ansible: it looks like Vagrant runs it on each VM as a separate job in either case, just in parallel on each VM the second time. I still need control over the process to pull data from host 1 to be used on hosts 2 and 3, which would still be an issue if it is running in parallel as multiple jobs. If it in fact runs a single Ansible playbook across the inventory, then that could work, and would be the opposite of how I understood the Vagrant Ansible provisioner to work. I would need to refactor a large chunk of the application code to support that, but that can be easily done.
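For what it is worth, if it really is one play across the whole inventory, the cross-host data pull I need would look roughly like this (the host name, file path and variable names are made up for the sketch):

- name: Deploy using data collected from the master first
  hosts: all
  tasks:
    - name: Collect a value from the master only
      ansible.builtin.slurp:
        src: /etc/myapp/master.key    # hypothetical path
      register: master_key
      run_once: true
      delegate_to: node0              # assumed name of the master VM

    - name: Use the master's value while configuring every node
      ansible.builtin.debug:
        msg: "{{ master_key.content | b64decode }}"

Since a run_once result is registered for every host in the play, the later tasks on hosts 2 and 3 should be able to read it directly.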
Where I've had scenarios provisioning 3 nodes of something in a 2n+1 cluster (basically anything like Mongo, Etcd, Zookeeper, etc.) where you need to at least temporarily choose a semi-deterministic primary, I've used logic like:
pre_tasks:
  - name: pre_tasks | cluster member role setup for multiple hosts
    block:
      - name: pre_tasks | set cluster role to primary when inventory_hostname matches random seed
        set_fact:
          cluster_role: primary
        when: inventory_hostname == ansible_play_hosts | random(seed=ansible_play_hosts | join())

      - name: pre_tasks | set mongo replication role to secondary when inventory_hostname does not match random seed
        set_fact:
          cluster_role: secondary
        when: inventory_hostname != ansible_play_hosts | random(seed=ansible_play_hosts | join())

      - name: pre_tasks | create a custom facts.d directory on the target host
        file:
          state: directory
          recurse: true
          path: /etc/ansible/facts.d

      - name: pre_tasks | persist the cluster membership role as a custom fact
        copy:
          content: |
            {"cluster_role": "{{ cluster_role }}"}
          dest: /etc/ansible/facts.d/cluster.fact
          mode: '0644'
          owner: root
          group: root
Warning! This sets a transient value in facts.d, which in my case is fine for our purposes. If your cluster membership state changes post-setup, the fact would be misleading (i.e. a node flaps and another cluster member assumes leader/primary).
You would want to replace cluster.fact with something that dynamically pulls out the cluster role membership state of a node once the cluster/replicaset/whatever topology is provisioned and configured.
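As an aside, once the .fact file is in place, later plays can branch on it through ansible_local. A minimal sketch, reusing the cluster.fact name from above (task contents are illustrative):

- name: Act on the persisted cluster role
  hosts: all
  tasks:
    - name: Refresh local facts so ansible_local reflects facts.d
      ansible.builtin.setup:
        filter: ansible_local

    - name: Run a primary-only step
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} was seeded as the primary"
      when: ansible_local.cluster.cluster_role == 'primary'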
OK, after some experimentation, I think I see what your problem might be? If you do something like:
BOX_IMAGE = "fedora/37-cloud-base"
NODE_COUNT = 2

Vagrant.configure("2") do |config|
  (1..NODE_COUNT).each do |i|
    config.vm.define "node#{i}" do |subconfig|
      subconfig.vm.box = BOX_IMAGE
      subconfig.vm.hostname = "node#{i}"
      if i == NODE_COUNT
        config.vm.provision :ansible do |ansible|
          # Disable default limit to connect to all the machines
          ansible.limit = "all"
          ansible.playbook = "playbook.yml"
        end
      end
    end
  end
end
The Vagrant Ansible provisioner fires for every VM, causing multiple discrete runs. You can control that to a degree with ansible.limit, the hosts statement in the playbook and/or delegate_to, but it would be hard to do stateful cross-cluster config.
If you do something like the following instead, this will provision all 3 Vagrant boxes and then fire the provisioner once triggering an Ansible run just for the final box:
wmcdonald@fedora:~/working/vagrant/fedora-multi$ cat Vagrantfile
Vagrant.configure(2) do |config|
  # Define the number of nodes to spin up
  N = 3

  # Iterate over nodes
  (1..N).each do |node_id|
    nid = (node_id - 1)

    config.vm.define "node#{nid}" do |node|
      node.vm.box = "fedora/37-cloud-base"
      node.vm.provider "virtualbox" do |vb|
        vb.memory = "1024"
      end

      node.vm.hostname = "node#{nid}"

      if node_id == N
        node.vm.provision "ansible" do |ansible|
          ansible.limit = "all"
          ansible.groups = {
            "cluster_nodes" => [
              "node0",
              "node1",
              "node2",
            ]
          }
          ansible.playbook = "playbook.yml"
        end
      end
    end
  end
end
wmcdonald@fedora:~/working/vagrant/fedora-multi$ cat playbook.yml
- name: Vagrant post-provision
  hosts: cluster_nodes
  tasks:
    - name: Debug vars for hosts
      debug:
        var: ansible_play_hosts
Note that the provisioner will run once but still parallelise like a normal Ansible run would and hit each node because we’re setting the hosts to the group members. You could further limit with delegate_to or have one cluster node in its own ‘primary_node’ group in addition to the cluster_nodes.
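For example, within that single play the primary-only steps could be gated roughly like this (the task contents are illustrative):

- name: Cluster configuration across all Vagrant boxes
  hosts: cluster_nodes
  tasks:
    - name: Primary-only step, run once and pinned to the first play host
      ansible.builtin.debug:
        msg: "Acting as primary on {{ ansible_play_hosts | first }}"
      run_once: true
      delegate_to: "{{ ansible_play_hosts | first }}"

    - name: Step that runs on every cluster node
      ansible.builtin.debug:
        var: inventory_hostname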
Will-
That was exactly the issue. I will give the bottom solution a go. I think that will work; I will need to play with generating the group, but I think I can make it work. Thanks for the help, will update when I get something working or fail.
Will-
Looks like even with the cluster limit I still get 3 discrete runs when using the cluster example. I did a very simple playbook and you can see the Gathering Facts stage gets run in triplicate:
Will-
I see it, I loop over mine differently. Essentially yours is configured to trigger the provisioner from a single node regardless of node count, and always the last one. This works, but since you are only triggering on a single node, it is potentially cleaner to just define and trigger on a single, known node, something like this:
guestOS = "fedora/38-cloud-base"

Vagrant.configure("2") do |config|
  config.vagrant.plugins = "vagrant-libvirt"

  config.vm.provider "libvirt" do |vb|
    vb.memory = "2048"
    vb.cpus = "1"
  end

  %w{worker1 worker2}.each_with_index do |name, i|
    config.vm.define name do |worker|
      worker.vm.box = guestOS
      worker.vm.hostname = name
      # (single-node Ansible provision trigger omitted from the original snippet)
    end
  end
end
This saves the loop headache, but it only triggers on a full re-population: dropping and rebuilding a single node will not reprovision that node. That gets me about half way to what I want, since the Vagrantfile can be controlled by the master playbook as a template. I wonder how much logic I can build into the Vagrantfile provision loop.
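If I go that route, templating the Vagrantfile from the master playbook could be a simple task like this (the template name and destination path are hypothetical):

- name: Render the Vagrantfile from the controlling playbook
  hosts: vagrant_host          # placeholder group for the remote vagrant host
  tasks:
    - name: Template out the Vagrantfile with the desired node list
      ansible.builtin.template:
        src: Vagrantfile.j2
        dest: /home/vagrant/project/Vagrantfile
        owner: vagrant
        group: vagrant
        mode: '0644'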
Will-
This has been very helpful overall. I think I have a cleaner way to implement my idea now. A little reworking of the master Ansible playbook and I think I can get things to work the way I will need them to.