Execute new ansible-playbook command on remote host from a playbook

I am working on a scenario where the first playbook executes commands on a remote host to create a Vagrant host and spin up multiple VMs. Vagrant can trigger its own Ansible provisioning runs, but they are only single-host aware and run when the host is provisioned. That does not work in this case, as I need all VMs running BEFORE the deployment playbook can be triggered. The added wrinkle is that the VMs are not accessible from outside the Vagrant host. If they were, I could simply import the Vagrant host list into the controller inventory and refresh.

Right now I am looking at the possibility of using ansible.builtin.shell to trigger a new ansible-playbook command on the Vagrant host to run the VM application configuration. While this works, it is not exactly Ansible-clean. Suggestions on approaches?
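For reference, roughly what that looks like at the moment (the project path, inventory path and playbook name below are placeholders, not the real ones):

    # Rough sketch of the current approach - the project path, inventory path
    # and playbook name are placeholders.
    - name: Trigger the VM configuration playbook from the vagrant host
      ansible.builtin.shell:
        cmd: >-
          ansible-playbook
          -i .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory
          configure_vms.yml
        chdir: /home/vagrant/project
      become: true
      become_user: vagrant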

Quickly skimming the Vagrant Ansible provisioner docs, isn’t this precisely the behaviour you’re looking for:

https://developer.hashicorp.com/vagrant/docs/provisioning/ansible#ansible-parallel-execution

    # Only execute once the Ansible provisioner,
    # when all the machines are up and ready.

So you would spin up all your Vagrant boxes from your control node, wait for that to complete, template out a static inventory of your Vagrant boxes, then run your subsequent Vagrant Ansible provisioner automation?
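For the static inventory step, that could be something along these lines (the group name, paths and variables here are purely illustrative):

    # Purely illustrative: render a static inventory from the known box names.
    # 'vagrant_boxes', the template and the destination path are hypothetical.
    - name: Template a static inventory of the Vagrant boxes
      ansible.builtin.template:
        src: vagrant_inventory.ini.j2
        dest: inventory/vagrant.ini
      vars:
        vagrant_boxes:
          - node0
          - node1
          - node2

    # vagrant_inventory.ini.j2 might look like:
    #   [cluster_nodes]
    #   {% for box in vagrant_boxes %}
    #   {{ box }} ansible_user=vagrant
    #   {% endfor %}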

Already tried it, and it does not work, which is why I explicitly referenced that behaviour as not working in this scenario. While Vagrant can run playbooks at provisioning time, it does not really provide a way to control when the provisioning runs. All 3 hosts need to be up before the first host can be provisioned, since it requires the IPs of the later hosts. The second option does not work either, as the remote control node does not have access to the VMs, as mentioned. Which is what led to the need to trigger a second playbook; otherwise I could just load the Vagrant-generated inventory with the add_host module.

I could do some ugly sequencing of "vagrant up --provision" from a playbook to control the Ansible provisioning sequence of the VMs, but I am trying to avoid ugly shell commands as much as I can. If I use a shell command I could also just trigger an ansible-playbook run that way, but it feels wrong.
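For the record, the ugly sequencing would be something like this (machine names and the project path are placeholders):

    # What the ugly sequencing could look like - machine names and the project
    # path are placeholders. Each box is brought up and provisioned in order.
    - name: Bring up and provision the Vagrant boxes one at a time
      ansible.builtin.command:
        cmd: "vagrant up {{ item }} --provision"
        chdir: /home/vagrant/project
      loop:
        - master
        - worker1
        - worker2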

It sounds like a Vagrant issue rather than an Ansible issue. Or possibly a niche Vagrant provider problem.

Can you share a sample Vagrantfile that’s not behaving as it should and details of the target OS of the Vagrant host, and the virtualisation provider you’re using?

Vagrant is behaving fine, so it is not a Vagrant-specific problem; it is a task-ordering problem. I need the Vagrant hosts fully installed first because I have to collect data from all 3 at once before deploying the software, and during software deployment I have to install the master first, collect keys, and then install the slaves. Vagrant provider provisioning does not provide this kind of control, as it assumes each provisioned VM is self-contained. A more typical solution would be for Ansible to remote directly into the VMs after deployment from the remote controller, but that is not an available option. Only the Vagrant host will have access to the Vagrant VMs, and really only as the vagrant user. The last limitation is not hard to deal with, as Vagrant provides everything an Ansible job would need if run from the Vagrant host.

That is why I need to trigger an Ansible playbook on the Vagrant host, since it cannot run from the initial Ansible controller. Yes, it is a bit of an odd edge case, as the Vagrant provider would normally be plenty.

I could make it all work with a stack of command or shell tasks, but that is messy and hard to keep idempotent. Which is why I am checking for a more Ansible-native approach to trigger a playbook on a remote host, with that host acting as the new controller. It also saves inventory mess, I think.

I think you may be misunderstanding me, or I’m misunderstanding you.

Just for clarity’s sake, the flow you would like is:

  1. An Ansible control node runs a playbook (or role) targeting a machine,

  2. That machine is configured to run as a Vagrant host with a virtualisation provider (VirtualBox, libvirt or whatever) in order to support Vagrant box creation,

  3. You then have a Vagrantfile which runs on that host and spins up and configures multiple Vagrant boxes,

  4. Once all the boxes are up, and only then, you want to run some Ansible which needs the primary and the 2 secondaries to be up.

That being the case, that is the behaviour that https://developer.hashicorp.com/vagrant/docs/provisioning/ansible#ansible-parallel-execution describes. It's slightly poorly worded, but to me:

    # Only execute once the Ansible provisioner,
    # when all the machines are up and ready.

Is equivalent to:

    # Provision all Vagrant boxes in the multi-machine setup.
    # Only once all the machines are up and ready, run the Ansible provisioner

If that’s not what’s happening, that’s likely a Vagrant configuration or provisioner misbehaviour?

That's why I'm saying this isn't necessarily an Ansible thing. Going by that wording, the boxes should all spin up before any Vagrant Ansible provisioner runs, but you're saying that's not the case. That sounds like either your Vagrantfile is wrong, or your Vagrant VM provisioner or something else isn't working as expected.

I'm spinning this up as a test, but if you already have a test case/reproducer, or can provide more info on your Vagrant setup, that would help people help you. If there's an obvious error in your Vagrantfile it could be a simple fix rather than an edge case.


Definitely an edge case. Not an issue in my Vagrantfile, at least as written based on my understanding of the process, but possibly an issue in my understanding of how Vagrant executes Ansible: it looks like Vagrant runs Ansible against each VM as a separate job in either case, just in parallel the second time. I still need control over the process to pull data from host 1 to be used on hosts 2 and 3, which would still be an issue if it runs as multiple parallel jobs. If it in fact runs a single Ansible playbook across the inventory, then that could work, and would be the opposite of how I understand the Vagrant Ansible provider to work. I would need to refactor a large chunk of the application code to support that, but that can be easily done.

There are a couple of ways you could exercise “control over the process to pull data from host 1 to be used on host 2 and 3”.

If you look at https://manski.net/2016/09/vagrant-multi-machine-tutorial/#multi-machine.3A-the-clever-way 3 nodes are provisioned, one as primary, then two as secondary nodes and it’d be relatively trivial to use this to key off the ‘primary’ node to do what you needed, I imagine.

Where I've had scenarios provisioning 3 nodes of something in a 2n+1 cluster (basically anything like Mongo, Etcd, Zookeeper, etc.) and you need to at least temporarily choose a semi-deterministic primary, I've used logic like:

    pre_tasks:
      - name: pre_tasks | cluster member role setup for multiple hosts
        block:
          - name: pre_tasks | set cluster role to primary when inventory_hostname matches random seed
            set_fact:
              cluster_role: primary
            when: inventory_hostname == ansible_play_hosts | random(seed=ansible_play_hosts | join())

          - name: pre_tasks | set cluster role to secondary when inventory_hostname does not match random seed
            set_fact:
              cluster_role: secondary
            when: inventory_hostname != ansible_play_hosts | random(seed=ansible_play_hosts | join())

          - name: pre_tasks | create a custom facts.d directory on the target host
            file:
              state: directory
              recurse: true
              path: /etc/ansible/facts.d

          - name: pre_tasks | persist the cluster membership role as a custom fact
            copy:
              content: |
                {'cluster_role': '{{ cluster_role }}'}
              dest: /etc/ansible/facts.d/cluster.fact
              mode: '0644'
              owner: root
              group: root
Warning! This sets a transient value in facts.d, which in my case is fine for our purposes. If your cluster membership state changes post-setup, the fact would be misleading (i.e. a node flaps and another cluster member assumes leader/primary).

You would want to replace cluster.fact with something that dynamically pulls the cluster role membership state out of a node once the cluster/replicaset/whatever topology is provisioned and configured.
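A minimal sketch of what that could look like, assuming a hypothetical cluster-status command that prints the node's current role (an executable .fact file in facts.d just needs to emit JSON):

    # Hypothetical sketch: deploy an executable fact that reports the current
    # cluster role at fact-gathering time. 'cluster-status --role' is a made-up
    # command - substitute whatever your cluster software provides.
    - name: pre_tasks | deploy a dynamic cluster role fact
      copy:
        content: |
          #!/bin/sh
          role="$(cluster-status --role 2>/dev/null || echo unknown)"
          printf '{"cluster_role": "%s"}\n' "$role"
        dest: /etc/ansible/facts.d/cluster.fact
        mode: '0755'
        owner: root
        group: root

The result would then be available as ansible_local.cluster.cluster_role after the next fact gathering.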

OK, after some experimentation, I think I see what your problem might be? If you do something like:

BOX_IMAGE = "fedora/37-cloud-base"
NODE_COUNT = 2

Vagrant.configure("2") do |config|
  
  (1..NODE_COUNT).each do |i|
    config.vm.define "node#{i}" do |subconfig|
      subconfig.vm.box = BOX_IMAGE
      subconfig.vm.hostname = "node#{i}"

      if i == NODE_COUNT
        # Note: the provisioner here is attached to the outer 'config' object
        # rather than 'subconfig', so it applies to every machine
        config.vm.provision :ansible do |ansible|
          # Disable default limit to connect to all the machines
          ansible.limit = "all"
          ansible.playbook = "playbook.yml"
        end
      end

    end
  end

end

The Vagrant Ansible provisioner fires for every VM, causing multiple discrete runs. You can control that to a degree with ansible.limit, the hosts statement in the playbook and/or delegate_to, but it would be hard to do stateful cross-cluster config.

If you do something like the following instead, it will provision all 3 Vagrant boxes and then fire the provisioner once, triggering a single Ansible run from the final box:

wmcdonald@fedora:~/working/vagrant/fedora-multi$ cat Vagrantfile
Vagrant.configure(2) do |config|
  #Define the number of nodes to spin up
  N = 3

  #Iterate over nodes
  (1..N).each do |node_id|
    nid = (node_id - 1)

    config.vm.define "node#{nid}" do |node|
      node.vm.box = "fedora/37-cloud-base"
      node.vm.provider "virtualbox" do |vb|
        vb.memory = "1024"
      end
      node.vm.hostname = "node#{nid}"

      if node_id == N
        # Attach the provisioner to 'node' (not the outer 'config') so it only
        # fires once, when the final box comes up
        node.vm.provision "ansible" do |ansible|
          ansible.limit = "all"
          # Group name must match the 'hosts:' target in playbook.yml
          ansible.groups = {
            "cluster_nodes" => [
              "node0",
              "node1",
              "node2",
            ]
          }
          ansible.playbook = "playbook.yml"
        end
      end

    end
  end
end

wmcdonald@fedora:~/working/vagrant/fedora-multi$ cat playbook.yml
- name: Vagrant post-provision
  hosts: cluster_nodes

  tasks:
    - name: Debug vars for hosts
      debug:
        var: ansible_play_hosts

Note that the provisioner will run once, but the run will still parallelise like a normal Ansible run and hit each node, because we're setting hosts to the group members. You could further limit with delegate_to, or have one cluster node in its own 'primary_node' group in addition to cluster_nodes, as sketched below.
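A rough sketch of that layered-group idea (the group names and the shared fact are illustrative; it assumes the Vagrantfile's ansible.groups also puts node0 into a 'primary_node' group):

    # Illustrative playbook layout: 'primary_node' and 'cluster_nodes' are
    # assumed to be defined via ansible.groups in the Vagrantfile.
    - name: Configure the primary first and record what the others need
      hosts: primary_node
      tasks:
        - name: Record a value the secondaries will need (placeholder)
          set_fact:
            primary_address: "{{ ansible_default_ipv4.address | default(inventory_hostname) }}"

    - name: Configure the remaining cluster members
      hosts: cluster_nodes
      tasks:
        - name: Use the value collected from the primary (placeholder)
          debug:
            msg: "Primary is at {{ hostvars[groups['primary_node'][0]].primary_address }}"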

See: https://everythingshouldbevirtual.com/automation/virtualization/vagrant-ansible-provisioning-multi-nodes/
And another variant with per-box behaviour here: https://stackoverflow.com/questions/54468546/how-to-run-an-ansible-playbook-on-a-specific-vagrant-host

Will-
That was exactly the issue. I will give the bottom solution a go. I think that will work; I will need to play with generating the group, but I think I can make it work. Thanks for the help, will update when I get something working or fail :slight_smile:

Will-
Looks like even with the cluster limit I still get 3 discrete runs when using the cluster example. I did a very simple playbook and you can see the Gathering Facts stage gets run in triplicate:

Definitely changed the behaviour, but not quite where I need it to go. However, it has given me some interesting ideas to try.

Can you share the Vagrantfile? And ideally playbook.yml?

I have this working precisely as expected; you just need to ensure that the if statement is nested at just the right point in the Vagrantfile.


Will-
I see it; I loop over mine differently. Essentially yours is configured to run the provisioner for a single node regardless of node count, and always the last one. This works, but since you are only triggering on a single node, it is potentially cleaner to just define that node explicitly and attach the provisioner to it, something like this:

guestOS = "fedora/38-cloud-base"

Vagrant.configure("2") do |config|
  config.vagrant.plugins = "vagrant-libvirt"
  config.vm.provider "libvirt" do |vb|
    vb.memory = "2048"
    vb.cpus = "1"
  end

  %w{worker1 worker2}.each_with_index do |name, i|
    config.vm.define name do |worker|
      worker.vm.box = guestOS
      worker.vm.hostname = name
    end
  end

  config.vm.define :master do |master|
    master.vm.box = guestOS
    master.vm.hostname = "master"
    master.vm.provision "ansible" do |ansible|
      ansible.verbose = "v"
      ansible.limit = "all"
      ansible.groups = {
        "cluster-nodes" => [
          "master",
          "worker1",
          "worker2",
        ]
      }
      ansible.playbook = "bootstrap.yml"
    end
  end

end

It saves the loop headache, but the provisioner only triggers on a full re-population; dropping and rebuilding a single node will not reprovision that node. That gets me about half way to what I want, and the Vagrantfile can be controlled by the master playbook as a template (sketch below). I wonder how much logic I can build into the Vagrantfile provision loop.
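Roughly the kind of task I have in mind for that, purely as a sketch (template name, destination path and variables are placeholders):

    # Sketch only: the master playbook renders the Vagrantfile before running
    # 'vagrant up'. Template name, paths and variables are placeholders.
    - name: Template the Vagrantfile onto the vagrant host
      ansible.builtin.template:
        src: Vagrantfile.j2
        dest: /home/vagrant/project/Vagrantfile
        owner: vagrant
        group: vagrant
        mode: '0644'
      vars:
        guest_box: fedora/38-cloud-base
        worker_names:
          - worker1
          - worker2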


Will-
This has been very helpful overall. I think I have a cleaner way to implement my idea now. A little reworking of the master Ansible playbook and I think I can get things to work the way I will need them to.


Awesome, glad it was helpful.
