Issues with appended group not being recognised immediately

Hi all,

I have a playbook for provisioning some vagrant machines, written following the roles approach described in the “Best Practices” section of the website documentation. The first role installs docker on all VMs. As part of that, it adds the user vagrant to the “docker” group so that it has access to the /var/run/docker.sock file, which is required to run the docker command. Once that has finished, another, machine-specific role tries to build the appropriate docker image for each machine, based on a pushed Dockerfile.
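To make the setup concrete, the two steps look roughly like the tasks below. This is a simplified sketch, not my exact roles: the file paths in the comments, the image name "myimage" and the build directory are placeholders.

```yaml
# roles/docker/tasks/main.yml (placeholder path)
# Adds vagrant to the docker group without touching its other groups.
- name: Add the vagrant user to the docker group
  user:
    name: vagrant
    groups: docker
    append: yes

# roles/images/tasks/main.yml (placeholder path)
# Runs as vagrant, so it needs access to /var/run/docker.sock.
- name: Build the image from the pushed Dockerfile
  command: docker build -t myimage /home/vagrant/build
```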

The problem is that, even though the user is properly added to the “docker” group, when the play tries to build the docker image I get an error that you can only get if you don’t have the right access to /var/run/docker.sock.

If I rerun the provisioning with “vagrant provision”, it works fine. If I log in by hand and run the command, it also works.

I suspect the problem is that Ansible is reusing the original SSH connection, the one it opened when the previous role installed docker and added the user to the “docker” group. If that is the case, the new group membership would not be effective yet, since group changes are only picked up at the next login. Running with -vvvv, I can see ControlPersist=60s. Is there any way to get around this issue?

I know I could tweak ControlPersist but this is a naughty hack. Hopefully there is a better solution.

Please, bear with me as I’m new to Ansible. Any ideas are appreciated.

Kind regards,
Juan

You might want to disable ControlMaster/ControlPersist for this, as it does
reuse the SSH connection.

Thanks Brian,

That’s one way. There were two other ways I could do it:
a) Wait for 60s after adding the user to the group, so the persisted connection times out and the rest of the plays work as expected
b) Use sudo for the tasks that need access to docker

In the end I decided to go for a). The wait is a bit annoying, but it is worth it because it avoids possible undesirable side effects with sudo. I guess I could always combine the wait with a reduced ControlPersist to shorten the wait period.
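For reference, option a) comes down to something like the following. It is only a sketch: the registered variable name docker_group is made up for illustration, and I use 65 seconds rather than exactly 60 on the assumption that a small margin over ControlPersist=60s avoids racing the timeout.

```yaml
- name: Add the vagrant user to the docker group
  user:
    name: vagrant
    groups: docker
    append: yes
  register: docker_group   # illustrative variable name

# Only wait when the group membership actually changed, so that
# re-running the provisioning does not pay the delay again.
- name: Wait for the persisted SSH connection to expire
  pause:
    seconds: 65
  when: docker_group.changed
```

The pause runs on the control machine and opens no connection of its own, so the persisted master sits idle for the whole wait and the next task has to log in again.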

A shame that Ansible doesn’t cater for this situation out of the box.

Thanks,
Juan

I'm thinking we should add a feature:

meta: flush-connections

to let the play auto expire connection caches for these cases
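Note for later readers: newer Ansible releases ship a meta action along these lines, under the name reset_connection rather than flush-connections (as far as I know, from 2.3 onwards). A minimal sketch of where it would sit, assuming it is placed right after the task that changes the group membership:

```yaml
# Place this directly after the task that adds the user to the group.
# It drops the persisted connection, so the following tasks log in
# afresh and see the new group membership.
- name: Reset the SSH connection
  meta: reset_connection
```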

+1 :)

I think this scenario will always be a problem for anyone adding a user to a new group when belonging to that group matters for the rest of the playbook orchestration. A good approach could be to force a reconnect after using groups with append=yes in the “user” module.

Thanks,
Juan