In AWX (UI version 24.6.0), we intermittently get the following error when connecting to hosts:
fatal: [example.server.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to create temporary directory. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\", for more error information use -vvv. Failed command was: ( umask 77 && mkdir -p \"` echo /home/test_folder/.ansible/tmp `\"&& mkdir \"` echo /home/test_folder/.ansible/tmp/ansible-tmp-1729589857.763045-108-198953876447411 `\" && echo ansible-tmp-1729589857.763045-108-198953876447411=\"` echo /home/test_folder/.ansible/tmp/ansible-tmp-1729589857.763045-108-198953876447411 `\" ), exited with result 1", "unreachable": true}
The failure is intermittent. We have validated network connectivity from the execution node, which is where the execution environment is spun up and the commands are run.
My playbook is supposed to check the uptime of the Linux servers, and I'm getting the above error at the fact-gathering stage, before any of my tasks run.
If I SSH manually from the node to the target server (example.server.com), it works every time. What might be causing the issue?
The load on the node is low, and no other heavy process is running on it.
The error says it's encountering a problem creating the temporary directories, so connectivity and SSH checks aren't relevant. Go to the problematic target nodes and troubleshoot why your Ansible user is unable to create files and directories where the error says it's trying to create them. You can copy that big long command that starts with umask out of the error message and run it on the target as that user.
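If logging in to each target by hand is awkward, the same checks can be run from AWX itself. The playbook below is only a sketch under a few assumptions: `hosts: all` stands in for whatever inventory or limit the job template uses, and the paths assume the default remote temp location under the login user's home directory, as shown in the error. It relies on `ansible.builtin.raw`, which sends the command straight over SSH without creating the remote temp directory, so it should still run on hosts where normal modules fail.

```yaml
---
# Diagnostic sketch: inspect the remote temp path without depending on it.
# ansible.builtin.raw does not use ~/.ansible/tmp, so it can still run
# on a host where fact gathering fails with the error above.
- name: Inspect the Ansible remote temp directory
  hosts: all              # assumption: narrow this with the template's limit
  gather_facts: false     # fact gathering is where the failure appears
  tasks:
    - name: Show login user, directory ownership, free space and free inodes
      ansible.builtin.raw: "id; ls -ld ~ ~/.ansible ~/.ansible/tmp; df -h ~; df -i ~"
      register: tmp_report
      changed_when: false
      failed_when: false  # still report if one of the commands fails

    - name: Print the report
      ansible.builtin.debug:
        var: tmp_report.stdout_lines
```

Comparing the output of a failing run with that of a successful one may show whether the login user, the owner of `~/.ansible/tmp`, or the free space differs between runs.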
@mcen1 When I launch the AWX template 10 times, about 5 runs fail with this error and the other 5 work fine. I ran the template repeatedly within a 30-minute window while we monitored the target server, and there was no modification or other problem on it during that time.
Since I'm using AWX, the template runs in an execution environment (i.e., a container on the execution node), so I'm not sure where ansible.cfg needs to be modified. Can you please share more info on this?
It looks to me as though an earlier job logged in to the target node as a different user, and the job you are running now uses the same directory (/home/test_folder/) but has no permission to write to it. Edit: if that name hasn't been redacted, it would indicate that every job template user is actually using that directory, and permission problems would occur.
As the error message suggests, you can reconfigure remote_tmp in an ansible.cfg file in an execution environment you create, but that's non-trivial and you might not have permission to actually use that EE in your environment.
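A lighter-weight option that may avoid building a custom EE is to set the temp path as a variable. `ansible_remote_tmp` is the variable form of the `remote_tmp` setting, so it can go in the play, in inventory, or in the job template's extra variables. The sketch below is an assumption-laden example, not a confirmed fix: the path `/tmp/.ansible-${USER}/tmp` is illustrative, and it assumes the remote shell expands `${USER}` so that each login user gets its own directory. The `uptime` task is a stand-in for the original playbook, which isn't shown in this thread.

```yaml
---
# Sketch: move the remote temp directory out of the shared home directory.
# The path is an illustrative choice; ${USER} is assumed to be expanded
# by the shell on the target, giving each login user a separate directory.
- name: Check uptime with a per-user remote temp directory
  hosts: all              # assumption: replace with the real host pattern
  vars:
    ansible_remote_tmp: "/tmp/.ansible-${USER}/tmp"
  tasks:
    - name: Get uptime    # stand-in for the original playbook's task
      ansible.builtin.command: uptime
      register: uptime_out
      changed_when: false

    - name: Show uptime
      ansible.builtin.debug:
        var: uptime_out.stdout
```

The same setting can also be supplied as the `ANSIBLE_REMOTE_TMP` environment variable, or as `remote_tmp` under `[defaults]` in an ansible.cfg at the root of the project, which AWX should pick up when it runs the job from the project directory.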
Can you determine whether other playbooks have been running against the node and logging in as different users? And if so, can you run your job template as that user?