Ansible baby steps failure. Can’t even ‘-m ping’ ;-(
I’m a first-time Ansible user, starting from the beginning.
The first real task in the Introduction to Ansible is a connectivity test.
After setting everything up, the ping bails out on me with this generic error message:
"msg": "MODULE FAILURE: No start of json char found\nSee stdout/stderr for the exact error",
It would help if you told us the exact Ansible command you ran. Assuming you are using the ping module, it should look like this:
➜ tmp ansible -i inventory -m ping serverb
[WARNING]: Platform linux on host serverb is using the discovered Python interpreter at /usr/bin/python3.9, but future installation of another Python interpreter
could change the meaning of that path. See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
serverb | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3.9"
    },
    "changed": false,
    "ping": "pong"
}
If you are invoking a command directly, you should expect something like this:
➜ tmp ansible -i inventory -m command -a 'uptime' serverb
[WARNING]: Platform linux on host serverb is using the discovered Python interpreter at /usr/bin/python3.9, but future installation of another Python interpreter
could change the meaning of that path. See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
serverb | CHANGED | rc=0 >>
18:15:02 up 10 days, 9:00, 1 user, load average: 0.02, 0.02, 0.00
I ran the echo command to see whether I could reproduce what you are getting, but I can’t:
➜ tmp ansible -i inventory -m command -a 'echo hello' serverb
[WARNING]: Platform linux on host serverb is using the discovered Python interpreter at /usr/bin/python3.9, but future installation of another Python interpreter
could change the meaning of that path. See https://docs.ansible.com/ansible-core/2.17/reference_appendices/interpreter_discovery.html for more information.
serverb | CHANGED | rc=0 >>
hello
The outcome is the same for root and non-root. I’ve tried it against other servers in my small home network. I’ve got the following node results:
Leap 15.4: OK
Leap 15.5: Fail
Leap 15.6: Fail
Leap 15.6: OK
Leap Micro 16.1: OK
I’m going out on a limb and saying it’s a local openSUSE issue. I need to dig further, and I’ve got no idea yet where to look. But it’s good to know it can work. Thanks for your help so far!
TL;DR
What OS are you logging into, as which user, and which shell is that user configured with?
Run grep $USER /etc/passwd; the last field in the output is your shell.
If the echo command is not found, you’re probably using a very limited shell on a system with very few commands available, or something is wrong with your PATH. Most shells have echo as a builtin, and most systems also have /bin/echo as a command.
The user you are logging in with might have a high-security, limited shell for sftp access only or some other specific purpose shell. It could even be set to ‘/bin/true’ if the user is not intended for logins, but for running services only, like ‘www’.
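A slightly stricter variant of the grep check above, as a minimal sketch: grep $USER also matches any other passwd line that happens to contain the user name, whereas comparing the first field exactly avoids that. The function name login_shell and its optional second argument (an alternative passwd file) are made up for illustration. On systems that pull users from LDAP or similar, getent passwd is the better source.

```shell
#!/bin/sh
# login_shell [USER] [PASSWD_FILE]
# Print the login shell (7th colon-separated field) of USER.
# Defaults: the current user and /etc/passwd. Hypothetical helper name.
login_shell() {
    user="${1:-$(id -un)}"
    passwd_file="${2:-/etc/passwd}"
    # Match the first field exactly, then print the 7th field (the shell).
    awk -F: -v u="$user" '$1 == u { print $7 }' "$passwd_file"
}

# Example: login_shell root
```

If this prints something like /bin/true or a nologin path for the account Ansible connects as, the shell is the first thing to fix.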
All users are regular users.
I did think of security settings. openSUSE is transitioning from AppArmor to SELinux, but if you install a default server/desktop, you’ll get the medium settings with lots of wiggle room. In fact, the Leap 15.4 node that works has AppArmor enabled, and the Leap Micro node has SELinux enabled. So I doubt this is the root cause, but I’m not knowledgeable enough in that department to rule it out completely. I mention it because ‘echo’ is a shell built-in, and I don’t know how that works security-wise. Maybe I should hack the command and try ‘/bin/echo’.
Np. When it’s a bit late, dumb questions are forgiven.
So the day before yesterday I tracked it down as a Python incompatibility between the Python version on the Ansible controller and the system wide Python version on the node.
[rant]
I hate it when there’s no obvious coverage of system requirements.
And also the incredibly arcane error messages.
And the stupidity that an automation system is not able to figure out what the capabilities of a node are.
[/rant]
I have it working now.
The issue with the echo ~ && sleep 0 command in Ansible’s SSH connectivity test likely stems from a misconfigured shell environment or a Python incompatibility on the target node. To resolve this, check that the login shell on the target machine (monitor.lan) is a regular shell such as /bin/bash (verify with getent passwd). Also confirm that the Ansible controller and the target node use compatible Python versions; the Python 3.6 that openSUSE Leap 15.6 ships as its system python3 may be too old for a recent ansible-core. To point Ansible at a specific interpreter on the target, set interpreter_python in ansible.cfg or the inventory variable ansible_python_interpreter=/usr/bin/python3. Then test connectivity with ansible all -m ping.
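To try the interpreter setting mentioned above without editing any file, it can also be passed for a single ad-hoc run with -e. This is only a sketch: the wrapper name ping_with_python, the inventory file name inventory, and the interpreter path in the example are assumptions, so substitute whatever actually exists on your node.

```shell
#!/bin/sh
# ping_with_python HOST INTERPRETER [INVENTORY]
# Run Ansible's ping module against HOST, forcing the Python interpreter
# used on the managed node. Hypothetical wrapper; the default inventory
# file name "inventory" is an assumption.
ping_with_python() {
    host="$1"
    interpreter="$2"
    inventory="${3:-inventory}"
    ansible -i "$inventory" -m ping \
        -e "ansible_python_interpreter=$interpreter" "$host"
}

# Example: ping_with_python monitor.lan /usr/bin/python3
```

If the ping succeeds this way, the same variable can be set permanently for that host in the inventory.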
The long version is that I’ve discovered that Ansible tells you what it supports if you run ansible-config list | grep -i python
On the drone node I had Python v3.6 from the distribution. I installed 3.11 alongside it, and that solved everything I was seeing from the controller node.
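For anyone hitting the same thing, here is a minimal sketch of how the two Python versions could be compared before installing anything. The raw module runs a command over SSH without needing a working Python on the node, so it can still report the node's version when normal modules fail. The function name node_python_version and the inventory file name inventory are assumptions.

```shell
#!/bin/sh
# node_python_version HOST [INVENTORY]
# Ask the managed node which python3 it has. The raw module only needs SSH,
# so it works even when the node's Python is too old for regular modules.
# Hypothetical helper; the default inventory file name is an assumption.
node_python_version() {
    host="$1"
    inventory="${2:-inventory}"
    ansible -i "$inventory" -m raw -a 'python3 --version' "$host"
}

# Controller side: "ansible --version" includes a "python version" line.
# Node side, for example: node_python_version serverb
```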