> Well, I tried and it's working properly with Python3. There is no problem with
> Ansible 2.9.6 and Python 3.8 on the controller
That is not true. It has nothing to do with the Python version and everything
to do with a little luck and a fast computer.
Right. Python is irrelevant. Your results proved it. I used the slowest
machine I have (Intel(R) Core(TM) i5-8200Y CPU @ 1.30GHz, XPS 13 9365, Ubuntu
20.04).
The test case is flawed for your Ansible controller. If it had more load and/or
more hosts, you would get a completely different result.
Right. It took ~20 hosts to see the problems. I ran the test 10 times with
the result failed:6 ok:4 (number of hosts written per run:
18,19,18,19,19,19,20,20,20,20).
The reason why this is not reliable is that Ansible runs 5 forks by default.
That means that if you have 5 or more hosts it will fork out 5 commands.
These 5 commands will try to edit the same file at the same time.
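
A minimal sketch of the lost-update effect described above. It assumes that
each command reads the whole file, adds its line in memory, and writes the
whole file back. The helper names (read_lines, write_lines, with_line,
run_serial, run_batch) are hypothetical and this is not Ansible code. The
batch case is the worst case, where all 5 commands read before any of them
writes.

```python
import os
import tempfile


def read_lines(path):
    """Return the file's lines, or an empty list if the file does not exist."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        return handle.read().splitlines()


def write_lines(path, lines):
    """Rewrite the whole file from the given list of lines."""
    with open(path, "w") as handle:
        handle.write("".join(line + "\n" for line in lines))


def with_line(lines, line):
    """Return the lines with `line` appended unless it is already present."""
    return lines if line in lines else lines + [line]


def run_serial(path, hosts):
    """Each host reads, modifies and writes before the next one starts."""
    for host in hosts:
        write_lines(path, with_line(read_lines(path), host))


def run_batch(path, hosts):
    """Worst case: every host reads before any host writes."""
    snapshots = [read_lines(path) for _ in hosts]
    for host, snapshot in zip(hosts, snapshots):
        write_lines(path, with_line(snapshot, host))


hosts = [f"test_{i:02d}" for i in range(1, 6)]  # one batch of 5 forks

with tempfile.TemporaryDirectory() as tmp:
    serial_path = os.path.join(tmp, "serial_file")
    racy_path = os.path.join(tmp, "racy_file")
    run_serial(serial_path, hosts)
    run_batch(racy_path, hosts)
    serial_result = read_lines(serial_path)
    racy_result = read_lines(racy_path)

print("serial:", serial_result)
print("batch: ", racy_result)
```

In this model the serial run keeps all 5 lines and the batch run keeps only
the line of the last writer. The counts reported above (18 to 20 of 20 lines)
suggest that the real overlap is much narrower than this worst case.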
I'm not sure whether this also applies to a task delegated to localhost. The
6th host started to write at about the time the 1st one finished. The playbook
below
- hosts: all
  gather_facts: false
  tasks:
    - lineinfile:
        path: "{{ playbook_dir }}/test_file"
        line: "{{ inventory_hostname }}"
        create: true
      delegate_to: localhost
gave the "runner_on_ok" events (counter,pid,host,start,end,duration) in the
table below selected from the ansible-runner artifacts.
010 104710 test_01 27:33.764342 27:34.826178 1.061836
012 104710 test_04 27:33.804294 27:34.892333 1.088039
014 104710 test_05 27:33.831383 27:34.983917 1.152534
016 104710 test_03 27:33.790178 27:35.017124 1.226946
018 104710 test_02 27:33.774777 27:35.048550 1.273773
020 104710 test_06 27:34.808575 27:35.716538 0.907963
022 104710 test_07 27:34.884775 27:35.788730 0.903955
024 104710 test_08 27:34.969329 27:35.819368 0.850039
026 104710 test_09 27:35.006282 27:35.906473 0.900191
028 104710 test_10 27:35.042953 27:35.974963 0.93201
030 104710 test_12 27:35.781438 27:36.713863 0.932425
032 104710 test_11 27:35.708957 27:36.747040 1.038083
034 104710 test_14 27:35.894955 27:36.770536 0.875581
036 104710 test_13 27:35.811156 27:36.793905 0.982749
038 104710 test_15 27:35.967495 27:36.921133 0.953638
039 104710 test_17 27:36.737726 27:37.517073 0.779347
040 104710 test_16 27:36.704352 27:37.524634 0.820282
041 104710 test_19 27:36.786723 27:37.533726 0.747003
042 104710 test_18 27:36.763620 27:37.549728 0.786108
043 104710 test_20 27:36.914053 27:37.694404 0.780351
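
A minimal sketch to check the table: it pairs the 6th task to start with the
1st task to end, the 7th with the 2nd, and so on, and prints the difference
between that start and that end. It assumes the timestamps are
minutes:seconds within the same hour and that forks is at its default of 5.
The names to_seconds and slot_gaps are hypothetical helpers, not part of
ansible-runner.

```python
# Rows copied from the table above: counter pid host start end duration
ROWS = """\
010 104710 test_01 27:33.764342 27:34.826178 1.061836
012 104710 test_04 27:33.804294 27:34.892333 1.088039
014 104710 test_05 27:33.831383 27:34.983917 1.152534
016 104710 test_03 27:33.790178 27:35.017124 1.226946
018 104710 test_02 27:33.774777 27:35.048550 1.273773
020 104710 test_06 27:34.808575 27:35.716538 0.907963
022 104710 test_07 27:34.884775 27:35.788730 0.903955
024 104710 test_08 27:34.969329 27:35.819368 0.850039
026 104710 test_09 27:35.006282 27:35.906473 0.900191
028 104710 test_10 27:35.042953 27:35.974963 0.93201
030 104710 test_12 27:35.781438 27:36.713863 0.932425
032 104710 test_11 27:35.708957 27:36.747040 1.038083
034 104710 test_14 27:35.894955 27:36.770536 0.875581
036 104710 test_13 27:35.811156 27:36.793905 0.982749
038 104710 test_15 27:35.967495 27:36.921133 0.953638
039 104710 test_17 27:36.737726 27:37.517073 0.779347
040 104710 test_16 27:36.704352 27:37.524634 0.820282
041 104710 test_19 27:36.786723 27:37.533726 0.747003
042 104710 test_18 27:36.763620 27:37.549728 0.786108
043 104710 test_20 27:36.914053 27:37.694404 0.780351
"""

FORKS = 5  # assumed: Ansible's default number of forks


def to_seconds(stamp):
    """Convert 'MM:SS.ffffff' to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def slot_gaps(rows, forks=FORKS):
    """Return (host, start - end) pairs, where `end` is the reported end of
    the task that finished `forks` positions earlier in end-time order."""
    events = []
    for line in rows.strip().splitlines():
        _counter, _pid, host, start, end, _duration = line.split()
        events.append((host, to_seconds(start), to_seconds(end)))
    by_start = sorted(events, key=lambda event: event[1])
    ends = sorted(event[2] for event in events)
    return [
        (host, start - ends[index])
        for index, (host, start, _end) in enumerate(by_start[forks:])
    ]


gaps = slot_gaps(ROWS)
for host, gap in gaps:
    print(f"{host} {gap:+.6f}")
```

With the rows above every gap is negative, between about -18 ms and -5 ms.
Each new task is reported as started a few milliseconds before the reported
end of the task whose slot it seems to take. I have not checked where exactly
these timestamps are taken, so the reported intervals need not match the
moments the file is read and written.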
Link to 200 (10 x playbook x 20 hosts) events sorted by the end-time
https://gist.github.com/vbotka/31ccd5380a2b230be11b1df78cc402a1
(incl. the playbook to select events and the script to run the test)
The reason it works for you is that the first fork finishes before the second
has time to start.
Reported start-end times don't confirm this. A lower-level understanding is
needed to interpret the results, I think.
This is how Ansible works, and no Python version is going to alter that.
And I'm going to prove it.
Right. You proved the version of Python is not relevant in this case.
[...]
Yeah, it fails miserably, in 99% of the cases.
The correct md5sum is this one
I cannot confirm such a high percentage of failures. In my test, 6 of 10 runs
failed.
So no released combination of Python and Ansible will alter this,
it's just how it works.
That is why we have serial, forks and throttle to tune the behavior and avoid
race conditions.
Right. It would be good to understand who is responsible for what, and when.
Thank you,
-vlado