Ansible stuck in Celery

Hello all,

I am using Celery to run ansible-playbook. My environment:

  • Python 2.7.5
  • celery==4.0.1
  • ansible 2.2.0.0
  • Red Hat 7.0

I use Celery to run a playbook that changes passwords. The playbook task gets stuck from time to time; when I kill the Celery worker, the task continues to run. I can't find any error message in the Celery log file.

The only thing I have found is that the problem is related to the number of hosts in the inventory:
50 hosts may be OK, but 100 hosts definitely triggers the problem.

Has anyone run into this? The problem is really frustrating me.

This is my playbook:

I suggest checking your system logs to see if you are ok for resources / forks / open files etc.

What are you setting forks to (probably in your ansible.cfg, if not specified on the ansible-playbook command line)? You might be asking a lot of your Celery / Ansible controller machine.
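A minimal sketch (Python 3, standard library only) for confirming which forks value a given ansible.cfg actually sets. The helper name `forks_from_cfg` and the list of candidate paths are assumptions for illustration. The sketch does not model the -f/--forks command-line option or the ANSIBLE_CONFIG environment variable, both of which can also change the value.

```python
import configparser
import os


def forks_from_cfg(cfg_paths, default=5):
    """Return the forks value from the first existing config file.

    Ansible uses the first config file it finds and ignores the rest,
    so only the first existing path in cfg_paths is consulted.
    default=5 mirrors Ansible's documented default for forks.
    """
    for path in cfg_paths:
        if os.path.isfile(path):
            # RawConfigParser avoids '%' interpolation surprises.
            parser = configparser.RawConfigParser()
            parser.read(path)
            return parser.getint("defaults", "forks", fallback=default)
    return default


if __name__ == "__main__":
    # Hypothetical candidate list: current directory, home, system-wide.
    candidates = [
        "ansible.cfg",
        os.path.expanduser("~/.ansible.cfg"),
        "/etc/ansible/ansible.cfg",
    ]
    print("forks =", forks_from_cfg(candidates))
```

Run it from the same working directory the Celery task uses, since a local ansible.cfg there would take precedence over the one in /etc/ansible.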

Jon

I raised the nofile and nproc limits in /etc/security/limits.conf to 65535, but that did not help.

Then I changed the forks option in ansible.cfg from 20 to 2; nothing changed.
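One thing worth ruling out: limits.conf is applied by pam_limits when a login session starts, so a worker that was started before the change, or that is launched by an init system rather than from a login shell, may still be running with the old values. A small sketch (the function name `current_limits` is made up for illustration) that prints the limits as seen from inside a Python process, for example at the top of the Celery task:

```python
import resource


def current_limits():
    """Return the (soft, hard) limits this process is actually running with."""
    names = {
        "nofile": resource.RLIMIT_NOFILE,  # max open file descriptors
        "nproc": resource.RLIMIT_NPROC,    # max processes for this user
    }
    return {name: resource.getrlimit(res) for name, res in names.items()}


if __name__ == "__main__":
    for name, (soft, hard) in sorted(current_limits().items()):
        # resource.RLIM_INFINITY means "unlimited".
        print("%s: soft=%s hard=%s" % (name, soft, hard))
```

For a worker that is already running, /proc/<pid>/limits shows the same information without touching the code.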

It appears that when the hang occurs, some ansible-playbook processes become defunct:

yd_hzj 52703 52573 10 16:50 pts/21 00:00:09 ansible-playbook chpwd.yml -e {"ansible_become_method": "su", "user_name_lists": …} -i hosts
yd_hzj 54571 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54575 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54595 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54600 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54616 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54625 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54628 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54631 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54643 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
yd_hzj 54647 52703 0 16:51 pts/21 00:00:00 [ansible-playboo]
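A defunct entry is a child that has already exited but has not yet been reaped (waited on) by its parent, here PID 52703. To track how many such children pile up while the playbook hangs, a rough Linux-only sketch that reads /proc directly could look like this (Python 3; `zombie_children` is a hypothetical helper, not part of Ansible or Celery):

```python
import os


def zombie_children(parent_pid):
    """Return sorted PIDs of defunct (state Z) processes whose parent is parent_pid."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open("/proc/%s/stat" % entry, errors="replace") as fh:
                stat = fh.read()
        except OSError:
            continue  # process disappeared while we were scanning
        # Format: pid (comm) state ppid ...
        # comm may contain spaces or ')', so split on the last ')'.
        fields = stat.rsplit(")", 1)[1].split()
        state, ppid = fields[0], int(fields[1])
        if state == "Z" and ppid == parent_pid:
            zombies.append(int(entry))
    return sorted(zombies)


if __name__ == "__main__":
    import sys
    # Usage: python zombies.py <parent pid>
    print(zombie_children(int(sys.argv[1])))
```

Calling it periodically with the parent ansible-playbook PID shows whether the list keeps growing or stays fixed once the run stalls.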

Any help would be appreciated.