Should the systemd_service module include a reset_failed option?

Hello everyone,

Some friends have pointed out to me that using Ansible to manage systemd services can sometimes lead to unexpected results.

More specifically, a service that sets StartLimitBurst may fail to start because the unit has exceeded its start rate limit.

What does this mean?

The module itself behaves correctly, but the systemctl command it runs fails once the unit has hit the start rate limit set by StartLimitBurst and StartLimitIntervalSec, that is, more than StartLimitBurst start attempts within StartLimitIntervalSec.

The module then reports that command failure as a task failure.

For testing, I used a modified version of the ansible_test service.

[Unit]
Description=Ansible Test Service
StartLimitBurst=1
StartLimitIntervalSec=60

[Service]
ExecStart=/usr/sbin/ansible_test_service "Test\nthat newlines in scripts\nwork"
ExecReload=/bin/true
Restart=on-failure
Type=forking
PIDFile=/var/run/ansible_test_service.pid

[Install]
WantedBy=multi-user.target

# systemctl status ansible_test.service
× ansible_test.service - Ansible Test Service
     Loaded: loaded (/etc/systemd/system/ansible_test.service; enabled; preset: disab
     Active: failed (Result: signal) since Fri 2025-02-21 18:53:10 CET; 17min ago
   Duration: 4.504s
    Process: 1887 ExecStart=/usr/sbin/ansible_test_service Test
that newlines in scripts
work (code=exited, status=0/SUCCESS)
   Main PID: 1889 (code=killed, signal=KILL)
        CPU: 892ms

Feb 21 18:53:10 ansiblecn systemd[1]: ansible_test.service: Scheduled restart job, re
Feb 21 18:53:10 ansiblecn systemd[1]: Stopped Ansible Test Service.
Feb 21 18:53:10 ansiblecn systemd[1]: ansible_test.service: Start request repeated to
Feb 21 18:53:10 ansiblecn systemd[1]: ansible_test.service: Failed with result 'signa
Feb 21 18:53:10 ansiblecn systemd[1]: Failed to start Ansible Test Service.
Feb 21 18:53:10 ansiblecn systemd[1]: ansible_test.service: Start request repeated to
Feb 21 18:53:10 ansiblecn systemd[1]: ansible_test.service: Failed with result 'signa
Feb 21 18:53:10 ansiblecn systemd[1]: Failed to start Ansible Test Service.

TASK [try start after fail] *************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Unable to start service ansible_test: 
Job for ansible_test.satus ansible_test.service\" and \"journalctl -xeu ansible_test.service\" 
for details.\n"}

Would it be useful to handle reset-failed directly within the systemd_service module?
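
For comparison, without such an option the reset has to be a separate task that runs before the start. The following is only a minimal sketch of that workaround, assuming the ansible_test unit from above; the `changed_when: false` choice is an assumption about how one might want the reset reported, not something the module prescribes.

```yaml
# Workaround sketch: clear the unit's failed state and start rate limit counter,
# then start it with the regular module.
- name: Reset failed state of ansible_test
  ansible.builtin.command: systemctl reset-failed ansible_test.service
  changed_when: false  # assumption: treat the reset as bookkeeping, not a change

- name: Start ansible_test
  ansible.builtin.systemd_service:
    name: ansible_test
    state: started
```

One caveat with this approach is that `systemctl reset-failed` may itself return an error if the named unit is not loaded, so the first task might need its own error handling.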

For testing, I created a local copy of the module and added a new boolean option, reset_failed, plus a code block that runs reset-failed on the service before any state operation. With the option enabled, the same task succeeds:

TASK [try start after fail] **************************************************************
changed: [localhost] => {"changed": true, "name": "ansible_test", "state": "started",...}}

The rest of the module remains unchanged.
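
A task using the proposed option might look roughly like the sketch below. The `reset_failed` parameter is hypothetical: it exists only in the locally patched copy described here, not in the released module, and the exact task behind the output above was not shown.

```yaml
# Hypothetical usage of the proposed option (not available in the released module).
- name: try start after fail
  ansible.builtin.systemd_service:
    name: ansible_test
    state: started
    reset_failed: true  # proposed: run reset-failed on the unit before the state change
```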

Would this change be useful? Would it be worth opening an issue and a related PR?


I’d find it useful. I have a handler for this in one role:

    - name: Reset failed  # noqa: no-changed-when
      ansible.builtin.command: systemctl reset-failed
      listen: Reset failed
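
Because no unit name is passed, that handler resets every failed unit on the host. A narrower variant could target a single unit, as in this sketch; the variable `service_name` is a hypothetical name introduced for illustration and would have to be defined by the role.

```yaml
# Sketch of a per-unit variant of the handler above.
# `service_name` is a hypothetical role variable, e.g. "ansible_test".
- name: Reset failed for one unit  # noqa: no-changed-when
  ansible.builtin.command: "systemctl reset-failed {{ service_name }}.service"
  listen: Reset failed for one unit
```

Note that handlers run at the end of the play by default, so a reset done this way only helps a start that happens after the handlers are flushed.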