Connection errors in my first self-written ActionModule

A general warning up front: I’m quite new to Python and to Ansible as a whole. Coming from statically typed languages, I’m having some trouble with the dynamic nature of Python.

I’m writing my first ActionModule. The first time it is called from a play, it does exactly what it’s supposed to do. The second time it’s called from the same play, it fails with this error:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AttributeError: 'NoneType' object has no attribute 'gettimeout'
fatal: [172.16.18.22]: FAILED! => {"msg": "Unexpected failure during module execution: 'NoneType' object has no attribute 'gettimeout'", "stdout": ""}

This is not the newest Ansible version. It is the version my company uses, and I can’t update it:

ansible [core 2.16.6]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/etc/ansible/projects/cpe-provisioning/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.11/site-packages/ansible
  ansible collection location = /etc/ansible/ansible-common:/etc/ansible/ansible-common/ansible_collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.11.10 (main, Dec  3 2024, 02:25:00) [GCC 12.2.0] (/usr/local/bin/python3)
  jinja version = 3.1.6
  libyaml = True

I’m working with Nokia devices and their collection from GitHub: nokia/ansible-networking-collections.
This code has not been updated in a while.
I’m testing with nokia.sros.classic and the connection ansible.netcommon.network_cli.

Other things I tried:

  • using conn.get() (same issue)
  • using conn.send_command() a second time in the same method (works)
  • using ansible.netcommon.cli_command twice in the same playbook (works)

My code:

from ansible.errors import AnsibleActionFail, AnsibleAuthenticationFailure, AnsibleConnectionFailure
from ansible.plugins.action import ActionBase
from ansible_collections.ansible.netcommon.plugins.cliconf.default import CliconfBase
import re
from typing import cast

class ActionModule(ActionBase):
    BOTH_DIRECTORY_LINE = re.compile(r'^(?P<date>\d{2}/\d{2}/\d{4})  (?P<time>\d{2}:\d{2}(a|p))\s*<DIR>\s*(?P<name>.*)$', re.MULTILINE)
    BOTH_FILE_LINE = re.compile(r'^(?P<date>\d{2}/\d{2}/\d{4})  (?P<time>\d{2}:\d{2}(a|p))\s*(?P<size>\d+) (?P<name>.*)$', re.MULTILINE)
    BOTH_LISTED_DIRECTORY = re.compile(r'^Directory of (?P<dir_name>\S+)$', re.MULTILINE)
    CLASSIC_FILE_NOT_FOUND = re.compile(r'^MINOR.*?File Not Found.*?$', re.MULTILINE)
    MD_FILE_NOT_FOUND = re.compile(r'^MINOR:.*?open directory.*?$', re.MULTILINE)

    _requires_connection = True
    _supports_check_mode = False

    def _get_file_stats(self, task_vars) -> dict:
        conn: CliconfBase = self._connection # NOTE: It's a network_cli.Connection, but it proxies all unknown attributes to Cliconf
        output: dict = {}
        command_output = ''
        if cast(str, task_vars['ansible_network_os']).endswith('classic'):
            file_command = 'dir'
        else:
            file_command = 'list'
        try:
            command_output = conn.send_command(f'file {file_command} {self._task.args["path"]}')
        except AnsibleConnectionFailure as e: # The way Nokia has written Nokia.sros.classic and .md means that a file not found is an AnsibleConnectionFailure
            error_message = ''
            match_classic = self.CLASSIC_FILE_NOT_FOUND.search(e.message)
            if match_classic:
                error_message = match_classic.group(0)
            match_md = self.MD_FILE_NOT_FOUND.search(e.message)
            if match_md:
                error_message = match_md.group(0)
            if error_message:
                output = {'exists': False, 'error_message': error_message}
            else:
                raise # It's a real connection failure
        if command_output:
            listed_directory = self.BOTH_LISTED_DIRECTORY.search(command_output).group('dir_name')
            if listed_directory.endswith('\\'):
                print('directory')
                output['directory'] = True
                files = []
                for file in self.BOTH_FILE_LINE.finditer(command_output):
                    files.append({
                        'date': file.group('date'),
                        'time': file.group('time'),
                        'size': file.group('size'),
                        'name': file.group('name')
                    })
                directories = self.BOTH_DIRECTORY_LINE.findall(command_output)
            else:
                print('file')
                output['directory'] = False
                match = self.BOTH_FILE_LINE.search(command_output)
                output['date'] = match.group('date')
                output['time'] = match.group('time')
                output['size'] = match.group('size')
                output['name'] = match.group('name')
        return output


    def run(self, tmp=None, task_vars=None):
        self.validate_argument_spec(
            argument_spec=dict(
                state=dict(type='str', required=True),
                name=dict(type='str', required=True),
                path=dict(type='str')
             )
        )
        if task_vars['ansible_network_os'] not in ['nokia.sros.classic', 'nokia.sros.md']:
            raise AnsibleActionFail(message='ansible_network_os must be nokia.sros.classic or nokia.sros.md')
        if task_vars['ansible_connection'] != 'ansible.netcommon.network_cli':
            raise AnsibleActionFail(message='ansible_connection must be ansible.netcommon.network_cli')
        result = super(ActionModule, self).run(tmp, task_vars)
        result['retval'] = self._get_file_stats(task_vars=task_vars)
        return result

As I’m very used to static typing, I really like type hints, and I’m very open to suggestions on how to make this cleaner. I have looked at the code from cisco.ios.ios_commands, but let’s say: “I’m not there yet.”
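
One possible way to make this cleaner, as a minimal sketch only: move the parsing into a pure, type-hinted function that never touches the connection, so it can be tested without a device. The names `Entry`, `parse_listing` and `SAMPLE` are hypothetical, and the sample text is made up to fit the patterns from the post, not captured from a real SR OS device. The file-line pattern is adapted from `BOTH_FILE_LINE`, with the size required to have at least one digit so that `<DIR>` rows cannot match it.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Patterns adapted from the post; the time suffix is written as [ap].
DIR_LINE = re.compile(
    r'^(?P<date>\d{2}/\d{2}/\d{4})  (?P<time>\d{2}:\d{2}[ap])\s*<DIR>\s*(?P<name>.*)$',
    re.MULTILINE,
)
FILE_LINE = re.compile(
    r'^(?P<date>\d{2}/\d{2}/\d{4})  (?P<time>\d{2}:\d{2}[ap])\s*(?P<size>\d+) (?P<name>.*)$',
    re.MULTILINE,
)


@dataclass(frozen=True)
class Entry:
    """One row of a directory listing (hypothetical helper type)."""
    name: str
    date: str
    time: str
    is_dir: bool
    size: Optional[int] = None  # directories carry no size


def parse_listing(text: str) -> List[Entry]:
    """Return directory rows first, then file rows, found in the listing text."""
    entries = [
        Entry(m['name'].strip(), m['date'], m['time'], True)
        for m in DIR_LINE.finditer(text)
    ]
    entries += [
        Entry(m['name'].strip(), m['date'], m['time'], False, int(m['size']))
        for m in FILE_LINE.finditer(text)
    ]
    return entries


# Made-up sample, shaped only to satisfy the regular expressions above.
SAMPLE = (
    "Directory of cf1:\\\n"
    "\n"
    "01/02/2024  10:15a      <DIR>          config\n"
    "01/02/2024  10:16a                1234 config.cfg\n"
)

for entry in parse_listing(SAMPLE):
    print(entry)
```

With the parsing split out like this, `_get_file_stats` would only need to fetch the text and hand it over, and the type checker can see exactly which fields each entry has.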

The playbook

---
- name: testplaybook
  hosts: 172.16.18.22
  gather_facts: False
  connection: network_cli
  vars:
    ansible_network_os: nokia.sros.classic
  vars_prompt:
    - name: ansible_user
      prompt: user
      private: False
    - name: ansible_password
      prompt: password
      unsafe: True
      private: True
  tasks:
    - name: test existing file Classic
      proxsys.nokia.directory:
        state: code not there yet
        name: code not there yet
        path: cf1:config.cfg
      register: test1_output
    - name: print test1_output
      ansible.builtin.debug:
        var: test1_output
    - name: test existing file MD
      proxsys.nokia.directory:
        state: code not there yet
        name: code not there yet
        path: cf3:config.cfg
      register: test2_output
    - name: print test2_output
      ansible.builtin.debug:
        var: test2_output

For this problem, I really would like to know why I can’t call the action module twice in a row from the same play.

If you run with -vvv you will hopefully see the stacktrace pointing to code that is calling gettimeout and how it got there.

I ran the code with -vvv and there is a stack trace. It seems to first go wrong in /usr/share/ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py:

TASK [test existing file MD] **************************************************************************************************************************************************************
task path: /workspaces/Werkgroep - Netwerken - NetDevSamples/Mathijs/collection_test/test2-playbook.yml:63
The full traceback is:
Traceback (most recent call last):
  File "/usr/share/ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 1003, in send
    self._ssh_shell.sendall(cmd)
    ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'sendall'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/ansible/executor/task_executor.py", line 165, in run
    res = self._execute()
          ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ansible/executor/task_executor.py", line 637, in _execute
    result = self._handler.run(task_vars=vars_copy)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/Werkgroep - Netwerken - NetDevSamples/Mathijs/collection_test/collections/ansible_collections/proxsys/nokia/plugins/action/directory.py", line 77, in run
    result['retval'] = self._get_file_stats(task_vars=task_vars)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspaces/Werkgroep - Netwerken - NetDevSamples/Mathijs/collection_test/collections/ansible_collections/proxsys/nokia/plugins/action/directory.py", line 26, in _get_file_stats
    command_output = conn.send_command(f'file {file_command} {self._task.args["path"]}')
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/ansible/plugins/cliconf/__init__.py", line 127, in send_command
    resp = self._connection.send(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 345, in wrapped
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/share/ansible/collections/ansible_collections/ansible/netcommon/plugins/connection/network_cli.py", line 1034, in send
    % (self._ssh_shell.gettimeout(), command.strip())
       ^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'gettimeout'
fatal: [172.16.18.22]: FAILED! => {
    "msg": "Unexpected failure during module execution: 'NoneType' object has no attribute 'gettimeout'",
    "stdout": ""
}

PLAY RECAP ********************************************************************************************************************************************************************************
172.16.18.22               : ok=2    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   

I also changed my code a bit to check the value of conn._conn_closed. It is False, so there is a connection object and it is not marked as closed.
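
The traceback suggests that `_conn_closed` alone does not tell the whole story: `_ssh_shell` is `None` at the moment `send` runs, even though the connection is not flagged as closed. A small diagnostic like the following sketch could report both values side by side. The function name `connection_state` is hypothetical, the two attribute names are taken from the post and its traceback, and the `SimpleNamespace` object is only a stand-in for the real connection so the snippet runs without Ansible.

```python
from types import SimpleNamespace
from typing import Any, Dict


def connection_state(conn: Any) -> Dict[str, Any]:
    """Collect the two private attributes that matter for this failure.

    getattr with a default is used so a missing attribute shows up as
    None instead of raising AttributeError.
    """
    return {
        'conn_closed': getattr(conn, '_conn_closed', None),
        'has_ssh_shell': getattr(conn, '_ssh_shell', None) is not None,
    }


# Stand-in reproducing the state seen in the post: not closed, but no shell.
stand_in = SimpleNamespace(_conn_closed=False, _ssh_shell=None)
print(connection_state(stand_in))
# {'conn_closed': False, 'has_ssh_shell': False}
```

Inside the action plugin, the same function could be called with `self._connection` just before `send_command`, once in the first task and once in the second, to see which value changes between the two calls.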