Managing Network Config Drift with NetBox and Ansible, Part 3


In Part 1 of this series, we established the risks of network configuration drift and the critical role a Single Source of Truth (SSoT) plays in maintaining network integrity. We demonstrated a foundational workflow where Ansible used a rendered configuration file as the SSoT, performed a diff to detect unauthorized changes, and then used a remediation playbook to bring the device back into compliance.

In Part 2 of this series, we elevated our approach to drift management by moving our Single Source of Truth (SSoT) to structured YAML data within a Git repository. We demonstrated how Ansible’s stateful and idempotent network resource modules enable a more robust workflow. By running a playbook first in check_mode, we could get a precise diff of a specific resource, and then run the same playbook again in normal mode to have Ansible intelligently remediate only the drifted configurations.

Now, in this final part, we will take the next leap by integrating a purpose-built SSoT, NetBox, and leveraging Event-Driven Ansible to create a fully automated, closed-loop system for drift remediation.

Manage Config Drift with NetBox and Ansible

Compare the intended configuration stored in NetBox config context (structured data) against the running configuration of network devices by using Event-Driven Ansible and Ansible resource modules.

In this final scenario, we launch the following Ansible Automation Platform workflow whenever NetBox triggers a rulebook in the Event-Driven Ansible controller. Please note that this post focuses on the drift management aspects of the EDA and AAP workflow, not on the NetBox configuration in its entirety.

[Figure: Workflow]

Key Takeaways:

SSoT: For this demo, NetBox acts as the external network source of truth. Whenever configuration context data in NetBox’s main branch is updated, NetBox automatically sends a webhook to the Event-Driven Ansible controller, which then launches the remediation workflow in the automation controller.

Drift: Drift is detected by comparing the before state, which is gathered from the switch, with the intended configuration from NetBox. The ntp_global network resource module does this when it runs with state: replaced in check mode: it reads the current configuration from the device, compares it to the intended configuration, and reports the differences without changing anything.

Remediation: The same resource module can be scheduled or re-run later in normal (non-check) mode to apply the needed changes. The remediation step is depicted by the NTP Config Push (job_template) in the workflow above.

Git Repo:
The example code for this section (playbooks, roles, rulebooks, config files) is located here

[Figure: File tree from the repo]

Demo replay

In this short demo, NetBox acts as the source of truth for Cisco Catalyst switch inventory and configuration. Changes in NetBox trigger a webhook to Event-Driven Ansible, launching a workflow that automatically corrects any configuration drift.

NetBox Config Context

The premise of this demo is to establish a single source of truth with structured configuration data defined in NetBox config context. NetBox also offers alternatives we could have used, such as configuration templates rendered with Jinja2.

[Figure: NetBox Config Context]

NetBox serves as the source of truth by storing structured data, such as the above list of approved NTP servers, within its config context feature. This data can be applied globally or targeted to specific sites, regions, or device roles. In this scenario, whenever the NTP_IOS context is updated and merged to main, NetBox event rules trigger a webhook with this data payload to the Event-Driven Ansible controller.
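For readers without access to the screenshot, the NTP_IOS context data can be sketched as follows. This is a reconstruction, not a copy of the NetBox screen: the ntp_ios key comes from the ntp_config.ntp_ios variable used in the playbook, and the server entries come from the diff output later in this post. NetBox stores config context as JSON, so the YAML form here is an assumption made for readability.

```yaml
# Sketch of the NTP_IOS config context data (reconstructed, shown as YAML).
# The structure under ntp_ios matches the "config" argument of
# cisco.ios.ios_ntp_global, which is why it can be passed to the module as-is.
ntp_ios:
  servers:
    - server: 192.0.11.1
      version: 2
    - server: 192.0.11.2
      version: 2
```

Keeping the context data in the same shape as the resource module's argument spec means no translation layer is needed between NetBox and the playbook.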

NTP Rulebook

The rulebook listens for webhooks from NetBox and acts only on those for the NTP_IOS config context. When a matching webhook arrives, the rule fires and passes the contents of the webhook to the AAP workflow as extra variables.

---
- name: Listen for netbox events on a webhook
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5001

  rules:

## NTP
    - name: NTP updates
      condition: event.payload.data.name == 'NTP_IOS'
      throttle:
        once_after: 15 minutes
        group_by_attributes:
          - event.payload.data.name
      action:
        run_workflow_template:
          organization: "Red Hat network organization"
          name: "netbox-ntp-drift-workflow"
          job_args:
            extra_vars:
              ntp_config: "{{ event.payload.data.data }}"
              _groups: "device_types_{{ event.payload.data.device_types[0].model }}"


How the Rulebook Works

  1. Evaluate the Rule:

    • The rules section defines the logic.

    • condition: The rule only proceeds if the incoming webhook’s data payload contains a name field with the exact value ‘NTP_IOS’. This ensures the rulebook only reacts to events related to this specific configuration context.

    • throttle: This setting prevents the rule from firing too often. With once_after: 15 minutes, events with the same name (NTP_IOS) are grouped together, and the action runs once after the 15-minute window has passed, even if multiple updates happen in quick succession.

  2. Perform an Action:

    • If the condition is met, the action block is executed.

    • run_workflow_template: It tells the EDA controller to launch a specific workflow template named netbox-ntp-drift-workflow in Ansible Automation Platform.

    • job_args: It passes crucial data from the NetBox webhook into the workflow as extra_vars:

      • ntp_config: The actual NTP configuration data from NetBox is passed into this variable.

      • _groups: A dynamic inventory group is constructed (e.g., device_types_C9300-48P) to ensure the workflow runs only against the relevant devices.
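To make the event paths in the rulebook easier to follow, here is a sketch of the parts of the webhook event that the rule reads. It is an illustration only: NetBox sends JSON and includes many more fields than shown, the values are taken from examples elsewhere in this post, and the YAML rendering is an assumption made for readability.

```yaml
# Sketch of the event as seen by the rulebook (only the fields the rule uses).
# ansible.eda.webhook places the request body under event.payload.
payload:
  data:
    name: NTP_IOS              # matched by: event.payload.data.name == 'NTP_IOS'
    data:                      # passed to the workflow as ntp_config
      ntp_ios:
        servers:
          - server: 192.0.11.1
            version: 2
    device_types:              # first entry's model builds the _groups value
      - model: C9300-48P       # results in _groups: device_types_C9300-48P
```

With this shape, ntp_config.ntp_ios in the playbook resolves to the servers dictionary, and _groups resolves to the inventory group for the affected device type.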

NTP Rulebook Activation (EDA)

~~~

ansible_rulebook.websocket - DEBUG - Event received, {'type': 'SessionStats', 'activation_id': '1294', 'activation_instance_id': '1294', 'stats': {'start': '2025-06-30T19:11:06.241617660Z', 'lastClockTime': '2025-06-30T19:31:08.044Z', 'clockAdvanceCount': 12018, 'numberOfRules': 1, 'numberOfDisabledRules': 0, 'rulesTriggered': 0, 'eventsProcessed': 0, 'eventsMatched': 0, 'eventsSuppressed': 0, 'permanentStorageCount': 0, 'permanentStorageSize': 0, 'asyncResponses': 0, 'bytesSentOnAsync': 0, 'sessionId': 1, 'ruleSetName': 'Listen for netbox events on a webhook', 'lastRuleFired': '', 'usedMemory': 5526512, 'maxAvailableMemory': 518979584}, 'reported_at': '2025-06-30T19:31:08.106108Z'}

~~~
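The stats in this log show one loaded rule with eventsProcessed and rulesTriggered both at 0, meaning the activation is running but no webhook has arrived yet. When verifying that NetBox webhooks reach the activation, a temporary catch-all rulebook can help. The following is a minimal troubleshooting sketch and is not part of the demo repository; the rulebook and rule names are hypothetical, and it assumes the same host and port as the NTP rulebook.

```yaml
---
# Hypothetical troubleshooting rulebook: prints every webhook it receives.
- name: Print incoming NetBox webhooks
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5001

  rules:
    - name: Print every webhook payload
      # Matches any event that carries a payload, regardless of its content.
      condition: event.payload is defined
      action:
        print_event:
          pretty: true
```

Once a printed event confirms the payload structure, the activation can be switched back to the NTP rulebook.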

Drift Playbook

Below is an excerpt from our drift check playbook, diff.yml. This example shows how we validate the NTP server configuration using Ansible’s stateful resource modules. The ntp_global resource module runs in check mode to determine the differences between the NetBox config context and the switch’s running configuration.

---
- name: DIFF NTP_IOS config from Netbox SSOT
  hosts: "{{ _groups }}"  # group name passed in by the rulebook as an extra var
  gather_facts: false
  vars:
    ansible_network_os: cisco.ios.ios
    ansible_connection: network_cli

  tasks:

    # Check mode: compute the required changes without applying them.
    - name: Check/Replace existing NTP for compliance
      cisco.ios.ios_ntp_global:
        config: "{{ ntp_config.ntp_ios }}"
        state: replaced
      register: ntp_diff
      check_mode: true

    # Current NTP configuration as read from the device.
    - name: Set Facts Before
      ansible.builtin.set_fact:
        fact_before: "{{ ntp_diff.before }}"
      delegate_to: localhost

    # Intended configuration from NetBox, recorded only when drift exists.
    - name: Set Facts After
      ansible.builtin.set_fact:
        fact_after: "{{ ntp_config.ntp_ios }}"
      delegate_to: localhost
      when: ntp_diff.changed

    - name: Set Facts After Empty
      ansible.builtin.set_fact:
        fact_after: {}
      delegate_to: localhost
      when: not ntp_diff.changed

    # Print a diff of the device state versus the intended state.
    - name: Run DIFF check between before and after
      ansible.utils.fact_diff:
        before: "{{ ntp_diff.before }}"
        after: "{{ ntp_config.ntp_ios }}"
      diff: true
      delegate_to: localhost

    # An empty command list means the device already matches NetBox.
    - name: No NTP DRIFT
      ansible.builtin.debug:
        msg: "No NTP DRIFT found for {{ inventory_hostname }}"
      when: ntp_diff.commands | default([]) | length == 0
      delegate_to: localhost

    - name: NTP DRIFT
      ansible.builtin.debug:
        msg:
          - "NTP DRIFT found for {{ inventory_hostname }}"
          - "The following commands need to be applied to {{ inventory_hostname }}"
          - "{{ ntp_diff.commands | default('no commands') }}"
      when: ntp_diff.commands | default([]) | length > 0

How the diff.yml Playbook Works

Check for Drift (in Dry-Run Mode):

The first task uses the cisco.ios.ios_ntp_global module to compare the intended configuration (passed in via the ntp_config.ntp_ios variable) against the live device.

check_mode: yes is the most important part. It ensures Ansible does not apply any changes. It only calculates what would change.

The result, including the "before" state and a list of commands needed to fix any drift, is registered in the ntp_diff variable.

Process the Results:

The next several tasks (set_fact and fact_diff) are used to process the results on the Ansible controller (delegate_to: localhost). They prepare the “before” and “after” data for comparison and reporting.

Report the Findings:

If No Drift: The "No NTP DRIFT" task runs if the ntp_diff variable shows that no changes were needed. It prints a simple success message.

If Drift Is Found: The "NTP DRIFT" task runs if changes are detected. It prints a clear message stating that drift was found and, most importantly, displays the exact CLI commands that an administrator would need to run on the switch to bring it back into compliance.

Drift Playbook Output

The diff output below shows the results of our configuration drift check for NTP servers. The before “-” state is an empty dictionary because the live switch had no NTP servers configured. The after “+” state shows the intended configuration, which was passed in as structured data from NetBox through an Event-Driven Ansible webhook.

Following the diff, the next task prints the exact CLI commands required to apply this intended configuration to the switch and correct the drift.

--- before
+++ after
@@ -1 +1,12 @@
-{}
+{
+    "servers": [
+        {
+            "server": "192.0.11.1",
+            "version": 2
+        },
+        {
+            "server": "192.0.11.2",
+            "version": 2
+        }
+    ]
+}
changed: [clab-cat-leaf1 -> localhost]
TASK [NTP DRIFT] ***************************************************************
ok: [clab-cat-leaf1] => {
   "msg": [
       "NTP DRIFT found for clab-cat-leaf1",
       "The following commands need to be applied to clab-cat-leaf1",
       [
           "ntp server 192.0.11.1 version 2",
           "ntp server 192.0.11.2 version 2"
       ]
   ]
}
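The remediation playbook behind the NTP Config Push job template is not shown in this post. A minimal sketch of what such a playbook could look like follows, assuming it reuses the same module call as diff.yml without check mode and receives the same ntp_config and _groups extra variables from the workflow. The play name, task names, and the ntp_push variable are hypothetical.

```yaml
---
# Hypothetical remediation sketch: same module call as diff.yml, minus check mode.
- name: PUSH NTP_IOS config from NetBox SSoT
  hosts: "{{ _groups }}"
  gather_facts: false
  vars:
    ansible_network_os: cisco.ios.ios
    ansible_connection: network_cli

  tasks:

    # state: replaced makes the device's NTP configuration match NetBox,
    # adding missing servers and removing any that are not in the context.
    - name: Replace NTP configuration with the intended state
      cisco.ios.ios_ntp_global:
        config: "{{ ntp_config.ntp_ios }}"
        state: replaced
      register: ntp_push

    # Record which commands were sent, for the job output.
    - name: Show the commands that were pushed
      ansible.builtin.debug:
        var: ntp_push.commands
```

Because the resource module is idempotent, running this against a compliant device should report no change and send no commands.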

Bringing it all together

This three-part series demonstrates that managing network configuration drift is essential for maintaining a secure, compliant, and reliable network. The core principle is establishing a Single Source of Truth (SSoT) and using Ansible as the automation engine to enforce it. We explored a clear progression of automation maturity through three distinct demos. The journey began with a foundational approach using static, templated configuration files as the SSoT. It then evolved to a more robust and scalable method using structured data from Git repositories with Ansible’s idempotent resource modules. Finally, the series culminated in a fully automated, event-driven workflow, showing how a purpose-built SSoT like NetBox can use webhooks to trigger Event-Driven Ansible, enabling automated drift detection and remediation. Ultimately, these examples show that regardless of an organization’s starting point, Ansible provides a flexible and powerful framework to systematically eliminate configuration drift.

A Note on Validated Content

While Ansible Validated Content provides ready-made roles for drift management, we chose to focus directly on the resource modules in this post. The reason is simple: resource modules are the fundamental building blocks used within those validated roles. Understanding how they work provides a stronger foundation for all network automation.

Further Exploration

While not covered directly in this post, it’s worth noting that you can combine network resource modules with scoped Jinja2 templates. This allows you to create comprehensive diff checks that include configuration parameters not yet available in the resource modules themselves. Stay tuned for more on this topic…

Managing Network Config Drift with Ansible Part 1

Managing Network Config Drift with Ansible and Resource Modules Part-2