Thoughts, experiences and ideas on using LLMs or specialized AI models for Ansible validation

Hi all. I would like to share some issues I’ve been dealing with recently and would like to hear your experiences, ideas and thoughts. Bear with me, this will be a slightly longer post.

The issue revolves around using LLMs, or possibly specialized AI models (if such exist), for validation, compliance enforcement and error correction of Ansible code and other input data. There is a predominant understanding, especially among higher management, that modern AI tools can solve most of the tedious manual human error correction tasks if you just feed them all of the data and give them instructions on how to “sort this out”.

So here is my example. Let’s say we have around 350 Ansible projects. Projects have a predefined directory structure for collections, roles, group and host vars, inventory and playbooks. Each project describes one setup consisting of a number of VMs and the services deployed to them. There are predefined rules for project and VM naming, required inventory groups, group naming and group hierarchy. We currently rely on human input to correctly define inventory data, including VM naming, group membership and other inventory data in general. As can be expected, we encounter a lot of subtle human-made errors: inconsistencies, typos, ordering issues, collisions (two VMs with the same name, for example) etc.
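
To make it concrete, most of these rules are mechanically checkable. Here is a minimal sketch of the kind of deterministic validator I mean, assuming a layout like `projects/<name>/inventory/hosts.yml` and a VM naming convention I made up for the example; your structure and rules will differ:

```python
#!/usr/bin/env python3
"""Minimal sketch of a rule-based inventory validator.

Assumptions (adjust to your layout): each project lives under
projects/<name>/inventory/hosts.yml, and VM names must match a
hypothetical "<project>-<role>-<two-digit index>" convention.
"""
import re
from collections import defaultdict
from pathlib import Path

import yaml  # pip install pyyaml

VM_NAME_RE = re.compile(r"^[a-z0-9]+-[a-z0-9]+-\d{2}$")  # hypothetical rule


def iter_hosts(groups):
    """Yield host names from a YAML inventory, recursing into child groups."""
    for group in (groups or {}).values():
        if not isinstance(group, dict):
            continue
        yield from (group.get("hosts") or {})
        yield from iter_hosts(group.get("children"))


def main(root: str = "projects") -> None:
    seen = defaultdict(list)  # VM name -> projects it appears in
    for inv in Path(root).glob("*/inventory/hosts.yml"):
        project = inv.parts[1]
        data = yaml.safe_load(inv.read_text()) or {}
        for host in iter_hosts(data):
            seen[host].append(project)
            if not VM_NAME_RE.match(host):
                print(f"{project}: '{host}' violates naming convention")
    for host, projects in seen.items():
        if len(projects) > 1:
            print(f"collision: '{host}' defined in {', '.join(projects)}")


if __name__ == "__main__":
    main()
```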

Since the number of projects keeps growing and human-made errors keep piling up, it is becoming challenging to keep an overview of all of the projects and thousands of VMs, and said errors are increasingly becoming a cause of all kinds of issues.

That being said, what AI-powered tools are out there that could possibly ingest all this data and “sort this out”? Do you have any positive experiences?

My understanding is that for general purpose LLMs, the token input limit would be the first obstacle. If I wanted to have an LLM deal only with the inventory data, that would be around 1 MB of data (roughly 300k tokens). The next issue is that with this amount of data, LLMs quickly lose comprehension and start to deviate, make errors of their own and hallucinate.
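
You can put numbers on that estimate yourself. A quick sketch using OpenAI’s `tiktoken` tokenizer as an approximation (exact counts depend on the model; the `projects` path is my assumption from the example above):

```python
"""Rough token-count estimate for inventory data, using tiktoken
(pip install tiktoken) as an approximation; exact counts vary by model."""
from pathlib import Path

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

total_bytes = total_tokens = 0
for path in Path("projects").rglob("*.yml"):
    text = path.read_text(errors="replace")
    total_bytes += len(text.encode())
    total_tokens += len(enc.encode(text))

print(f"{total_bytes / 1e6:.1f} MB -> ~{total_tokens:,} tokens")
```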


So to add to this (one-sided :smile:) discussion, here are some thoughts and experiences of my own with ChatGPT. You can make it ingest a large amount of data as an archive uploaded to a Jupyter environment/notebook. This data does not go through the language model, because it would break the token limit. The language model can, however, write Python snippets that refactor said data based on rules you feed into the model. This way, no limits apply. In other words, it can be described as a glorified “grep” and “sed” runner.

Since the data does not go through the language model, you cannot tell it to try to “understand” the data and infer meaning, rules, principles etc. present in it. It can only write complex scripts that do a “search and replace”. For that, you have to specify very precise rules to apply. It’s basically the same as asking it to generate Python snippets and then running them on your dataset locally (on your own computer/server), as in the sketch below.
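
As an example of how precise those rules have to be, this is roughly the kind of snippet the model ends up producing. The group names here are made up for illustration; it is nothing smarter than scripted sed:

```python
"""Hypothetical example of the 'glorified sed' snippets such a model
produces: rename an inventory group across every project, in file
contents and in group_vars file names. Names are made up."""
import re
from pathlib import Path

OLD, NEW = "webservers", "web_servers"  # hypothetical rule given to the model
# Match the group name only as a whole word, so 'webservers_db' is untouched.
pattern = re.compile(rf"\b{re.escape(OLD)}\b")

for path in Path("projects").rglob("*.yml"):
    text = path.read_text()
    if pattern.search(text):
        path.write_text(pattern.sub(NEW, text))
        print(f"rewrote {path}")

# group_vars/webservers.yml -> group_vars/web_servers.yml
for path in Path("projects").rglob(f"group_vars/{OLD}.yml"):
    path.rename(path.with_name(f"{NEW}.yml"))
    print(f"renamed {path} -> {NEW}.yml")
```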

Simple stuff like “please find me any typos” cannot be done for non-dictionary words and other special strings. I mean, for this specific request the model could possibly create a complex Python script that does some statistical analysis of words and flags what could possibly be a typo based on how much more often the correct word occurs than the misspelled one, but… that’s a stretch.
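
For completeness, that frequency-based idea would look roughly like this: flag rare tokens that sit close (by edit distance) to much more frequent ones, on the assumption that the correct spelling dominates. The thresholds are guesses, not tuned values:

```python
"""Sketch of the statistical typo hunt described above: count all
word-ish tokens across the tree, then flag rare tokens that closely
resemble frequent ones. Thresholds are guesses, not tuned values."""
import difflib
import re
from collections import Counter
from pathlib import Path

# Collect word-ish tokens from every YAML file in the tree.
tokens = Counter()
for path in Path("projects").rglob("*.yml"):
    tokens.update(re.findall(r"[a-z][a-z0-9_]{2,}", path.read_text().lower()))

common = [t for t, n in tokens.items() if n >= 10]  # assumed 'correct' spellings
rare = [t for t, n in tokens.items() if n <= 2]     # typo candidates

for word in rare:
    # difflib's similarity ratio is a cheap stand-in for real edit distance.
    for match in difflib.get_close_matches(word, common, n=1, cutoff=0.85):
        print(f"possible typo: '{word}' (x{tokens[word]}) "
              f"vs '{match}' (x{tokens[match]})")
```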