Hi all. I would like to share some issues I’ve been dealing with recently and would like to hear your experiences, ideas and thoughts. Bear with me, this will be a slightly longer post.
The issue revolves around the usage of LLMs or possibly specialized AI models (if they exist) in validation, compliance enforcement and error correction of Ansible code and other input data. There is a predominant understanding, especially among higher management, that modern AI tools can solve most of the tedious manual human error correction tasks if you just feed them all of the data and give them instructions on how to “sort this out”.
So here is my example. Let’s say we have around 350 Ansible projects. Projects have a predefined structure of directories for collections, roles, group and host vars, inventory and playbooks. Each project describes one setup consisting of a number of VMs and services deployed to them. There are predefined rules for project and VM naming, required inventory groups, group naming and group hierarchy. We currently rely on human input to correctly define inventory data including VM naming, group membership and other inventory data in general. As can be expected, we encounter a lot of subtle human-made errors, inconsistencies, typos, ordering issues, collisions (two VMs with the same name, for example) etc.
Since the number of projects is increasing and human-made errors are piling up over time, it is becoming challenging to keep an overview of all of the projects and thousands of VMs, and said errors are increasingly becoming a cause of all kinds of issues.
That being said, what AI-powered tools are out there that could possibly ingest all this data and “sort this out”? Do you have any positive experiences?
My understanding is that for general purpose LLMs, the input token limit would be the first obstacle. If I wanted to let an LLM deal only with inventory data, that would be around 1 MB of data (roughly 300k tokens). The next issue would be that with this amount of data, LLMs will quickly lose comprehension and start to deviate, make errors themselves and hallucinate.
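As a rough back-of-the-envelope check of that figure, here is a minimal Python sketch. It assumes the ratio implied by my own numbers (1 MB for roughly 300k tokens, so about 3 tokens per 10 bytes). Real tokenizers will give different counts for inventory text, and the function names and the context limit used here are purely illustrative.

```python
def estimate_tokens(size_bytes: int) -> int:
    """Rough token estimate, assuming ~3 tokens per 10 bytes (1 MB -> ~300k tokens)."""
    return size_bytes * 3 // 10


def fits_in_context(size_bytes: int, context_limit: int) -> bool:
    """Return True if the estimated token count fits the given context window."""
    return estimate_tokens(size_bytes) <= context_limit


if __name__ == "__main__":
    inventory_size = 1_000_000  # about 1 MB of inventory data
    print(estimate_tokens(inventory_size))
    # 128_000 is a hypothetical context limit, not tied to any specific model
    print(fits_in_context(inventory_size, context_limit=128_000))
```

An estimate like this only tells you whether the data could be submitted at all. It says nothing about how well a model keeps track of that much input.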
So to add to this (one-sided) discussion, here are some thoughts and experiences of my own with ChatGPT. You can make it ingest a large amount of data as an archive into a Jupyter environment/notebook. This data does not go through the language model because it would break the token limit. On the other hand, the language model can write Python snippets to do the refactoring on said data based on rules given to the language model. In this way, no limits are applied. In other words, this can be described as a glorified “grep” and “sed” runner.
Since the data does not go through the language model, you cannot tell it to try to “understand” the data and infer some meaning, rules, principles etc. present in the data. It can only make complex scripts to do a “search and replace”. For that, you have to specify very precise rules to apply. It’s basically the same as asking it to generate Python snippets and then just running them on your dataset locally (on your computer/server).
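To illustrate what such a precise rule looks like once it is written down, here is a minimal sketch of the kind of script I mean. The naming rule itself (lowercase, hyphens instead of underscores, a two-digit index at the end) is a made-up example, not our actual convention, and the function names are hypothetical.

```python
import re

# Hypothetical rule: lowercase segments joined by hyphens, ending in a
# two-digit index, e.g. "web-frontend-01".
VALID_NAME = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*-\d{2}$")


def normalize_name(name: str) -> str:
    """Apply the mechanical part of the rule: lowercase, underscores to hyphens."""
    return name.strip().lower().replace("_", "-")


def check_names(names: list[str]) -> dict[str, str]:
    """Map each non-compliant name to its normalized form, or to "" when
    normalization alone does not make it compliant."""
    report = {}
    for name in names:
        if VALID_NAME.match(name):
            continue  # already compliant, nothing to report
        fixed = normalize_name(name)
        report[name] = fixed if VALID_NAME.match(fixed) else ""
    return report


if __name__ == "__main__":
    print(check_names(["web-frontend-01", "Web_Frontend_02", "db1"]))
```

Names that come back with an empty string still need a human decision, which is exactly the part a “search and replace” cannot do.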
Simple stuff like “please find me any typos” cannot be done for non-dictionary words and other special strings. I mean, for this specific request, the model could possibly create a complex Python script that does some statistical analysis of words and flags a likely typo based on how rarely it occurs compared to a very similar, frequently used word, but… that’s a stretch.
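For what it’s worth, the statistical idea could look roughly like the sketch below. It uses only the Python standard library (`collections.Counter` and `difflib.get_close_matches`). The thresholds and the function name are arbitrary assumptions, and on real inventory data I would expect plenty of false positives, for example legitimately similar names such as numbered hosts.

```python
from collections import Counter
from difflib import get_close_matches


def suspect_typos(tokens, common_min=5, rare_max=1, cutoff=0.8):
    """Flag rare tokens that closely resemble a frequently used token.

    Returns a dict mapping each suspected typo to the common token it
    resembles. All thresholds are illustrative, not tuned values.
    """
    counts = Counter(tokens)
    common = [t for t, c in counts.items() if c >= common_min]
    suspects = {}
    for token, count in counts.items():
        if count > rare_max:
            continue  # seen often enough to be considered intentional
        match = get_close_matches(token, common, n=1, cutoff=cutoff)
        if match:
            suspects[token] = match[0]
    return suspects


if __name__ == "__main__":
    groups = ["webserver"] * 10 + ["database"] * 8
    groups += ["webserevr", "databse", "monitoring"]
    print(suspect_typos(groups))
```

A rare but valid string such as "monitoring" is left alone here because nothing frequent looks like it. That is also the weakness: a typo repeated often enough stops being “rare”.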
This is a very interesting topic. You mentioned you’ve used ChatGPT. IIRC recently OpenAI reduced the 1M context window for some of the models down to 258k maximum: Reddit - Please wait for verification
With the OpenAI models and the context window changes I think you’d be unable to ingest your entire inventory. Have you explored using any Anthropic models via Claude which have a 1M context window such as Sonnet 4.6 (1M context) or Opus 4.6 (1M context)?
I think a good starting point would be to help your agent(s) with project context. Have you tried creating an AGENTS.md or CLAUDE.md in each project which outlines each project’s VM naming conventions, inventory group hierarchy etc.?
WTF. I did not say that. Bot answer, or some malware on your device?
To comment on the rest of the answer, my original post is over one year old now. AI tools have progressed in the meantime and could possibly be of greater help, but we solved the original problem… unfortunately manually.
There are two issues when trying to process projects one by one with some AI tool:
1. Just by looking at the inventory file, I can right away spot the errors and inconsistencies because I keep much of the data in my brain. It does not take me much time to fix errors in a single inventory file either. When doing it with an LLM, you have to wait for it to churn through all the tokens and propose the changes, which I have to review and accept anyway. It effectively does not save me any time but increases the cost.
2. It cannot spot errors across projects like naming collisions and other global inconsistencies.
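The second point is the kind of check that needs all projects at once but not necessarily a language model. A minimal sketch, assuming the VM names have already been extracted from each project’s inventory into a plain mapping of project name to VM names (the extraction step, the function name and the sample data are all hypothetical):

```python
from collections import defaultdict


def find_collisions(projects: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return VM names that appear in more than one project.

    Maps each colliding VM name to the sorted list of projects using it.
    """
    owners = defaultdict(set)
    for project, vm_names in projects.items():
        for vm in vm_names:
            owners[vm].add(project)
    return {vm: sorted(ps) for vm, ps in owners.items() if len(ps) > 1}


if __name__ == "__main__":
    sample = {
        "alpha": ["web-01", "db-01"],
        "beta": ["web-01"],
        "gamma": ["cache-01"],
    }
    print(find_collisions(sample))
```

This only catches exact duplicates. Near-duplicates and other global inconsistencies would need additional, equally explicit rules.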
@dbrennand As I already mentioned, the original problem is now solved, so large context windows came a little late for us. In the meantime we also made the rules for inventory file formatting and content much stricter so that fewer errors creep in. We also created AGENTS.md and CLAUDE.md with AI agent specific instructions on how to guide the user/engineer when implementing the project, including the inventory file content. An AI agent now acts as the control mechanism. Not perfect, but a step in the right direction.
I guess it is a(n AI-generated) spam post, trying to add more reputation to a spam link by attributing it to a frequent poster like you. I’ve flagged the post.