I’ve developed a partial auditing function in an in-house orchestration tool that I wrote, and I’m thinking of porting it to ansible and completing it there.
Basically, auditing is a means of detecting all customizations made to a Linux system and reporting changes that are not orchestrated, even if no playbook covers the changed component. For example, check mode in ansible is great at detecting manual changes to a known config file that is covered by a playbook, and ansible can easily put that file back in its intended state. But as soon as a change is made outside the scope of ansible’s playbooks, it goes undetected. This includes additional packages that may have been installed, added users or filesystems, etc. This is a pain when some sysadmins are not diligent in putting everything in playbooks.
Auditing makes sure that all customizations are fully orchestrated, or at least flags manual changes, which gives the assurance that we can deploy the same or a similar system without surprises or misses. It is particularly useful for systems that were deployed without orchestration and that we now want to orchestrate, for partially orchestrated systems that still need some work, and even for fully orchestrated systems that have for some “mysterious” reason had some components manually added or modified ;-). Auditing differs from an IDS in that it uses the operating system itself (plus the orchestrator) as a baseline instead of an IDS-taken snapshot, so it can detect changes made to any system even if it never had an IDS installed. Plus, unlike an IDS, it also understands changes that were made by orchestration and doesn’t flag them as exceptions.
Right now I’ve made auditing work for RHEL-based Linux systems using the installation kickstart, RPMs and the (in-house orchestrator’s) configuration. It is partially based on the RPMs’ contents and inventory to detect changes made to the base operating system, and it cleans out dependencies really well (for example, it walks the RPMs’ dependency chain to report only the top packages that were side-installed instead of just dumping the list of all extraneous packages). It’s also smart enough to discard insignificant changes (comments added or removed, reformatted statements in config files) and to understand when added or modified files are part of a package, not a manual modification. Unfortunately it is also tightly bound to the in-house orchestration tool, so my intent is to rewrite this to work with ansible and publish it as open source. Although this may sound like a big undertaking, I was surprised at how easily I got this working the first time, so I’m hoping it will be the case the second time around.
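To make the two ideas above a bit more concrete, here is a minimal Python sketch. It is only an illustration and not the in-house implementation: the function names, the `requires` mapping and the `#` comment prefix are all assumptions. The first function keeps only the extra packages that no other extra package pulls in; the second reduces a config file to its significant lines so that comment and whitespace changes compare as equal.

```python
def top_level_packages(extra, requires):
    """Return the extra packages that no other extra package depends on.

    extra    -- set of package names not explained by the baseline (hypothetical input)
    requires -- dict mapping a package name to the set of packages it requires
    """
    pulled_in = set()
    for pkg in extra:
        # Only dependencies that are themselves "extra" matter here;
        # a package listing itself must not hide itself.
        pulled_in |= (requires.get(pkg, set()) & extra) - {pkg}
    return sorted(extra - pulled_in)


def significant_lines(text, comment_prefix="#"):
    """Strip comments, blank lines and whitespace differences from config text.

    Naive on purpose: a comment_prefix that appears inside a value is
    treated as the start of a comment.
    """
    result = []
    for line in text.splitlines():
        line = line.split(comment_prefix, 1)[0]  # drop the comment part
        line = " ".join(line.split())            # collapse runs of whitespace
        if line:
            result.append(line)
    return result
```

With this sketch, two versions of a file count as unchanged when `significant_lines(old) == significant_lines(new)`. One known limitation: packages that require each other in a cycle would hide each other, so a real tool would need to handle that case.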
Is this a feature that ansible already had plans for? If yes, I’d like to see if that could simply replace my current auditing tool; I wouldn’t want to start reinventing a wheel somebody else is already working on, LOL!
“Right now I’ve made auditing work for RHEL-based Linux systems using the installation kickstart, rpm’s and the (in-house orchestrator’s) configuration. It is partially based on the rpm’s contents and inventory to detect changes made to the base operating system”
Yep, this seems like basically comparing the package database and what isn’t in the package database. I’ve written one of these before (not against RPM, long story) and also have some previous patents on some proposals in this area (mostly based on Func, not all implemented…).
I don’t think this is something that needs to be in core ansible, but if you make a system diff tool that uses ansible modules to carry it out, I’d be very interested in seeing it. It might be something we could include in the examples/ directory as a useful CLI tool.
I think these things ultimately become “system diff” type tools, where you need a baseline system to compare them with, and mostly you get a list of these things that are different… which can be an unwieldy problem at scale – what’s good drift vs bad drift, etc.
“Yep, this seems like basically comparing the package database and what isn’t in the package database.” Actually, the strength of the auditing tool is that it directly uses the configuration information of the orchestration software to determine whether changes are actually orchestrated, which goes beyond rpm checking. For example, I was able to detect that certain filesystems were added to /etc/fstab or certain users were created on the system without being part of the orchestration. That was tightly coupled with my orchestration tool (which had both “check” and “audit” modes), so doing that with ansible, which has a slightly different approach, is obviously another challenge.
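As a rough illustration of the /etc/fstab case, the check boils down to comparing what is on the system with what the orchestrator says it manages. The Python sketch below assumes the set of managed mount points has already been extracted from the orchestration data. How to get that set out of ansible is exactly the open question, so `managed_mount_points` and the function name are hypothetical.

```python
def unmanaged_mounts(fstab_text, managed_mount_points):
    """Return mount points found in fstab text that the orchestration does not declare.

    fstab_text           -- contents of /etc/fstab as a string
    managed_mount_points -- set of mount points the orchestrator manages (assumed input)
    """
    found = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 2:
            continue  # not a well-formed fstab entry
        mount_point = fields[1]  # second field of an fstab entry is the mount point
        if mount_point not in managed_mount_points:
            found.append(mount_point)
    return found
```

The same pattern (parse the system state, subtract what is orchestrated, report the rest) would apply to users in /etc/passwd. Entries such as swap, which have no real mount point, would need special handling in a real tool.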
"I think these things ultimately become “system diff” type tools, where you need a baseline system to compare them with, and mostly you get a list of these things that are different… which can be an unwieldy problem at scale – what’s good drift vs bad drift, etc.". I had actually designed the audit feature precisely because diff tools are too cumbersome and spew out volumes of data to filter through. Audit actually pointed out exactly what was not yet orchestrated and it didn’t need a baseline system, which was really nice.
I do understand this is obviously not a small undertaking, and I don’t want to mess with ansible’s evolution; you guys already have plenty to do! I posted this just to see if the feature was in the pipeline somewhere or whether you guys might want to work on it together. I checked your patents and don’t think the feature collides with anything, but I’ll make sure it doesn’t.
I’ll be at AnsibleFest in NYC if you ever want to explore this further, but I’m sure you’ll otherwise be plenty busy with lots of other more important things there :-).
Cheers,
Guy
“Actually, the strength of the auditing tool is to use directly the configuration information of the orchestration software to determine if changes are actually orchestrated”
Hmmm …
It seems it would be pretty hard to figure out what packages might be installed in any sort of sane programmatic way – dependency graphs, current state of PyPI + gems, etc. It’s kind of halting-problem-esque.
Even if you could parse ansible with the same logic as ansible itself (as would be required to fill in all the variable values), it would be very difficult to determine all the side effects.
I wrote some modules that audit specific subsystems on hosts. While
they use data from inventory, I don't think they quite tied into
configuration/orchestration the way you describe below. The modules did
things like: verify all packages are signed, check processes/binaries
that are listening for network connections, the typical filesystem
checks, and other things. There is an element of 'system diff' to these
modules and there's plenty of room for improvement in what I wrote.
That said, I think ansible provides a great framework to build modules
to audit specific subsystems.
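
As a rough sketch of what a check like "verify all packages are signed" could look like (an illustration only, not the code from the modules mentioned above), the Python function below parses package query output with one "name signature" pair per line. It assumes the text came from something like `rpm -qa --qf '%{NAME} %{SIGPGP:pgpsig}\n'`, where a missing signature prints as `(none)`. The exact signature tag varies between rpm versions, so treat the command and the function name as assumptions.

```python
def unsigned_packages(query_output):
    """Return names of packages whose signature field is missing.

    query_output -- text with one "name signature..." entry per line;
                    "(none)" or an empty field means no signature.
    """
    unsigned = []
    for line in query_output.splitlines():
        # Split on the first space: package name, then the rest is the signature.
        name, _, signature = line.strip().partition(" ")
        if name and signature.strip() in ("", "(none)"):
            unsigned.append(name)
    return unsigned
```

Inside an ansible module, the result could be returned as a fact or used to fail the task when the list is not empty.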
Regards,
sf
Guy Sabourin <gsab.go@gmail.com> writes:
Hi Stephen,
That’s great! Are the modules you wrote published or available? It would be nice to see if auditing, or at least a part of it, could be done within your existing modules, perhaps as an extension to them. The challenge is to port the feature to ansible. Although I have a couple of ideas of how this could be done, I’m not yet sufficiently “intimate” with ansible’s internal workings to be sure of the best way to do this, and I’d like to take the cleanest route possible. It looks like you have part of the work already done, so it would be nice to see how it could fit in with broader auditing. Would you also be open to collaborating a bit on this? I’m not sure how my auditing fits in your priorities though, so let me know.
Thanks!
Guy
I understand your scepticism. It’s no small undertaking. However, I also thought it would be a huge challenge with my previous orchestration tool, and it ended up being a lot easier to do once I dug into it. Doing the same thing with ansible is of course different, since I’m not yet familiar enough with its internal workings to figure out the best approach.
Thanks!
Guy
Guy Sabourin <gsab.go@gmail.com> writes:
Hi Stephen,
That's great! Are the modules you wrote published or available? It would be
nice to see if auditing, or at least a part of it, could be done within
your existing modules, perhaps as an extension to them. The challenge is to
port the feature to ansible. Although I have a couple of ideas of how this
could be done, I'm not yet sufficiently "intimate" with ansible's internal
workings to be sure the best way to do this, and I'd like to take the
cleanest route possible. It looks like you have part of the work already
done, so it would be nice to see how it could fit in with broader auditing.
Would you also be open to collaborating a bit on this? I'm not sure how my
auditing fits in your priorities though, so let me know.
You can find them here:
https://github.com/sfromm/ansible-playbooks/tree/master/library
I personally like using modules to implement auditing capabilities since
it requires no changes to ansible itself. In the past, I had looked at
implementing a module that used SCAP but eventually punted due to lack
of time and the complexity of implementing SCAP.
Comments/feedback welcome, but should probably be taken off-list. :-)
sf
I’ve looked at your modules and they do a very good job of checking the filesystem! You have a more “security-oriented” approach which is a really great complement to what I’m trying to do (installation auditing).
I’m going to start playing with your ‘filesystem’ module slowwwly (I’m quite busy with other projects) to see how far I can get staying inside a module; there’s an rpm part of the job which should fit in quite well. I do agree that doing the job in a module is best. I’m just not certain how far I’ll be able to get with that route, but it’s really worth the effort. In any case, your filesystem module covers really nice areas, so it’s a keeper for me.
I’ll use github or private email for further comments/suggestions etc.
Thanks!