Ok, let me chime in as this is something we've been brainstorming about for some time now.
If you have more than one source for an inventory, and you want to collect information (facts) from different sources, even sources with overlapping information, we need a different solution.
Essentially we want to have inventory scripts that are generic so that they can be reused (instead of being customized per environment).
The idea that we've come up with is that every source has uniquely identifiable information: a hostname, a UUID, a MAC address, (possibly an IP address) or even an asset tag. If different inventory scripts gather information about hosts and one of these uniquely identifying attributes matches, we can merge the information from these sources and consider the entries to be from the same machine.
This would work for various existing inventories in companies:
- Asset database
- CMDB
- Cobbler
- Public/Private cloud(s)
- VMware vSphere cluster(s)
- RHEV manager(s)
- Red Hat Network
- Network inventory information
- DNS
- System information
- HP iLO interfaces (sort of)
And by merging, we have a single inventory for everything.
(But it may not be sufficient for facts that are stored per host in a separate location like HP iLO where each iLO 'device' is unique to one system.)
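A rough sketch of what the merge step could look like, assuming each inventory script returns a list of plain dicts. The attribute names in ID_KEYS and the function names are made up for illustration; nothing like this exists in Ansible today.

```python
# Hypothetical attribute names; the sources would have to agree on these.
ID_KEYS = ("hostname", "uuid", "macaddress", "asset_tag")


def shares_identifier(a, b):
    """True if both records carry the same value for any identifying key."""
    return any(
        a.get(key) is not None and a.get(key) == b.get(key)
        for key in ID_KEYS
    )


def merge_records(records):
    """Merge records (dicts of facts) that describe the same machine."""
    merged = []
    for record in records:
        combined = {}
        remaining = []
        for host in merged:
            if shares_identifier(host, record):
                combined.update(host)   # fold the matching host in
            else:
                remaining.append(host)
        combined.update(record)         # later records win on conflicts
        remaining.append(combined)
        merged = remaining
    return merged


# Example input: one list with records from three imaginary sources.
records = [
    {"hostname": "web01", "location": "dc1"},                      # CMDB
    {"macaddress": "00:11:22:33:44:55", "memory_mb": 4096},        # vSphere
    {"hostname": "web01", "macaddress": "00:11:22:33:44:55"},      # Cobbler
]
hosts = merge_records(records)
```

Note that the third record bridges the first two (hostname on one side, MAC address on the other), so all three end up as a single host. In this sketch the last record silently wins when values differ.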
Even if we can gather most (if not all) facts from different sources in a single phase, we're not there yet. We still lack groups, which is another part of the current inventory that is very specific to individual users. However, most (if not all) groups are determined by facts.
Besides, chances are you don't need one-dimensional groups, but you'd like to group based on more than one fact (e.g. configure facts based on location, environment _and_ tier). Currently this requires modifying the inventory script to make it do what you need, which makes those customized inventory scripts not generally useful.
However, if we can describe (in a YAML file) how groups need to be set based on facts in a second phase (after the merge), we have abstracted this part out of the scripts too. This could include multi-dimensional groups (e.g. dc1-prod-tier1), and possibly syntax for unions and intersections to dynamically make your pick based on facts.
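To make the second phase a bit more concrete, here is a minimal sketch. The rule format is purely hypothetical: GROUP_RULES stands in for whatever would be loaded from the YAML file, and each rule is just a name template filled in from a host's facts.

```python
from collections import defaultdict

# Stand-in for rules loaded from a YAML file (hypothetical format).
GROUP_RULES = [
    "{location}",
    "{environment}",
    "{location}-{environment}-{tier}",
]


def build_groups(hosts, rules):
    """Map group name -> list of hostnames, derived from facts only.

    hosts is a dict of hostname -> dict of facts (the phase-1 result).
    """
    groups = defaultdict(list)
    for hostname, facts in hosts.items():
        for template in rules:
            try:
                group = template.format_map(facts)
            except KeyError:
                continue  # host lacks a fact this rule needs; skip the rule
            groups[group].append(hostname)
    return dict(groups)


hosts = {
    "web01": {"location": "dc1", "environment": "prod", "tier": "tier1"},
    "db01": {"location": "dc1", "environment": "test"},
}
groups = build_groups(hosts, GROUP_RULES)

# Unions and intersections then become plain set operations.
dc1_and_prod = set(groups["dc1"]) & set(groups["prod"])
```

With a rule like the third one, web01 lands in dc1-prod-tier1 without the inventory script knowing anything about that naming scheme.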
Anyway, the above is quite powerful for large to midsize environments and could co-exist with the current host_vars and group_vars for adding Ansible facts in phase1 and phase2.
What would be needed is that all inventory scripts agree on a set of server attribute names (especially for attributes storing the same value) and conform to a specific value format (e.g. MAC addresses are lowercase with colons).
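As an illustration of such a convention, each script could run its values through a small normalizer before emitting them. This is only a sketch; normalize_mac is a made-up helper and the set of accepted input notations is my own assumption.

```python
import re


def normalize_mac(value):
    """Return a MAC address as lowercase hex pairs separated by colons.

    Accepts the common notations with ':', '-' or '.' separators
    (or none at all) and raises ValueError for anything else.
    """
    digits = re.sub(r"[.:\-]", "", value.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError("not a MAC address: %r" % value)
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))
```

So "00-11-22-AA-BB-CC" and "0011.22aa.bbcc" from two different sources would both come out as "00:11:22:aa:bb:cc" and can be matched.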
If attributes with the same name hold the same value, they are considered equal and are merged (or we could make those attributes configurable). If there are conflicting attributes in matched hosts, we could escalate or define precedence rules. This is all up for discussion.
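One possible shape for precedence rules, sketched under the assumption that we already know which records belong to the same host. The source names and the function are hypothetical; the point is only that the most trusted source wins and every disagreement is reported so it can be escalated.

```python
def merge_with_precedence(records, precedence):
    """Merge the records of one host, most trusted source first.

    records    -- dict of source name -> dict of attributes
    precedence -- list of source names, most trusted first
    Returns (merged attributes, list of conflicts), where each conflict
    is a tuple (attribute, winning source, overruled source).
    """
    merged = {}
    origin = {}      # attribute -> source that provided the kept value
    conflicts = []
    for source in precedence:
        for key, value in records.get(source, {}).items():
            if key not in merged:
                merged[key] = value
                origin[key] = source
            elif merged[key] != value:
                conflicts.append((key, origin[key], source))
    return merged, conflicts


records = {
    "cmdb": {"hostname": "web01", "location": "dc1", "memory_mb": 4096},
    "vsphere": {"hostname": "web01", "memory_mb": 8192},
}
merged, conflicts = merge_with_precedence(records, ["vsphere", "cmdb"])
```

Here vSphere is trusted over the CMDB for memory_mb, the location still comes from the CMDB, and the conflict list tells you the CMDB entry is out of date.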
And to make this workable, caching will be needed too (which is already the case for some of the existing inventory scripts).
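A very simple file-based cache could look like the sketch below. This is not how the existing scripts do it; cached_inventory, the JSON file and the max_age parameter are assumptions for the sake of the example.

```python
import json
import os
import time


def cached_inventory(cache_path, max_age, fetch):
    """Return cached data if the cache file is younger than max_age
    seconds, otherwise call fetch() and store its (JSON-serializable)
    result in the cache file."""
    try:
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age:
            with open(cache_path) as handle:
                return json.load(handle)
    except (OSError, ValueError):
        pass  # missing or unreadable cache: fall through and refresh
    data = fetch()
    with open(cache_path, "w") as handle:
        json.dump(data, handle)
    return data
```

Each source could get its own cache file and its own max_age, since a CMDB changes a lot less often than a cloud inventory.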