Calculating subsets of a dictionary variable for different inventories

We are running a small multi-tenant service, and have a large “sites” dictionary defined in group_vars that holds the configuration data for each tenant. For example:

  sites:
    demo:
      drupal:
        site_dir: demo.example.com
        site_name: Demo Site
      fedora:
        db_database: d_demo
        db_username: u_demo
    cust1:
      drupal:
        site_dir: cust1.example.com
        site_name: Customer 1 Site
      fedora:
        db_database: d_cust1
        db_username: u_cust2
    # ...and so on

This dictionary is used by multiple roles and plays in the playbook. We have several environments (local development, staging, preprod, and production), and we don’t want every tenant provisioned in every environment. I think the best approach is a variable in each environment’s inventory that lists the tenants to install:

  # production’s inventory
  …
  [all:vars]
  installed_sites="demo cust1"

  # vagrant’s inventory
  …
  [all:vars]
  installed_sites="demo"

I can’t quite work out how to make this happen, though. Ideas are:

1. Use the Jinja2 intersect filter (http://docs.ansible.com/playbooks_variables.html#set-theory-filters) to build a new “installed_sites_data” dictionary at the start of each play run. That didn’t return the nested data in the “sites” dictionary, though, only the top-level keys.

2. Use something like this in each task (pseudocode):
     with_dict: sites
     when: installed_sites is not defined or item.key in installed_sites.split()
This seems cumbersome, though, because we would have to remember to add the “when” clause every time we use “with_dict”.

3. Write our own lookup plugin (http://docs.ansible.com/developing_plugins.html#lookup-plugins) that returns only the data from the sites dictionary that is appropriate for a particular inventory/environment. We don’t like this one because it feels un-Ansible-like: significant logic in the playbook would be hidden behind the plugin.
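
To make ideas 1 and 3 concrete, here is a rough Python sketch of the selection logic. It is written as a filter plugin rather than the lookup plugin from idea 3, which is a substitution on my part because a filter needs less scaffolding. The names select_sites and SITES are hypothetical, and it assumes installed_sites is a whitespace-separated string as in the inventories above. It also shows why a plain set intersection returns only top-level keys: iterating over a dictionary yields its keys, not its values.

```python
# Sketch only: `select_sites` and `SITES` are illustrative names,
# not part of Ansible. Sample data is trimmed from the example above.
SITES = {
    "demo": {"drupal": {"site_dir": "demo.example.com"}},
    "cust1": {"drupal": {"site_dir": "cust1.example.com"}},
}


def select_sites(sites, installed_sites=None):
    """Return the entries of `sites` whose key is listed in `installed_sites`.

    `installed_sites` may be a whitespace-separated string (assumed to match
    the inventory variable) or a list. None means "no restriction".
    """
    if installed_sites is None:
        return dict(sites)
    if isinstance(installed_sites, str):
        wanted = installed_sites.split()
    else:
        wanted = list(installed_sites)
    # Keep the full nested value for each wanted tenant; skip unknown names.
    return {name: sites[name] for name in wanted if name in sites}


def intersect_keys(sites, installed_sites):
    """What a set intersection sees: keys only, the nested data is lost."""
    return set(sites) & set(installed_sites.split())


class FilterModule(object):
    """Ansible looks for a class with this name in filter plugin files."""

    def filters(self):
        return {"select_sites": select_sites}
```

If a file like this were placed in a filter_plugins/ directory next to the playbook, a task might then use something along these lines (untested):

     with_dict: "{{ sites | select_sites(installed_sites | default(none)) }}"

That keeps the selection in one place, but it has the same drawback as idea 3: the logic lives in Python rather than in the playbook.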

Thoughts?

Peter