How to include vars files in another vars file?

Hi,

I’m trying to create a sort of factory pattern in my vars files for creating Amazon Data Pipeline jobs. What I want is a list of dicts, each containing a parameter (“step”) that can differ per item. However, since this value is large and complicated, and will be the same for 90% of the items, I don’t want to duplicate it 40 or so times.

So, is it possible to include a vars file in another one? I tried using jinja2’s “include” function, but it complained that certain variables weren’t defined, because it was trying to resolve the variables in the included file. In fact, those variables should be resolved later, in the main play, not when including one vars file into the main one. I can’t use roles because this is part of a larger pattern in which the main vars file is loaded dynamically depending on another variable.

Some examples might make clear what I mean.

Here’s my playbook, “create-job.yml”:

- name: "Create a data pipeline and definition for {{ product }} {{ job }}"
  hosts: localhost
  gather_facts: True
  vars_files:
    - "vars/pipelines/{{ group }}/env/{{ env }}.yml"
    - "vars/pipelines/{{ group }}/{{ job }}.yml"

Here’s my vars file, “job1.yml”, for the “job1” job:

template: multiple-emr
startTime: 03:00:00
definitions:
- product: web_v2
  suite: websuite
  {% include default_step.yml %}         # how to include "default_step.yml"?
- product: db2
  suite: dbsuite
  {% include default_step.yml %} 
- product: custom
  suite: customsuite
  {% include custom_step.yml %} 
... x 40

And here’s the contents of “default_step.yml”:

s3_precondition: "/raw/data/#{node.myDate}/{{ product }}/"
step:
- "s3://my-bucket/artifacts/emr-jar-2.1.1.jar"
- "com.example.SampleEMR"
- "-Dinput=s3n://example-{{ env }}-data{{ s3_precondition | replace('node.', '') }}*{{ suite }}_#{myDate}.*.gz"
- "-Doutput=s3n://example-{{ env }}-data/intermediate/data/#{myDate}/{{ product }}/{{ suite }}/"
- "-DoutputFormat=json"
...

How can I achieve this with ansible?

Mark

There is no facility for a variable file including another.

Jinja2 is also not invoked when reading variable files at that time, so include won’t help.

Thanks for your reply Michael. Can you think of any other way I can achieve what I’m trying to? I have tried:

  • templating the vars file before loading it with include_vars, but jinja complained about undefined variables
  • doing a string replace into the vars file, but then the variables in the default_step.yml file weren’t interpolated when it was later loaded by ansible
  • trying to load the step by name in the json template I’m trying to create. I.e. for each product/suite combination I added ‘step: default_step’, then had another vars file containing all of the steps (e.g. steps.default_step, steps.custom_step), and then in my json template tried “{{ {{ definition.step }}.step }}”, but jinja wouldn’t parse that.

I’m racking my brains but can’t think how I can do this.

Thanks

That sounds like a very complicated process. What are you trying to do in
the end? Things are normally simpler with Ansible; it is rare to need
nested variable includes.

I’m trying to create JSON files that define Amazon data pipeline jobs. We have 40 or so different jobs that all need processing on EMR. There’s a lot of shared config between our tasks, with only the input paths and EMR step definition changing per job. However, since a lot of tasks share the same EMR config, I need the ability to reuse the same EMR config for multiple tasks, while also having the flexibility to override it per task.

So in the example I first gave, there are 2 jobs using the same default EMR config, and one that will use a custom config. As I say, it’s basically a factory pattern, with each job knowing which EMR config it needs, and the whole thing being templated generically.

Here’s a sample of the JSON template I’m trying to populate:

{% for definition in definitions %}
    {
      "id": "EmrActivityId_{{ loop.index }}",
      "name": "EmrActivity_{{ definition.suite }}",
      "precondition": {
        "ref": "PreconditionId_{{ loop.index }}"
      },
      "runsOn": {
        "ref": "EmrClusterId"
      },
      "type": "EmrActivity",
      "myDate": "{{ date | default('#{format(minusDays(@scheduledStartTime, 1), YYYY-MM-dd)}') }}",
      "step": "{{ {{ definition.step_name }}.step }}"
    },
    {
      "id": "PreconditionId_{{ loop.index }}",
      "name": "InputExistsPrecondition",
      "s3Prefix": "s3://example-{{ env }}-data{{ definition.s3_precondition }}",
      "type": "S3PrefixNotEmpty"
    },
{% endfor %}

The rest of the template is identical for all jobs. So I’m really looking for a way to be able to populate the step definition per job so I can maximise reuse. As I say, there are around 40 of these particular tasks I need to migrate.
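
Jinja2 does not parse “{{ }}” delimiters nested inside another expression, which is why the “step” line in the template above fails. A value held in a variable can, however, be used as a dictionary key with bracket syntax. The following is only a minimal sketch under some assumptions: that the step definitions live in a dict named “steps” (as in the steps.default_step / steps.custom_step vars file described earlier), that each definition carries a “step_name” key, and that the step list should be rendered as one comma-separated string.

```jinja
{# Assumption: "steps" is a dict keyed by step name, loaded from a vars file. #}
{# Assumption: each definition has a "step_name" key such as "default_step". #}
{% for definition in definitions %}
      "step": "{{ steps[definition.step_name].step | join(',') }}"
{% endfor %}
```

This covers only the lookup by name. Placeholders inside the step strings that depend on the current definition (product, suite) would still need to be filled in some other way.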

Jinja2 has template inheritance, which should work for you for the common parts.

I can’t do this because all 40 jobs need to be in the same file so they’ll share EMR resources. So I need to use composition, not inheritance in order to create a single output file, not 40 output files.

Jinja2 also has reusable blocks, just for that case.

Could you please point me to the relevant part of the docs (http://jinja.pocoo.org/docs/dev/templates/)? I tried using the {% include %} tag, and ansible returned an error saying, in effect, that it wasn’t allowed to call ‘include’.

Thanks

I was talking about blocks:
http://jinja.pocoo.org/docs/dev/templates/#block-assignments

But you can also use macros. I'm not sure why includes are not working for
you; I made heavy use of them in a previous job.
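
As an illustration of the macro suggestion, here is a rough sketch of how the shared EMR step could be written once and called per definition. The macro name “default_step” and its parameters are hypothetical, the argument list is abridged from default_step.yml, and joining the arguments with commas is an assumption about the format the “step” field expects.

```jinja
{# Hypothetical macro: builds the shared step string for one definition. #}
{# Everything it needs is passed in as an argument; the list is abridged. #}
{% macro default_step(definition, env) -%}
{{ [
  "s3://my-bucket/artifacts/emr-jar-2.1.1.jar",
  "com.example.SampleEMR",
  "-Dinput=s3n://example-" ~ env ~ "-data/raw/data/#{myDate}/" ~ definition.product ~ "/*" ~ definition.suite ~ "_#{myDate}.*.gz",
  "-DoutputFormat=json"
] | join(',') }}
{%- endmacro %}

{% for definition in definitions %}
      "step": "{{ default_step(definition, env) }}"
{% endfor %}
```

Because the macro is called inside the loop, all 40 definitions still end up in a single output file. A second macro for the custom step could be selected with an {% if %} on a per-definition key.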

Actually I’ve got a different problem this time. Jinja is complaining that variables are undefined. Can you see why the included template doesn’t have access to the context?

{% block defs %}
{% for definition in definitions %}
      "test": "{% include 'sub/default.yml' with context %}",
{% endfor %}
{% endblock %}

And default.yml doesn't have access to suite or definition.suite:

suite={{definition.suite}}             // doesn't work for just {{ suite }} either - "definition" is undefined.

Thanks

Most of the time it is a misunderstanding: the context is the same as
the one passed to the initial template, so variables defined in the
template itself are not available.
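
One way to sidestep the question of what an included file can see is to put a macro in the sub-template and import it, passing the loop variable explicitly. By default, imported templates do not receive the current context, so the macro relies only on its arguments. This is a sketch rather than a confirmed fix: the file name “sub/steps.j2” and the macro name “suite_line” are hypothetical, and it assumes the template loader used by ansible can resolve the import path.

```jinja
{# --- sub/steps.j2 (hypothetical file) --- #}
{% macro suite_line(definition) -%}
suite={{ definition.suite }}
{%- endmacro %}

{# --- main template --- #}
{% import 'sub/steps.j2' as step_macros %}
{% for definition in definitions %}
      "test": "{{ step_macros.suite_line(definition) }}",
{% endfor %}
```

Since “definition” is handed over as an argument, the result does not depend on whether loop variables are visible to the sub-template.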