Hi guys,
I’m quite new to Ansible and would like to perform the following in an idempotent fashion. I would basically like to minimise the number of downloads I need to perform from the web for certain file assets, so my plan is to download them once to an asset server and then retrieve them from there for all the target hosts in our cluster.
As far as I know the flow would go something like this:
- Does the file exist on the target host? If yes, exit; if no, retrieve it from the asset server.
- Does the file exist on the asset server? If not, download it from the web to the asset server first.
- Copy the file from the asset server to the target host.
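In rough Ansible terms, that flow might look something like this (the hostnames, paths and URLs below are made-up placeholders, and get_url already skips the download when the destination file exists):

    # On the asset server: fetch from the web only if the asset is missing
    - name: Check whether the asset is already on the asset server
      stat: path=/opt/assets/myfile.tar.gz
      register: asset_on_server

    - name: Download the asset from the web to the asset server
      get_url: url=http://example.com/myfile.tar.gz dest=/opt/assets/myfile.tar.gz
      when: not asset_on_server.stat.exists

    # On the target host: pull from the asset server only if the asset is missing
    - name: Check whether the asset is already on the target host
      stat: path=/opt/assets/myfile.tar.gz
      register: asset_on_host

    - name: Pull the asset from the asset server
      get_url: url=http://assetserver.example.com/assets/myfile.tar.gz dest=/opt/assets/myfile.tar.gz
      when: not asset_on_host.stat.exists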
Are there any modules I could use to do this, or are there any other recommendations on best practices for this scenario?
Any help would be appreciated.
Cheers,
Jason
Sounds like you'd have an asset server role that gets executed against the asset server and collects all the assets, followed by a role that gets executed against the target hosts. Take a look at the get_url module for getting files from the web to your asset server, and get_url or the synchronize module to pull them from your asset server to your target hosts.
- James
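A minimal playbook sketch of that layout might look like this (the group names and role names here are just assumptions):

    - hosts: assetserver
      roles:
        - asset_server    # get_url pulls assets from the web onto the asset server

    - hosts: appservers
      roles:
        - deploy_assets   # get_url or synchronize pulls them from the asset server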
Yep, I’d definitely recommend a local play to update things, or fetching from Jenkins (etc.) versus the internet.
get_url also takes a SHA checksum as an option to avoid repeated downloads (though you might … just might … be interested in moving to packages instead!)
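For illustration, a task using that option might look like this (the URL and the checksum variable are placeholders; older releases call the option sha256sum, newer ones also accept checksum):

    - name: Fetch the installer once, verified by checksum
      get_url: >
        url=https://example.com/files/app-1.0.tar.gz
        dest=/opt/assets/app-1.0.tar.gz
        sha256sum={{ app_sha256 }}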
The copy module will also check whether the file exists on the target server and will not copy it if it is already there… CAVEAT: if you intend to modify the file on the target server, it will be replaced with the original whenever any of these modules detect that it is different (i.e. they checksum the two files and only copy the source over if the checksums don’t match).
Adam
I’m always in favor of using the right tool for the job, is this not the sort of task at which rsync excels?
An rsync module perhaps?
The synchronize module is an rsync wrapper.
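For example, to rsync between two managed hosts you can delegate the task to the source host (the hostname and paths here are assumptions):

    # Runs rsync on the asset server, pushing /opt/assets/ to the current target host
    - name: Sync assets from the asset server to the target host
      synchronize: src=/opt/assets/ dest=/opt/assets/
      delegate_to: assetserver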
Thanks for the feedback guys. I think, as James suggested, the simplest way would be to download all the assets in a separate role.
That said, I was hoping to download/sync to the asset server in an ad-hoc fashion from various roles. To do that, I think it would be easier to write a custom module than to piece together a long list of calls to standard modules every time I need an asset. As an example, I’m thinking something like this would be good:
tasks:
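  # Hypothetical sketch only: the get_asset module and its parameters are
  # made up here for illustration; no such module ships with Ansible.
  - name: Ensure the asset is present on the target host
    get_asset: >
      url=http://example.com/myfile.tar.gz
      asset_server=assetserver.example.com
      dest=/opt/assets/myfile.tar.gz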
This would download the asset if it doesn’t exist on the asset server and synchronise it with the local file system. The dest directory could be mirrored between the asset server and the host machine.
A custom module will work best in your case.
We are using the same approach, but with more sources of assets. The module is used in basically every software installation role, so a separate role is not an option for us. It does the following, in order:
- Check if file is already present in destination directory
- Download from /opt/repository/app (mount of software repository for installations with Vagrant)
- Download from http://reposerver/app or s3://repobucket/app
- Download from internet
- name: Get Packer installer
  get_installer: >
    name={{ packer_installer }}
    dest={{ packer_install_home }}
    repo={{ repo_app_base }}
    path={{ packer_repo_path }}
    repo_url={{ repo_app_url }}
    url={{ packer_url }}
  sudo: yes
Where:
packer_installer: '{{ packer_version }}_linux_386.zip'
packer_repo_path: HashiCorp/Packer/{{ packer_version }}
packer_url: https://dl.bintray.com/mitchellh/packer/{{ packer_installer }}
repo_app_base: /opt/repository/app
repo_app_url: http://reposerver/app (or s3://repobucket/app)
So the module searches these locations in order:
- /opt/install/packer/0.5.1_linux_386.zip (destination)
- /opt/repository/app/HashiCorp/Packer/0.5.1/0.5.1_linux_386.zip (nfs mount)
- http://reposerver/app/HashiCorp/Packer/0.5.1/0.5.1_linux_386.zip (http server) or s3://repobucket/app/HashiCorp/Packer/0.5.1/0.5.1_linux_386.zip
- https://dl.bintray.com/mitchellh/packer/0.5.1_linux_386.zip
Drawback: none of your roles can be shared on Ansible Galaxy.
Thanks, Dmitry. Hosting the assets on S3 is a great idea. We may be able to use that with get_url as a short-term solution. Download speed is the main issue for us, and we get great speeds from S3.
Cheers
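A minimal sketch of that short-term approach might be (the bucket name and paths below are placeholders):

    - name: Pull the Packer installer from S3
      get_url: >
        url=https://my-assets.s3.amazonaws.com/HashiCorp/Packer/0.5.1/0.5.1_linux_386.zip
        dest=/opt/install/packer/0.5.1_linux_386.zip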