For anyone who has been provisioning Ubuntu boxes with the ec2 module:
it looks like you cannot safely update packages immediately after
provisioning, as soon as SSH becomes available (our standard pattern
for launching a new server has been to add a wait_for condition for SSH
and then add the instance to a host group).
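For readers who have not seen that pattern, here is a rough sketch of it. It assumes an earlier ec2 task registered its result as "ec2"; that variable name, the group name "launched", and the delay/timeout values are placeholders, not anything taken from the playbook under discussion.

```yaml
# Sketch only: assumes a preceding ec2 task used "register: ec2".
# Run these from the control host (e.g. a play with hosts: localhost).
- name: Wait for SSH to come up
  wait_for:
    host: "{{ item.public_dns_name }}"
    port: 22
    delay: 60        # placeholder value
    timeout: 320     # placeholder value
    state: started
  with_items: "{{ ec2.instances }}"

# "launched" is a placeholder group name.
- name: Add the new instances to a host group
  add_host:
    name: "{{ item.public_ip }}"
    groups: launched
  with_items: "{{ ec2.instances }}"
```

Note that an open port 22 only shows that sshd is accepting connections; it says nothing about whether cloud-init has finished.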
The typical failure I see is something like this:
21:19:20 TASK: [launch_ec2 | Wait for SSH to come up]
John
Thank you for posting this. I can confirm this is a problem on Ubuntu 12.04. I see the same problem with Windows 2012 instances: WinRM connects fine but cannot be used for at least 120 seconds (doing who knows what). Introducing delays, or verifying the required state before continuing, seems to work. Yet another example is saving information back to Chef. It takes about 60 seconds for a value to be saved in the Chef database, so if other elements depend on that value, you must query it back before continuing.
Peter
For new readers, we use a GIANT TON of 12.04 with Ansible, and haven’t run into this yet, but it’s quite possible our playbooks are a little different.
We’d like to see your update line for “configure instances” to make sure we’re on the same page.
We also don’t believe in having a lot of arcane incantations, like knowing you have to wait for a string in a file.
What might be interesting is to explore the root cause of why the cache is not yet available under cloud-init (is it a lock scenario because something else is happening, etc.), and then update the apt module to know how to wait, if that’s within reason, or at least to capture and return a better error message.
This has happened to me recently with the latest Ubuntu 14.04 AMIs (from ec2_ami_search). The fix was simply to re-order the tasks so that we don’t run “apt: update_cache=yes” immediately, but instead do some other environment/template tasks first whilst waiting (hoping) for cloud-init to finish before apt is called, rather than polling its log file with grep.
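For anyone who would rather wait explicitly than rely on task ordering, a minimal sketch follows. It assumes cloud-init writes its usual marker file, /var/lib/cloud/instance/boot-finished, once it has completed (worth verifying on your AMI); the 300-second timeout is an arbitrary guess.

```yaml
# Sketch only: the marker path and the timeout are assumptions; check them on your AMI.
- name: Wait for cloud-init to finish
  wait_for:
    path: /var/lib/cloud/instance/boot-finished
    timeout: 300

- name: Update the apt cache once cloud-init is done
  apt:
    update_cache: yes
```

This waits for a file to exist rather than grepping a log for a string, so it does not depend on the exact wording of cloud-init's log output.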