I see from the above that you said 50 seconds, and I misread. In your case this is definitely slower than the actual command by a very decent margin. I’m still not seeing this.
If you can benchmark where it is spending its time, that would be appreciated.
I noticed you were installing fastest-mirror though, which you probably don’t want to do.
I’ll remove fastest-mirror; it does indeed look like it made things slower (this is in fact what I was adding to my stack as an experiment to make YUM faster, since at first I thought it was a purely YUM-related issue).
I will try to find some information as to how to benchmark, but would you have any recommendation as to how I should proceed?
./hacking/test-module in the checkout is pretty useful for things like this.
Do a checkout on a machine with yum and even inserting some basic print statements or logging could be a useful start to find out what functions or commands are taking the most time.
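If print statements get unwieldy, a profiler is another way to see which calls dominate. What follows is only a minimal sketch, assuming Python 3 and the standard cProfile module; profile_call and slow_lookup are hypothetical names used for illustration and are not part of the yum module.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_lookup(name):
    """Hypothetical stand-in for a slow step inside the module."""
    total = sum(i * i for i in range(200000))  # burn some CPU
    return name, total


if __name__ == "__main__":
    _, report = profile_call(slow_lookup, "httpd")
    print(report)
```

In a real checkout you would pass one of the module's own functions to profile_call instead of the stand-in.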
Quick note. My playbooks break if I do not have repoquery… the code seems to suggest this is optional, but I just found a case, for instance, where checking for an already installed package gave me a recursion error, while another fresh install failed on “failed to parse: SUDO-SUCCESS-whatever”
I am away from my Ansible machine and test environment; however, in my playbook the first thing I do is update yum and yum-utils to the latest version, as I had similar issues with older releases.
We’ve had this discussion before, where we were pretty sure yum-utils was only excluded in @core installs. That might not be true though – need to check.
I have no problem making the yum module self-add yum-utils if not already there if it resolves problems in those environments as it should be there anyway.
I’m manually adding yum-utils in my RedHat installs as I am performing a minimal install. I figured that this was my fault for trying to install as little as possible. It might make some sense to document that dependency in the yum module page though.
I just happened to add some crude log traces to my yum module last night to see if I could figure out what it’s doing.
On RPMs that are already installed it will use up all the CPU/IO for a while; on a small instance it can take a long time. The instance I was testing with was an m1.small, so it’s slow anyway, but for just testing if an RPM is already installed, it’s pretty intense. The what_provides() call appeared to be the worst; however, I didn’t log the exit time of the function to get a good measurement. I’m also not sure why it would need to call that if I just gave it an RPM name instead of a path to look up. This RPM is from an onsite repo cache, and we do run “yum clean all” beforehand…
@cove_s nice, I didn’t get to dig down that far, but that reflects pretty well what I am experiencing.
@Adam @Michael at least for updates, NOT using repoquery made things faster for me. What I did was change the code for the yum module to undefine the repoquery path.
I still tried a few things to make it perform better, including mirror repositories, but the fact that repoquery is forced on the user is perhaps limiting… is there any way to make it optional instead of using it whenever it is present?
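For logging both the entry and the exit time of a function, something along these lines might help. It is a rough sketch, assuming Python 3; the logger name, the timed decorator and what_provides_stub are all made up for illustration, since the real what_provides() signature is not shown in this thread.

```python
import functools
import logging
import time

log = logging.getLogger("yum_trace")  # hypothetical logger name


def timed(func):
    """Log when func is entered and how long it took when it returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        log.debug("enter %s", func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            # Accumulate total seconds per function across calls.
            timed.totals[func.__name__] = (
                timed.totals.get(func.__name__, 0.0) + elapsed
            )
            log.debug("exit %s after %.3fs", func.__name__, elapsed)
    return wrapper


timed.totals = {}  # cumulative seconds per function name


@timed
def what_provides_stub(pkg):
    """Placeholder body standing in for the real lookup."""
    time.sleep(0.01)
    return [pkg]
```

After a run, timed.totals gives a per-function total, which makes it easier to compare the lookup against the other steps.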
We’ve been through this discussion a bit before, and we believe the repoquery needs to be there.
I’m a bit more curious about why you are spending so much time in the operation and most people are not.
When using yum in any sort of important setup, I almost always create a yum mirror with reposync, etc, and even in our testing, we’re not seeing any major timing issues with the yum options at all.
yum_rhn_plugin can sometimes be a very very different story (hence even more reason to mirror content).
For yum, I disable fastestmirror, set hard-coded repo sites, then configure an http_proxy.
For apt, I set hard-coded repo sites, then configure an http_proxy.
This seems much lighter weight than cloning an entire OS distribution, when most packages aren't going to be installed anyways.
PS: if you leave fastestmirror enabled, then the download site will change randomly, so a proxy is worthless. Also, the centralized site that fastestmirror talks to seems to be highly unstable and returns spurious errors, which cause the ansible yum module to abort, but only sometimes. This isn't a bug in ansible, but in the yum python module that ansible uses.
“This seems much lighter weight than cloning an entire OS distribution,”
It’s much smaller than the apt repo, however.
The other bonus is being able to control the package versions on all of your hosts and update when you choose while still coding only state=latest in the ansible content.