Does get_url prevent caching?

I am trying to download some ISOs to multiple machines (via a proxy to
conserve bandwidth).

The ISO is being stored by the proxy, and the machine is using the proxy, but
it is downloading from the upstream source every time. Squid is showing in its
logs:

1454462392.008 532579 192.168.122.10 TCP_CLIENT_REFRESH_MISS/200 632291702 GET http://mirrors.kernel.org/centos/7.2.1511/isos/x86_64/CentOS-7-x86_64-Minimal-1511.iso - HIER_DIRECT/198.145.20.143 application/octet-stream

According to the Squid docs:

TCP_CLIENT_REFRESH_MISS
The client issued a "no-cache" pragma, or some analogous cache control command
along with the request. Thus, the cache has to refetch the object.
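
For anyone who wants to pull the relevant fields out of an access log entry like the one above, here is a minimal Python sketch. It assumes Squid's native log format with whitespace-separated fields in the order shown in that entry; the names `SquidEntry` and `parse_squid_line` are made up for illustration and are not part of Squid or Ansible.

```python
from typing import NamedTuple


class SquidEntry(NamedTuple):
    """Selected fields of one access log entry (hypothetical helper type)."""
    timestamp: float
    elapsed_ms: int
    client: str
    result_code: str
    http_status: int
    size_bytes: int
    method: str
    url: str


def parse_squid_line(line: str) -> SquidEntry:
    """Split one log entry on whitespace and pick out the leading fields."""
    fields = line.split()
    # The fourth field looks like "TCP_CLIENT_REFRESH_MISS/200".
    result_code, _, status = fields[3].partition("/")
    return SquidEntry(
        timestamp=float(fields[0]),
        elapsed_ms=int(fields[1]),
        client=fields[2],
        result_code=result_code,
        http_status=int(status),
        size_bytes=int(fields[4]),
        method=fields[5],
        url=fields[6],
    )


# The entry quoted in this thread, joined back onto a single line.
LINE = (
    "1454462392.008 532579 192.168.122.10 TCP_CLIENT_REFRESH_MISS/200 "
    "632291702 GET "
    "http://mirrors.kernel.org/centos/7.2.1511/isos/x86_64/"
    "CentOS-7-x86_64-Minimal-1511.iso "
    "- HIER_DIRECT/198.145.20.143 application/octet-stream"
)

entry = parse_squid_line(LINE)
print(entry.result_code, entry.http_status, entry.size_bytes)
```

Filtering a whole log on `entry.result_code == "TCP_CLIENT_REFRESH_MISS"` would then show which clients trigger the refetches.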

Using a standard wget (or, say, yum to retrieve packages) does not cause
CLIENT_REFRESH_MISSes.

Is there something in the get_url code that sends a no-cache pragma? Or
maybe it is not turning off some default option in the underlying urllib
(or whatever it uses under the hood)?
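
One way to check this without reading the module source is to point the client at a throwaway local listener and look at the headers it actually sends. The Python sketch below is only an illustration: `RecordingHandler`, `capture_headers` and `forces_refresh` are hypothetical names, and it exercises plain `urllib.request` rather than get_url itself. To test another client, run the same kind of listener and aim that client at it.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

captured = {}


class RecordingHandler(BaseHTTPRequestHandler):
    """Record the request headers of each GET, then answer with a tiny body."""

    def do_GET(self):
        # Header names are stored lower-cased to simplify later lookups.
        captured.update({k.lower(): v for k, v in self.headers.items()})
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def capture_headers(extra_headers=None):
    """Serve exactly one request on a free local port and return its headers."""
    captured.clear()
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)
    thread = threading.Thread(target=server.handle_request, daemon=True)
    thread.start()
    # An empty ProxyHandler bypasses any http_proxy set in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(
        "http://127.0.0.1:%d/" % server.server_port,
        headers=extra_headers or {},
    )
    with opener.open(request, timeout=5) as response:
        response.read()
    thread.join(timeout=5)
    server.server_close()
    return dict(captured)


def forces_refresh(headers):
    """True if the request carries a no-cache Pragma or Cache-Control value."""
    return (
        "no-cache" in headers.get("pragma", "").lower()
        or "no-cache" in headers.get("cache-control", "").lower()
    )


plain = capture_headers()
forced = capture_headers({"Cache-Control": "no-cache"})
print(forces_refresh(plain), forces_refresh(forced))
```

If a client's captured headers make `forces_refresh` return true, that client is asking the proxy to refetch, which would match the quoted description of TCP_CLIENT_REFRESH_MISS.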

j

While I’d be interested to know the answer to this as well, why don’t you just download the ISO to a local machine, then have your Ansible play grab the ISO from that machine?

I'd say it has nothing to do with get_url or the fact that Ansible is involved. It's more likely to do with the configuration of Squid, particularly http://www.squid-cache.org/Doc/config/maximum_object_size/.
The default is 4 MB, which is significantly smaller than the roughly 600 MB CentOS ISO.
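
To put numbers on that comparison, here is a small Python sketch. It assumes the 4 MB default means 4 * 1024 * 1024 bytes and that an object is cacheable only if it is no larger than the limit. The byte count is the one from the access log entry in the original post, which may include response headers. `fits_size_limit` is a hypothetical helper, not a Squid function.

```python
import math

MIB = 1024 * 1024

# Assumption: the 4 MB default quoted above, read as 4 MiB.
DEFAULT_MAX_OBJECT_SIZE = 4 * MIB

# Bytes delivered, taken from the access log entry in the original post.
LOGGED_SIZE = 632291702


def fits_size_limit(object_bytes, max_object_bytes):
    """True if an object of this size is within the configured limit."""
    return object_bytes <= max_object_bytes


fits_default = fits_size_limit(LOGGED_SIZE, DEFAULT_MAX_OBJECT_SIZE)
# Smallest whole-MiB limit that would admit the logged object.
needed_mib = math.ceil(LOGGED_SIZE / MIB)

print(fits_default, needed_mib)
```

Under those assumptions the ISO is far above the default, so maximum_object_size would have to be raised to at least `needed_mib` MB (with some headroom) before Squid could store it.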