s3 module with checksums for multipart uploads

Hi,

I'm having an issue with the s3 module in ansible 1.3.3:

"Files uploaded with multipart of s3 are not supported with checksum,
unable to compute checksum."

The task definition looks like this:

action: s3 bucket=my_bucket
               object=/foo/bar.txt
               aws_access_key=xxxx
               aws_secret_key="yyyy"
               dest=/opt/bar.txt
               mode=get

So the initial GET works, but any subsequent GET fails, presumably due
to the checksum issue. Am I maybe doing something wrong?

Cheers,

Ben

$ ansible --version
ansible 1.3.3 (release1.3.3 291649c15d) last updated 2013/10/23
10:06:59 (GMT +100)

Can you share the error you're receiving back?

Hi James,

Please find the -vvv output from ansible below.

Cheers,

Ben

TASK: [Get all of the packages from S3 onto the box] **************************

<cdr1.aws.acme.com> ESTABLISH CONNECTION FOR USER: root

<cdr1.aws.acme.com> EXEC ['ssh', '-tt', '-q', '-o',
'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o',
'ControlPath=/Users/0x6e6562/.ansible/cp/ansible-ssh-%h-%p-%r', '-o',
'Port=22', '-o', 'KbdInteractiveAuthentication=no', '-o',
'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey',
'-o', 'PasswordAuthentication=no', '-o', 'User=root', '-o',
'ConnectTimeout=10', 'cdr1.aws.acme.com', "/bin/sh -c 'mkdir -p
$HOME/.ansible/tmp/ansible-1382670810.23-62877702454141 && echo
$HOME/.ansible/tmp/ansible-1382670810.23-62877702454141'"]

<cdr1.aws.acme.com> REMOTE_MODULE s3 bucket=cdr-deployment
object=/3rd-party/jdk-7u45-linux-x64.gz aws_access_key=xxx
aws_secret_key=xxx dest=/opt/downloads/jdk-7u45-linux-x64.gz mode=get

<cdr1.aws.acme.com> PUT
/var/folders/z8/n1d02m0j26z6wv2p_793ls4m0000gn/T/tmpUDmKUP TO
/root/.ansible/tmp/ansible-1382670810.23-62877702454141/s3

<cdr1.aws.acme.com> EXEC ['ssh', '-tt', '-q', '-o',
'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o',
'ControlPath=/Users/0x6e6562/.ansible/cp/ansible-ssh-%h-%p-%r', '-o',
'Port=22', '-o', 'KbdInteractiveAuthentication=no', '-o',
'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey',
'-o', 'PasswordAuthentication=no', '-o', 'User=root', '-o',
'ConnectTimeout=10', 'cdr1.aws.acme.com', "/bin/sh -c
'/usr/bin/python2
/root/.ansible/tmp/ansible-1382670810.23-62877702454141/s3; rm -rf
/root/.ansible/tmp/ansible-1382670810.23-62877702454141/ >/dev/null
2>&1'"]

failed: [cdr1.aws.acme.com] => (item=jdk-7u45-linux-x64.gz) =>
{"failed": true, "item": "jdk-7u45-linux-x64.gz"}

msg: Files uploaded with multipart of s3 are not supported with
checksum, unable to compute checksum.

Has there been any resolution on this issue? I am running into the same problem, and uploading the files to S3 as single-part rather than multipart is not really an option.

Thanks,

Chris

I don’t believe so; however, you might try the workaround suggested in this GitHub issue:

https://github.com/ansible/ansible/issues/5442

So I’ve dug into this today, and the short answer is: it’s possible, but difficult, to calculate the MD5 sum of multipart uploads. The main issue is that the size (in MB) of the parts uploaded needs to be known in advance; however, Amazon throws that information away once the multipart upload is complete. We could try to guess, but that would take multiple passes and could be very slow (imagine calculating the MD5 sum for 100 parts of a 10GB file 5-10 times - that’s a lot of disk activity and time for a single file). Wrong guesses would also mean people could incur additional S3 fees unnecessarily, which is obviously something we want to avoid at all costs.
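To make the part-size problem concrete: S3 computes a multipart ETag as the MD5 of the concatenated per-part MD5 digests, suffixed with "-<number of parts>". A minimal sketch of that scheme (multipart_etag is a hypothetical helper, not part of the s3 module) shows why you cannot verify the checksum without knowing the part size that was used at upload time:

```python
import hashlib


def multipart_etag(path, part_size):
    # Compute the ETag S3 would assign to this file if it were uploaded
    # in parts of part_size bytes: MD5 over the concatenated per-part
    # MD5 digests, plus a "-<part count>" suffix. A different part_size
    # yields a completely different ETag, which is why guessing requires
    # re-reading the whole file once per candidate size.
    part_md5s = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(part_size)
            if not chunk:
                break
            part_md5s.append(hashlib.md5(chunk).digest())
    combined = hashlib.md5(b"".join(part_md5s))
    return "%s-%d" % (combined.hexdigest(), len(part_md5s))
```

Verifying a downloaded file would mean running this once per guessed part size until (or unless) one matches the remote ETag - hence the multiple full passes over the file mentioned above.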

So, for the foreseeable future, the s3 module will have to remain dumb about multipart uploads. The primary workaround is, as suggested above, to re-upload the file without using the multipart upload feature.
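For instance, re-uploading through the s3 module itself avoids multipart entirely, since the module uploads in a single part; a one-off task along these lines (parameter names are illustrative - in particular, I'm assuming mode=put takes a local src path) would restore a plain-MD5 ETag:

```yaml
# Hypothetical one-off re-upload, assuming the s3 module's put mode
# accepts a local src path; a single-part upload gives the object a
# plain MD5 ETag, so later GETs can verify the checksum again.
action: s3 bucket=cdr-deployment
           object=/3rd-party/jdk-7u45-linux-x64.gz
           src=/opt/downloads/jdk-7u45-linux-x64.gz
           mode=put
```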

As Chris mentioned, uploading the files without multipart was not an option.

Having recently run into the same issue with the same constraint, I wanted to suggest an alternate workaround: simply rename or delete the file currently on disk, which prevents the s3 module from attempting to compute the checksum. This only works for GET operations.
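This workaround can be sketched as a pair of tasks in the same inline style as the failing play above (paths and names reuse the example from earlier in the thread and are illustrative):

```yaml
# Remove the stale local copy first so the s3 module has no file to
# checksum, then re-fetch the object; this sidesteps the multipart
# ETag comparison entirely, at the cost of always re-downloading.
- name: Remove the local copy so no checksum comparison is attempted
  action: file path=/opt/downloads/jdk-7u45-linux-x64.gz state=absent

- name: Fetch the object from S3
  action: s3 bucket=cdr-deployment
             object=/3rd-party/jdk-7u45-linux-x64.gz
             dest=/opt/downloads/jdk-7u45-linux-x64.gz
             mode=get
```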