One of the nice new features we've added to the PR bot is the ability
to say "hey, this PR fails Travis! Therefore it needs revision.
Submitter, please fix." Saves us a lot of time.
But only if it's correct! Frequently, we'll see a PR fail Travis
because of an issue in an unrelated module. Here's an example:
This has to do with how GitHub/Travis branch testing is done: right now we test the PR as branched from a set point in history, not against the current HEAD.
If at any point we merge a commit that breaks Travis, or add a new test that does not pass, any PR cut from that commit will inherit the failure; that is why you see the same error over and over.
The other issue is that PRs older than the current HEAD might pass the tests from when they were submitted but not the current tests, which lets them introduce new issues.
All of this has contributed to a 'snowball effect' over time: we see a PR, we see that the failure is unrelated, and we merge it, but it introduces a new problem that we did not detect because the old failure obscured it (or, in most cases, because the tests did not exist when the PR was opened). This is now the major contributor to Travis issues, and it is very hard to fix because we have many PRs in the queue that might or might not trigger it.
I think we can solve this by changing how we test on Travis: instead of testing the PR checkout as-is, test against HEAD with the PR's commits cherry-picked on top. Part of the 'snowball' will still exist, since this only helps newly created PRs, but the effect will be limited to the PRs already in the queue.
This is theoretical at this point; it is something we discussed in core. James, I think, might know better what the scope and effort are to get this running.
It came up after the change to use the run_tests.sh script revealed the ability to run arbitrary commands, which could include git operations on the checkout to do what I describe above.
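Purely as a sketch of that idea (the helper name and branch names below are illustrative, not the actual bot code), the git side could look something like:

```shell
# Hypothetical helper: rebuild the PR on top of the current tip of the
# target branch before testing, instead of testing it at its stale fork point.
test_pr_against_head() {
    base_branch="$1"   # e.g. devel (origin/devel on Travis)
    pr_branch="$2"     # the submitter's branch

    # Find the point where the PR diverged from the base branch.
    merge_base=$(git merge-base "$base_branch" "$pr_branch") || return 1

    # Start a throwaway branch from the *current* tip of the base branch...
    git checkout -B test-against-head "$base_branch" || return 1

    # ...and replay only the PR's own commits on top of it.
    git cherry-pick "$merge_base..$pr_branch" || return 1

    # run_tests.sh already lets us run arbitrary commands, so the suite now
    # exercises current HEAD + the PR rather than the old fork point.
    ./run_tests.sh
}
```

Since run_tests.sh can run arbitrary commands, something along these lines could slot in before the existing test invocation.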
We could also exclude particular modules that we know are particularly
susceptible to these breakages-in-time, correct? It really does seem to
come back to two or three particular breakages, and if that means
skipping Travis syntax checks on these modules for a while, I'm ok
with that.
It is not that a particular module is prone to breakage; failures are just much more visible on modules that are popular and get a lot of updates and new features.
I doubt we have tests for znode, so I'm going to guess this is a Python
compatibility issue? I don't know the particular issue.
Yep. It's a broken 2.4 check that somehow keeps slipping in.
But: if you're telling me that we see these in the AWS PRs all the
time, then maybe the right answer is to turn off Travis checking in PR
bot until we've got new Travis code that tests cherry-picked PRs against
HEAD.
Or we could change it to an alert: "just a note, @submitter, this is
currently failing travis, so you might want to check to see if there's
an issue" without actually changing the PR's state to needs_revision.
I've done some investigation into testing only changed files. I'll look into this again.
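For what it's worth, the changed-file list itself is cheap to get from git (the helper name here is made up for illustration):

```shell
# Hypothetical helper: list only the files a PR actually touches, by diffing
# against the merge base rather than the tip of the base branch. This keeps
# files that changed on devel since the fork point out of the list.
pr_changed_files() {
    base_branch="$1"
    pr_branch="$2"
    git diff --name-only "$(git merge-base "$base_branch" "$pr_branch")" "$pr_branch"
}

# e.g. run syntax checks on just the touched python files:
#   pr_changed_files devel pr-branch | grep '\.py$' | xargs -r python -m py_compile
```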
Right now, while we are still dealing with old PRs, some CI hasn't run. As we close out those old ones over time, this will become less likely, assuming we don't merge PRs with failures.
Also, after fixing the other files you can restart the build in Travis; that should clear things up.
The problem with that is that the issue can only be removed from the PR by rebasing; redoing the tests only gets you the old error, because the branch from which the PR was cut is not the same as the one you pushed the fix to.
I've seen other CI systems that run the tests against current HEAD (not the parent of the PR). Tests pass, reviews happen, time passes, and once everything is up to snuff, instead of just merging, a second round of tests runs to see whether the change still passes against a potentially NEWER HEAD, to avoid merging something that's broken.
Generally this means 2x the tests per PR, but it gates bad changes far better.
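As a sketch of that second, pre-merge round (the gate_merge helper is hypothetical; it reuses the same run_tests.sh idea from above, and the merge could just as easily be a cherry-pick):

```shell
# Hypothetical merge gate: round one ran at review time; right before merging,
# re-test the change against the base branch as it is *now*, and only
# fast-forward the base branch if that second round passes.
gate_merge() {
    base_branch="$1"
    pr_branch="$2"

    # Build the candidate merge on a throwaway branch from the current tip.
    git checkout -B premerge-check "$base_branch" || return 1
    git merge --no-ff --no-edit "$pr_branch" || return 1

    # Second round of tests, against a potentially newer HEAD.
    if ! ./run_tests.sh; then
        echo "blocked: $pr_branch fails against current $base_branch"
        return 1
    fi

    # Safe: publish the already-tested merge commit.
    git checkout -q "$base_branch"
    git merge --ff-only premerge-check
}
```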
znode.py is currently in the Travis exclusion list for Python 2.4. However, people seem to be submitting PRs from repos that haven't been updated in a while.