PEP8 compliance in codebase

Hey, Our CI system runs pep8 and pyflakes separately on every commit
as well and it's been something I've wanted to work on cleaning up
since I discovered it. However, v2.0 has made it harder for me since
time has become more of a premium so your help would be welcome.

Not everyone here is onboard with all of the pep8 style suggestions
(the one I hear most frequently is 80 column lines) so i think the
best thing to do is probably take individual (or a small number) of
flake8 warnings and work on fixing those throughout the code base (can
be one or several pull requests depending on how you best work) and
then move on to the next one. That way you'll know that everyone
here is fine with that particular style change before you spend time
fixing the code for it.

Glancing briefly at current pep8 warnings, I see that four space
indent and mixing tabs and spaces are probably low hanging fruit that
would get the numbers down and make it easier to see what else needs
fixing.

Spaces around operators, brackets, braces, and parenthesis probably
should be submitted separately from indentation but also affects many
files.

Deprecations (like .has_key()) should be acceptable to everyone.

multiple statements on one line is a personal pet peeve of mine but
there's not too many of those.

pyflakes shows somethings that are actual errors that need fixing
(undefined name errors - although many of those are false positives
inside of modules. modules are currently concatenated with snippets of
generic code so some things are defined in the generic code).

If you have some personal favorite style bugs that you'd like to
concentrate on, mention them here so that we can get everyone on board
with the style fix and then you can get busy.

-Toshio

So we have some disagreement here, I would hold off on any style PRs.

I find that style updates w/o having enforcement first will just be a
waste of everyone's time as code will drift from the style as we
accept PRs.
It also breaks attribution and makes it harder for us to look at
why/when was a code made in such a way, specially when judging new
changes against that code.
Aside from forcing many rebases for style reasons, not because bugs
got fixed or features added (which already cause many PRs to need
rebases).

Considering our current backlog and issues before us, I really don't
want to spend time in minor style changes nor reviewing them.

If some code is so bad as to be a great maintenance burden, I would
consider a PR, for the rest I think we can live with extra or missing
spaces around operators.
I think we should look at style enforcement way before we even think
about allowing for style only PRs and only after we have dealt with
the many other issues I consider higher priority than this.

I think that some style updates that can also have impact on the code
should be added even now. For instance, indentation probably should
be accepted now as non-four space indent and mixing tabs with spaces
may have undesirable effects.

I do see that attribution is lost but on the other hand it's better to
have style changes made in a "style PR" than to have style fixes mixed
in with someone else's actual code changes because they had to change
the style of unrelated sections so they could read and understand it.
That practice hampers both attribution and understanding of the actual
meaningful changes.

Enforcement is great and I'd love for us to have that. But
enforcement is easy if you have a clean base (run checker -- if output
is clean, commit) and hard if you have a base that already fails 2000
tests (run the checker on upstream. Record the errors. Run the
checker on the code with the PR applied. Compare errors and decide
whether any of the errors are new).

-Toshio

If it affects how code works, I do not consider it a style change it
is a bugfix.

If they change unrelated sections as part of a fix, I consider that a
style change, if the section was so bad it was unreadable, that is an
acceptable change as I stated above.

For the enforcement I agree, it s a lot easier with a clean base, but
it is useless to cleanup the base and then have a thousand PRs which
can 'dirty' it again. Without having an enforcement in place or to be
put in place right after the style cleanup, it really makes little
sense to me to go down this road.

Aside from the attribution issues, which I don't see a way around, in
many cases the nature of our code makes a lot of the linting and style
checks give out false positives, specifically with modules. One
solution would be to ignore certain types of issues, but that will
also hide the actual real positives.

I'm not against having a style guide and enforcing it, I just see pure
style PRs as near useless at this point, if not counter productive.

Also I think we should be putting our efforts elsewhere right now.

Some style changes don't affect code now but they can have an adverse
impact in the future. Indent falls into this category. For a
language that defines blocks via indentation, python is very lenient
about indentation so many things work but later code additions make
those indentation levels SyntaxErrors or logic errors.

Really a productive thing to be doing here is figuring out if there's
any help that people can give in this area that we would accept rather
than hand waving about sometime in the future... So here's a few that
i think we might be able to agree on from our other conversations.

* non-4 space indent. non-4 space indent leads to SyntaxErrors later
when more code is added to the block. For example:

def test():
   if True:
        do_this()
+ elif True:
+ do_that()

When the elif is added at the proper 4 space indent, the 3-space
indent of "if True" makes it a SyntaxError.

* mixed tabs and spaces: In python, tabs are equivalent to 8 space
indent so tabs and spaces can be mixed and the interpreter will do
something with it. however, in editors, tabs are not limited to 8
spaces. A good many programmers have it set to display 4 spaces as
that's the indentation level that they expect when they hit the tab
key on their keyboard. reading a file that has mixed tabs and spaces
is insanity in this case. Writing a file that is currently all tab
can be dangerous if you don't realize that the existing indentation is
100% tabs. You may add four space indent that looks right since your
editor is displaying tabs a four spaces but you're really putting
everything at a different indentation level.

* A github hook or tie in to travis that does the enforcement you are
after. It would need to compare the style errors in the parent commit
to the style errors after the commit is applied and then fail if the
commit has new errors in it.

-Toshio

I agree on all points.

After reading all of this, I am still confused on what is the final agreement, I am planning to send some new PR’s and I am confused on where I should fix style within the same PR or issue a separate one, not fixing the wrong style at all and leaving it alone sounds to me like a crime.

VM

We do not accept style only changes. Style fixes should only be made, and done so minimally, on lines where you are modifying code as a result of a new feature or bug fix.