Page MenuHomePhabricator

Introduce Python code formatters usage
Open, MediumPublic

Description

Context: during a discussion at Infrastructure Foundations team meeting the subject of tools automatically (re)formatting Python code came up.
This task is to track the rationale for having such tools for (new) Python code and implementation.

Pros:

  1. Authors don't have to think about code formatting
  2. Code formatting comments no longer required during code reviews
  3. Code readability is the same across authors

Cons:

  1. Local setup needed: formatter installation and editor integration at least
  2. Similar setup needed for integration with CI to fail if code isn't formatted as required
  3. If/when the code formatter output changes e.g. across upgrades we'll have to adjust for that

Open questions:

  1. Which formatter(s) to use?
  2. Require standard formatting for old and new code or only new or a mix depending on the codebase?
  3. How much local setup is needed depending on the formatter(s) and the editor(s)?
  4. Likely other questions

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 12 2018, 9:54 AM
ema triaged this task as Medium priority.Dec 12 2018, 10:48 AM
ema added a subscriber: ema.
Eevans added a subscriber: Eevans.Dec 12 2018, 3:56 PM
crusnov added a subscriber: crusnov.EditedDec 12 2018, 4:30 PM

Just to throw in, since this was discussed perhaps 2 weeks ago on the meeting, I installed Black with the width settings we normally use for Cumin and have been using it regularly through emacs (blacken plugin). It seems to do a nice job and produce expected output.

Volans added a subscriber: Volans.Dec 12 2018, 4:38 PM

I agree that a Python formatter simplify the life of reviewers avoiding unnecessary comments over the style, but has also some draw backs as highlighted in the description.

I'd add to the Con #1 also the git pre-commit hook part, that is probably mandatory.

This article [1] seems to have a decent summary of current available tools with some examples.

On my side I've done a test on the cumin codebase with black. The results are:

  • all the ignore comments for pylint or any other validation tool were misplaced (moved to the last line when splitting) and require to manually move them to the first line [one off]
  • it doesn't pass flake8:
    • E203 whitespace before ':' (this seems a bug on their side, it's for a list slice _ARGV[index + 1 :]
    • E501 line too long (144 longest one)
  • it doesn't pass pylint (same thing for too long lines)

I personally don't like the output of black as I find it much less readable and more vertically spread than the original. We live in a world with wide but short screens so splitting into multiple lines means that less context will be readable in the same screen and the eye has to track much more indentation levels. Of course this is mostly subjective, looking forward to hear different opinions.

[1] https://medium.com/3yourmind/auto-formatters-for-python-8925065f9505

I like Black, but any formatter, as long as the barrier to entry is low, is a good idea.

The value to me is in a faster review cycle and less friction between author and reviewer. By handing control of style to a tool, it means less cognitive load at the critical point where thinking inductively about the impact to our systems and reconciling intent with the implementation is more important/valuable than where a line should (or shouldn't) split.

I don't have much to add besides saying I've been using Black (with VSCode) for all my new code here. I'd vote for it to be adopted if we had a poll but I understand the drawbacks mentioned for a larger codebase like ours. Just my 2c.

Alternatives would be autopep8 and yapf.

I like black too but from but from https://black.readthedocs.io/en/stable/installation_and_usage.html it tied to having python 3.6 installed.

Black can be installed by running pip install black. It requires Python 3.6.0+ to run

With stretch shipping with 3.4 (what do the Mac OS X versions do?) it might be a bit too restrictive to require it.

I am a bit wary of @Volans comment about vertical space usage increase. I haven't noticed that yet, but It's a point to consider indeed.

faidon added a subscriber: faidon.EditedDec 18 2018, 8:37 PM

On my side I've done a test on the cumin codebase with black. The results are:

  • all the ignore comments for pylint or any other validation tool were misplaced (moved to the last line when splitting) and require to manually move them to the first line [one off]
  • it doesn't pass flake8:
    • E203 whitespace before ':' (this seems a bug on their side, it's for a list slice _ARGV[index + 1 :]

Do you have examples of these? I wonder whether these are architectural issues or (potentially transient) bugs that we need to report against upstream.

    • E501 line too long (144 longest one)
  • it doesn't pass pylint (same thing for too long lines)

I'm not sure why you're hitting this, perhaps local misconfiguration? Black's default line length is 88, but is one of the few things that is configurable (I use -l 80). But yes, we'll need to align black's, flake8's and pylint's line length expectations :) Flake8 is actually mentioned in Black's README with a rationale and various mitigations (e.g. using Bugbear).

I personally don't like the output of black as I find it much less readable and more vertically spread than the original. We live in a world with wide but short screens so splitting into multiple lines means that less context will be readable in the same screen and the eye has to track much more indentation levels.

Do you have an example of something that Black splits into multiple lines which would be (even subjectively!) more readable without doing that?

faidon added a comment.EditedDec 18 2018, 8:47 PM

I like black too but from but from https://black.readthedocs.io/en/stable/installation_and_usage.html it tied to having python 3.6 installed.

Black can be installed by running pip install black. It requires Python 3.6.0+ to run

With stretch shipping with 3.4 (what do the Mac OS X versions do?) it might be a bit too restrictive to require it.

This was a problem for me, so what I did was build backports of Py3.6/7 to stretch. We've decided to put them in a separate component in our apt repository so that people can install them on their own systems (among other reasons).

So what you can do now is:

echo "deb http://apt.wikimedia.org/wikimedia stretch-wikimedia component/pyall" >> /etc/apt/sources.list
apt update
apt install python3.7 pipsi
pipsi install --python /usr/bin/python3.7 black
export PATH="~/.local/bin:$PATH" # and maybe put this to your .bashrc
black --help

(Instead of or in addition to pips one can use pre-commit, which supposedely handles the differing Python version and venv etc., but I haven't tried it)

On my side I've done a test on the cumin codebase with black. The results are:

  • all the ignore comments for pylint or any other validation tool were misplaced (moved to the last line when splitting) and require to manually move them to the first line [one off]
  • it doesn't pass flake8:
    • E203 whitespace before ':' (this seems a bug on their side, it's for a list slice _ARGV[index + 1 :]

Do you have examples of these? I wonder whether these are architectural issues or (potentially transient) bugs that we need to report against upstream.

Re-reading their docs it's intended behaviour and apparently is flake8 at fault here. See https://black.readthedocs.io/en/stable/the_black_code_style.html#slices

    • E501 line too long (144 longest one)
  • it doesn't pass pylint (same thing for too long lines)

I'm not sure why you're hitting this, perhaps local misconfiguration? Black's default line length is 88, but is one of the few things that is configurable (I use -l 80). But yes, we'll need to align black's, flake8's and pylint's line length expectations :) Flake8 is actually mentioned in Black's README with a rationale and various mitigations (e.g. using Bugbear).

The problem with long lines is:

Black will try to respect that. However, sometimes it won’t be able to without breaking other rules. In those rare cases, auto-formatted code will exceed your allotted limit.

Source: https://black.readthedocs.io/en/stable/the_black_code_style.html#line-length

So for example it generates this diff (with -l 120 -N -S):

-    'ls_partial_success_ratio_re':
-        r"[4-6]0\.0% \([2-3]/5\) success ratio \(< 100\.0% threshold\) for command: 'ls -la /tmp/maybe'\. Aborting.",
-    'ls_partial_success_threshold_ratio':
-        "60.0% (3/5) success ratio (>= 50.0% threshold) for command: 'ls -la /tmp/maybe'.",
+    'ls_partial_success_ratio_re': r"[4-6]0\.0% \([2-3]/5\) success ratio \(< 100\.0% threshold\) for command: 'ls -la /tmp/maybe'\. Aborting.",
+    'ls_partial_success_threshold_ratio': "60.0% (3/5) success ratio (>= 50.0% threshold) for command: 'ls -la /tmp/maybe'.",

that goes over the limit.

I personally don't like the output of black as I find it much less readable and more vertically spread than the original. We live in a world with wide but short screens so splitting into multiple lines means that less context will be readable in the same screen and the eye has to track much more indentation levels.

Do you have an example of something that Black splits into multiple lines which would be (even subjectively!) more readable without doing that?

DISCLAIMER: those are totally subjective examples.

  1. One example for the vertical verbosity (the old one is much less readable here in Phabricator as it wrap lines :( ):
-    parser.add_argument('-m', '--mode', choices=(sync_mode, async_mode),
-                        help=('Execution mode, required when there are multiple COMMANDS to be executed. In sync mode, '
-                              'execute the first command on all hosts, then proceed with the next one only if '
-                              '-p/--success-percentage is reached. In async mode, execute on each host independently '
-                              'from each other, the list of commands, aborting the execution on any given host at the '
-                              'first command that fails.'))
+    parser.add_argument(
+        '-m',
+        '--mode',
+        choices=(sync_mode, async_mode),
+        help=(
+            'Execution mode, required when there are multiple COMMANDS to be executed. In sync mode, '
+            'execute the first command on all hosts, then proceed with the next one only if '
+            '-p/--success-percentage is reached. In async mode, execute on each host independently '
+            'from each other, the list of commands, aborting the execution on any given host at the '
+            'first command that fails.'
+        ),
+    )
  1. Another vertical verbosity example
-        self.pbar_ok = tqdm(desc='PASS', total=num_hosts, leave=True, unit='hosts', dynamic_ncols=True,
-                            bar_format=colorama.Fore.GREEN + self.bar_format, file=sys.stderr)
+        self.pbar_ok = tqdm(
+            desc='PASS',
+            total=num_hosts,
+            leave=True,
+            unit='hosts',
+            dynamic_ncols=True,
+            bar_format=colorama.Fore.GREEN + self.bar_format,
+            file=sys.stderr,
+        )
  1. Exception formatting
-                raise InvalidQueryError(("Unable to parse the query '{query}' with the global grammar and no "
-                                         "default backend is set:\n{error}").format(query=query_string, error=e))
+                raise InvalidQueryError(
+                    (
+                        "Unable to parse the query '{query}' with the global grammar and no "
+                        "default backend is set:\n{error}"
+                    ).format(query=query_string, error=e)
+                )
  1. Quick logging (where it doesn't remove the now unnecessary parentheses):
-            self.logger.debug(("Provided batch_size '%d' is greater than the number of hosts '%d'"
-                               ", using '%d' as value"), batch_size, hosts_size, hosts_size)
+            self.logger.debug(
+                ("Provided batch_size '%d' is greater than the number of hosts '%d'" ", using '%d' as value"),
+                batch_size,
+                hosts_size,
+                hosts_size,
+            )
  1. Multiline if conditions:
-        if value is not None and (not isinstance(value, float) or
-                                  not (0.0 <= value <= 1.0)):  # pylint: disable=superfluous-parens
-            raise WorkerError("success_threshold must be a float beween 0 and 1, got '{value_type}': {value}".format(
-                value_type=type(value), value=value))
+        if value is not None and (
+            not isinstance(value, float) or not (0.0 <= value <= 1.0)
+        ):  # pylint: disable=superfluous-parens
+            raise WorkerError(
+                "success_threshold must be a float beween 0 and 1, got '{value_type}': {value}".format(
+                    value_type=type(value), value=value
+                )
+            )
  1. Shrinks things that because of the context might be more readable in multilines, for example dependencies, so that are also more explicit with diff when modifying them:
-    'with-openstack': [
-        'keystoneauth1>=2.4.1',
-        'python-keystoneclient>=2.3.1',
-        'python-novaclient>=3.3.1',
-    ],
-
+    'with-openstack': ['keystoneauth1>=2.4.1', 'python-keystoneclient>=2.3.1', 'python-novaclient>=3.3.1'],

I've tried experimenting how CI integration would look like for repositories that choose to enforce code formatting, e.g. with tox and for example https://pypi.org/project/isort/ + https://pypi.org/project/black/.

[testenv:formattercheck]
skipsdist = True
skip_install = True
basepython = python3.6
deps =
 black
 isort
commands =
  isort --check --line-width 88 --skip-glob '.tox' --skip-glob '.eggs'
  black --check --line-length 88 --exclude '/(\.eggs|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|_build|buck-out|build|dist)/' .
crusnov moved this task from Backlog to In Progress on the SRE-tools board.Feb 14 2019, 5:50 PM

Change 494259 had a related patch set uploaded (by Gilles; owner: Gilles):
[integration/config@master] Add CI for new performance/asoranking repo

https://gerrit.wikimedia.org/r/494259

jbond added a subscriber: jbond.Mar 13 2019, 2:53 PM
GTirloni removed a subscriber: GTirloni.Mar 21 2019, 9:06 PM

Not Python-related but noting it here, I came across this shell formatter which would achieve the same but for shell https://github.com/mvdan/sh

As I've seen some efforts to use black also in other part of the org, ideally it would be nice to have a single way to set it up:

  • black configuration (line length, quotes)
  • tox environment to ensure the code is black'ed in CI
  • easy as much automatic as possible way to integrate it into the development (IDE/editor configs, git hook, etc.)

Agreed with what @Volans said, I've been working on tox-wikimedia (https://pypi.org/project/tox-wikimedia/) which will hopefully be able to contain and be the the authoritative way to set up whichever code formatter we end up using. My POC patch (since black is being used in cloud services repos) is https://gerrit.wikimedia.org/r/c/integration/tox-wikimedia/+/566407

On the topic of code formatters in general, I'm +1, but +0 on black. I'm not a fan of the "uncompromising" philosophy, because I end up compromising when trying to read code rather than when I'm writing it (and I do more of the former). One of my friends created lavender, which alleviated most of the things I don't like about black's imposed code style.

Volans moved this task from In Progress to Backlog on the SRE-tools board.Jul 27 2020, 10:42 AM

Change 554827 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] ci - flake8: update flake8 rules to be compatible with black

https://gerrit.wikimedia.org/r/554827

Change 554825 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] CI - black: update python3 files with black

https://gerrit.wikimedia.org/r/554825

Change 554826 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] CI - black: run black over python2 files

https://gerrit.wikimedia.org/r/554826

Change 553487 had a related patch set uploaded (by Jbond; owner: John Bond):
[operations/puppet@production] CI - taskgen: add black tests for python2 and python3 files

https://gerrit.wikimedia.org/r/553487

Change 621724 had a related patch set uploaded (by Kormat; owner: Kormat):
[operations/software/wmfmariadbpy@master] Add 'black' formatter support.

https://gerrit.wikimedia.org/r/621724

Change 621724 merged by jenkins-bot:
[operations/software/wmfmariadbpy@master] Add 'black' formatter support.

https://gerrit.wikimedia.org/r/621724

Volans added a comment.Sep 1 2020, 8:24 AM

[follow up from a chat in #wikimedia-databases]
Another aspect that should be clarified is how to manage changes introduced by the code formatter itself between different versions of the formatter. Both pinning + manual update or random failures of CI don't seem optimal solutions.

Bstorm added a subscriber: Bstorm.Dec 3 2020, 1:53 AM

FWIW, WMCS uses black with line-length set to 80 for all python and has for a little while. In non-puppet repos, we have tox check for it (any line length other than 80 generates failures). https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding
Our team doesn't agree on text editors, so I can't say we have anything to offer on the setup part of the equation.

Just noting here that we would like not to have CI call it all broken in the future, but if we have to change it all in puppet as a special case, we can. We run checks for it with tox outside of puppet like:

[testenv:black]
description = check black formatter
basepython = {[default]basepython}
commands = black -l 80 -t py37 --check \
           maintain_kubeusers \
           maintain_kubeusers.py
deps = black

Puppet is a more widely shared space, so we generally just try to make sure the output is ok for flake8 after black hits it.

Change 554825 abandoned by Jbond:
[operations/puppet@production] CI - black: update python3 files with black

Reason:
This change is now IMO too old. to much will have changed and we should regenerate this CR from scratch if we want to move ahead wit it

https://gerrit.wikimedia.org/r/554825

Change 554826 abandoned by Jbond:
[operations/puppet@production] CI - black: run black over python2 files

Reason:
This change is now IMO too old. to much will have changed and we should regenerate this CR from scratch if we want to move ahead wit it

https://gerrit.wikimedia.org/r/554826