Page MenuHomePhabricator

Introduce Python code formatters usage
Open, NormalPublic

Description

Context: during a discussion at Infrastructure Foundations team meeting the subject of tools automatically (re)formatting Python code came up.
This task is to track the rationale for having such tools for (new) Python code and implementation.

Pros:

  1. Authors don't have to think about code formatting
  2. Code formatting comments no longer required during code reviews
  3. Code readability is the same across authors

Cons:

  1. Local setup needed: formatter installation and editor integration at least
  2. Similar setup needed for integration with CI to fail if code isn't formatted as required
  3. If/when the code formatter output changes e.g. across upgrades we'll have to adjust for that

Open questions:

  1. Which formatter(s) to use?
  2. Require standard formatting for old and new code or only new or a mix depending on the codebase?
  3. How much local setup is needed depending on the formatter(s) and the editor(s)?
  4. Likely other questions

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 12 2018, 9:54 AM
ema triaged this task as Normal priority.Dec 12 2018, 10:48 AM
ema added a subscriber: ema.
Eevans added a subscriber: Eevans.Dec 12 2018, 3:56 PM
crusnov added a subscriber: crusnov.EditedDec 12 2018, 4:30 PM

Just to throw in, since this was discussed perhaps 2 weeks ago on the meeting, I installed Black with the width settings we normally use for Cumin and have been using it regularly through emacs (blacken plugin). It seems to do a nice job and produce expected output.

Volans added a subscriber: Volans.Dec 12 2018, 4:38 PM

I agree that a Python formatter simplify the life of reviewers avoiding unnecessary comments over the style, but has also some draw backs as highlighted in the description.

I'd add to the Con #1 also the git pre-commit hook part, that is probably mandatory.

This article [1] seems to have a decent summary of current available tools with some examples.

On my side I've done a test on the cumin codebase with black. The results are:

  • all the ignore comments for pylint or any other validation tool were misplaced (moved to the last line when splitting) and require to manually move them to the first line [one off]
  • it doesn't pass flake8:
    • E203 whitespace before ':' (this seems a bug on their side, it's for a list slice _ARGV[index + 1 :]
    • E501 line too long (144 longest one)
  • it doesn't pass pylint (same thing for too long lines)

I personally don't like the output of black as I find it much less readable and more vertically spread than the original. We live in a world with wide but short screens so splitting into multiple lines means that less context will be readable in the same screen and the eye has to track much more indentation levels. Of course this is mostly subjective, looking forward to hear different opinions.

[1] https://medium.com/3yourmind/auto-formatters-for-python-8925065f9505

I like Black, but any formatter, as long as the barrier to entry is low, is a good idea.

The value to me is in a faster review cycle and less friction between author and reviewer. By handing control of style to a tool, it means less cognitive load at the critical point where thinking inductively about the impact to our systems and reconciling intent with the implementation is more important/valuable than where a line should (or shouldn't) split.

I don't have much to add besides saying I've been using Black (with VSCode) for all my new code here. I'd vote for it to be adopted if we had a poll but I understand the drawbacks mentioned for a larger codebase like ours. Just my 2c.

Alternatives would be autopep8 and yapf.

I like black too but from but from https://black.readthedocs.io/en/stable/installation_and_usage.html it tied to having python 3.6 installed.

Black can be installed by running pip install black. It requires Python 3.6.0+ to run

With stretch shipping with 3.4 (what do the Mac OS X versions do?) it might be a bit too restrictive to require it.

I am a bit wary of @Volans comment about vertical space usage increase. I haven't noticed that yet, but It's a point to consider indeed.

faidon added a subscriber: faidon.EditedDec 18 2018, 8:37 PM

On my side I've done a test on the cumin codebase with black. The results are:

  • all the ignore comments for pylint or any other validation tool were misplaced (moved to the last line when splitting) and require to manually move them to the first line [one off]
  • it doesn't pass flake8:
    • E203 whitespace before ':' (this seems a bug on their side, it's for a list slice _ARGV[index + 1 :]

Do you have examples of these? I wonder whether these are architectural issues or (potentially transient) bugs that we need to report against upstream.

    • E501 line too long (144 longest one)
  • it doesn't pass pylint (same thing for too long lines)

I'm not sure why you're hitting this, perhaps local misconfiguration? Black's default line length is 88, but is one of the few things that is configurable (I use -l 80). But yes, we'll need to align black's, flake8's and pylint's line length expectations :) Flake8 is actually mentioned in Black's README with a rationale and various mitigations (e.g. using Bugbear).

I personally don't like the output of black as I find it much less readable and more vertically spread than the original. We live in a world with wide but short screens so splitting into multiple lines means that less context will be readable in the same screen and the eye has to track much more indentation levels.

Do you have an example of something that Black splits into multiple lines which would be (even subjectively!) more readable without doing that?

faidon added a comment.EditedDec 18 2018, 8:47 PM

I like black too but from but from https://black.readthedocs.io/en/stable/installation_and_usage.html it tied to having python 3.6 installed.

Black can be installed by running pip install black. It requires Python 3.6.0+ to run

With stretch shipping with 3.4 (what do the Mac OS X versions do?) it might be a bit too restrictive to require it.

This was a problem for me, so what I did was build backports of Py3.6/7 to stretch. We've decided to put them in a separate component in our apt repository so that people can install them on their own systems (among other reasons).

So what you can do now is:

echo "deb http://apt.wikimedia.org/wikimedia stretch-wikimedia component/pyall" >> /etc/apt/sources.list
apt update
apt install python3.7 pipsi
pipsi install --python /usr/bin/python3.7 black
export PATH="~/.local/bin:$PATH" # and maybe put this to your .bashrc
black --help

(Instead of or in addition to pips one can use pre-commit, which supposedely handles the differing Python version and venv etc., but I haven't tried it)

On my side I've done a test on the cumin codebase with black. The results are:

  • all the ignore comments for pylint or any other validation tool were misplaced (moved to the last line when splitting) and require to manually move them to the first line [one off]
  • it doesn't pass flake8:
    • E203 whitespace before ':' (this seems a bug on their side, it's for a list slice _ARGV[index + 1 :]

Do you have examples of these? I wonder whether these are architectural issues or (potentially transient) bugs that we need to report against upstream.

Re-reading their docs it's intended behaviour and apparently is flake8 at fault here. See https://black.readthedocs.io/en/stable/the_black_code_style.html#slices

    • E501 line too long (144 longest one)
  • it doesn't pass pylint (same thing for too long lines)

I'm not sure why you're hitting this, perhaps local misconfiguration? Black's default line length is 88, but is one of the few things that is configurable (I use -l 80). But yes, we'll need to align black's, flake8's and pylint's line length expectations :) Flake8 is actually mentioned in Black's README with a rationale and various mitigations (e.g. using Bugbear).

The problem with long lines is:

Black will try to respect that. However, sometimes it won’t be able to without breaking other rules. In those rare cases, auto-formatted code will exceed your allotted limit.

Source: https://black.readthedocs.io/en/stable/the_black_code_style.html#line-length

So for example it generates this diff (with -l 120 -N -S):

-    'ls_partial_success_ratio_re':
-        r"[4-6]0\.0% \([2-3]/5\) success ratio \(< 100\.0% threshold\) for command: 'ls -la /tmp/maybe'\. Aborting.",
-    'ls_partial_success_threshold_ratio':
-        "60.0% (3/5) success ratio (>= 50.0% threshold) for command: 'ls -la /tmp/maybe'.",
+    'ls_partial_success_ratio_re': r"[4-6]0\.0% \([2-3]/5\) success ratio \(< 100\.0% threshold\) for command: 'ls -la /tmp/maybe'\. Aborting.",
+    'ls_partial_success_threshold_ratio': "60.0% (3/5) success ratio (>= 50.0% threshold) for command: 'ls -la /tmp/maybe'.",

that goes over the limit.

I personally don't like the output of black as I find it much less readable and more vertically spread than the original. We live in a world with wide but short screens so splitting into multiple lines means that less context will be readable in the same screen and the eye has to track much more indentation levels.

Do you have an example of something that Black splits into multiple lines which would be (even subjectively!) more readable without doing that?

DISCLAIMER: those are totally subjective examples.

  1. One example for the vertical verbosity (the old one is much less readable here in Phabricator as it wrap lines :( ):
-    parser.add_argument('-m', '--mode', choices=(sync_mode, async_mode),
-                        help=('Execution mode, required when there are multiple COMMANDS to be executed. In sync mode, '
-                              'execute the first command on all hosts, then proceed with the next one only if '
-                              '-p/--success-percentage is reached. In async mode, execute on each host independently '
-                              'from each other, the list of commands, aborting the execution on any given host at the '
-                              'first command that fails.'))
+    parser.add_argument(
+        '-m',
+        '--mode',
+        choices=(sync_mode, async_mode),
+        help=(
+            'Execution mode, required when there are multiple COMMANDS to be executed. In sync mode, '
+            'execute the first command on all hosts, then proceed with the next one only if '
+            '-p/--success-percentage is reached. In async mode, execute on each host independently '
+            'from each other, the list of commands, aborting the execution on any given host at the '
+            'first command that fails.'
+        ),
+    )
  1. Another vertical verbosity example
-        self.pbar_ok = tqdm(desc='PASS', total=num_hosts, leave=True, unit='hosts', dynamic_ncols=True,
-                            bar_format=colorama.Fore.GREEN + self.bar_format, file=sys.stderr)
+        self.pbar_ok = tqdm(
+            desc='PASS',
+            total=num_hosts,
+            leave=True,
+            unit='hosts',
+            dynamic_ncols=True,
+            bar_format=colorama.Fore.GREEN + self.bar_format,
+            file=sys.stderr,
+        )
  1. Exception formatting
-                raise InvalidQueryError(("Unable to parse the query '{query}' with the global grammar and no "
-                                         "default backend is set:\n{error}").format(query=query_string, error=e))
+                raise InvalidQueryError(
+                    (
+                        "Unable to parse the query '{query}' with the global grammar and no "
+                        "default backend is set:\n{error}"
+                    ).format(query=query_string, error=e)
+                )
  1. Quick logging (where it doesn't remove the now unnecessary parentheses):
-            self.logger.debug(("Provided batch_size '%d' is greater than the number of hosts '%d'"
-                               ", using '%d' as value"), batch_size, hosts_size, hosts_size)
+            self.logger.debug(
+                ("Provided batch_size '%d' is greater than the number of hosts '%d'" ", using '%d' as value"),
+                batch_size,
+                hosts_size,
+                hosts_size,
+            )
  1. Multiline if conditions:
-        if value is not None and (not isinstance(value, float) or
-                                  not (0.0 <= value <= 1.0)):  # pylint: disable=superfluous-parens
-            raise WorkerError("success_threshold must be a float beween 0 and 1, got '{value_type}': {value}".format(
-                value_type=type(value), value=value))
+        if value is not None and (
+            not isinstance(value, float) or not (0.0 <= value <= 1.0)
+        ):  # pylint: disable=superfluous-parens
+            raise WorkerError(
+                "success_threshold must be a float beween 0 and 1, got '{value_type}': {value}".format(
+                    value_type=type(value), value=value
+                )
+            )
  1. Shrinks things that because of the context might be more readable in multilines, for example dependencies, so that are also more explicit with diff when modifying them:
-    'with-openstack': [
-        'keystoneauth1>=2.4.1',
-        'python-keystoneclient>=2.3.1',
-        'python-novaclient>=3.3.1',
-    ],
-
+    'with-openstack': ['keystoneauth1>=2.4.1', 'python-keystoneclient>=2.3.1', 'python-novaclient>=3.3.1'],

I've tried experimenting how CI integration would look like for repositories that choose to enforce code formatting, e.g. with tox and for example https://pypi.org/project/isort/ + https://pypi.org/project/black/.

[testenv:formattercheck]
skipsdist = True
skip_install = True
basepython = python3.6
deps =
 black
 isort
commands =
  isort --check --line-width 88 --skip-glob '.tox' --skip-glob '.eggs'
  black --check --line-length 88 --exclude '/(\.eggs|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|_build|buck-out|build|dist)/' .

Change 494259 had a related patch set uploaded (by Gilles; owner: Gilles):
[integration/config@master] Add CI for new performance/asoranking repo

https://gerrit.wikimedia.org/r/494259

jbond added a subscriber: jbond.Mar 13 2019, 2:53 PM
GTirloni removed a subscriber: GTirloni.Thu, Mar 21, 9:06 PM