Decision request: python source code line length
Closed, Resolved (Public)

Description

Problem

For Python code bases, we currently have a standard of 80 characters per line, documented at https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding

The 80 characters per line is something that was introduced decades ago when screens were small.
Nowadays, virtually nobody codes or reads code on a setup such as:

  • can't fit more than 80 characters
  • can't zoom or scroll horizontally to overcome such limitation

Some other major projects have already abandoned this limit.

The Python PEP 8 style guide recommends 79, but explicitly says that teams can raise the limit (up to 99 characters) if they agree on it.

Moreover, we already have some divergence among our own projects' source trees.

This decision request is to review such limit.

NOTE: The truth is that the WMF has organically standardized on tox + black. The toolchain is not under discussion (today at least).

Constraints and risks

  • N/A

Decision record

In progress.

Options

Option 1

Leave it like it is now. 80 characters.

Pros:

  • TBD.

Cons:

  • TBD.

Option 2

Introduce 100 characters line limit.

Pros:

  • TBD.

Cons:

  • TBD.

Option 3

Introduce 120 characters line limit.

Pros:

  • TBD.

Cons:

  • TBD.

Event Timeline

I've been using Black's default line length of 88 characters since I started using Black a few years ago. I quite like their rationale for this number:

You probably noticed the peculiar default line length. Black defaults to 88 characters per line, which happens to be 10% over 80. This number was found to produce significantly shorter files than sticking with 80 (the most popular), or even 79 (used by the standard library). In general, 90-ish seems like the wise choice.

If you’re paid by the line of code you write, you can pass --line-length with a lower number. Black will try to respect that. However, sometimes it won’t be able to without breaking other rules. In those rare cases, auto-formatted code will exceed your allotted limit.

You can also increase it, but remember that people with sight disabilities find it harder to work with line lengths exceeding 100 characters. It also adversely affects side-by-side diff review on typical screen resolutions. Long lines also make it harder to present code neatly in documentation or talk slides.

Source: https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html

I do not have strong feelings about this. I would be reluctant to introduce any new rule that leaves a lot of our existing code non-compliant, but that's only a danger if we decide on a line length that's shorter than the standard in existing repos.
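
One way to gauge that risk before settling on a number is to count the lines in an existing repo that would exceed each candidate limit. The following is a minimal sketch, assuming a plain character count per line, which only approximates what a linter would report. The helper name `lines_over_limit` and the sample text are hypothetical.

```python
def lines_over_limit(source: str, limit: int) -> list[tuple[int, int]]:
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# Hypothetical sample: a short line followed by a 93-character line.
sample = "x = 1\n" + "y = " + "1 + " * 22 + "1\n"

for limit in (80, 88, 100):
    offenders = lines_over_limit(sample, limit)
    print(f"limit {limit}: {len(offenders)} line(s) too long")
```

Running something like this over each repo would show how much existing code a given limit leaves non-compliant.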

I personally like the 80-character limit. It's true that screens are now much larger than they used to be, but I find shorter lines much easier to read, and I really like being able to open two files side-by-side. On my 16-inch laptop at my current font size I can fit about 200 columns, and on my previous smaller laptop I could fit about 170, so I find that 80 is still a reasonable setting even on modern computers if you frequently open two files side-by-side.

So my personal preference is to keep the limit at 80, but I'm also happy to increase it to 88 or 100 if other people find it useful.
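
The side-by-side arithmetic above can be sketched as a small calculation. This assumes a one-column separator between panes and ignores line-number gutters or other editor chrome. The function name `panes_that_fit` and the `gutter` parameter are hypothetical.

```python
def panes_that_fit(total_columns: int, line_limit: int, gutter: int = 1) -> int:
    """Number of side-by-side panes, each `line_limit` columns wide, that fit
    in `total_columns` when panes are separated by `gutter` columns."""
    # n panes need n * line_limit + (n - 1) * gutter columns in total.
    return (total_columns + gutter) // (line_limit + gutter)


for width in (170, 200):
    for limit in (80, 88, 100):
        print(f"{width} columns, limit {limit}: {panes_that_fit(width, limit)} pane(s)")
```

Under these assumptions, two 80-column panes fit at both 170 and 200 columns, two 88-column panes fit at 200 but not at 170, and two 100-column panes fit at neither width.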

I don't have strong opinions and don't have the interest/energy to argue about this very much, but:

  • I don't want to think about this when writing code, and instead want a formatter that does it automatically for me (black does this currently and I think this is not about changing that, correct?)
  • If I had to choose some value, I'd go with 88 or 100. 80 is a bit too short and it forces splitting many lines that I'd personally not split, and longer lines are hard to use when having two files or terminal windows open at the same time.

The only strong opinion I have on this would be to utilize a code formatter, and very strongly consider the default settings. I used to be annoyed by oddly split lines just over the limit, and by autoformatting changing my logically laid out code; but in the end consistently formatted code was more useful than my attempts at curation. My time was better spent writing code and not formatting.

The truth is that the WMF has organically standardized on tox + black for auto formatting. I don't think that part is controversial. Or at least, it is not my intention that we discuss it in this ticket.

This team explicitly standardized on black and 80 char line lengths in the past as documented on https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding. We did that prior to any other WMF teams to my knowledge.

The 80 characters per line is something that was introduced decades ago when screens were small.
Nowadays, virtually nobody codes or reads code on a setup such as:

  • can't fit more than 80 characters
  • can't zoom or scroll horizontally to overcome such limitation

I guess that I should be ashamed to say that I deliberately choose to develop software using an 80-character wide text terminal and vim with the framing given for this change. Rather than using ageist language to belittle me, could someone please actually make a reasoned argument for longer line lengths.

The only strong opinion I have on this subject is to make sure that whichever configuration is chosen, it's set up in a way that running the tools by themselves (e.g. black, isort, or whatever tools we use) picks up the configuration without the need for extra tweaks.

Then as a data point, my setup allows me to comfortably work with two ~125-column files side-by-side, as I use a big screen most of the time. I avoid using my laptop regularly as the screen is too small anyhow (I end up trying to play with zooms/resolutions and never get a comfortable setup), so I can get by with whatever option is decided.

That said, I weakly vote for 100, but would not be disturbed by using any other line length.

Note again that this has to go paired with setting up the config in a place that the tooling will find by default and without wrappers (that allows me to easily set it up on any editor).

I guess that I should be ashamed to say that I deliberately choose to develop software using an 80-character wide text terminal and vim with the framing given for this change. Rather than using ageist language to belittle me, could someone please actually make a reasoned argument for longer line lengths.

You ask for reasoned arguments to counter a deliberate choice of yours. A choice that doesn't seem reasoned either.
I don't think that kind of debate would be a good use of our time. I'm sorry you felt discriminated against; that wasn't my intention.

I will, however, document a few divergences from the established default of 80:

(...I got bored and stopped searching)

I interpret the divergences as fatigue from the 80 limit.

I guess that I should be ashamed to say that I deliberately choose to develop software using an 80-character wide text terminal and vim with the framing given for this change. Rather than using ageist language to belittle me, could someone please actually make a reasoned argument for longer line lengths.

You ask for reasoned arguments to counter a deliberate choice of yours. A choice that doesn't seem reasoned either.

https://peps.python.org/pep-0008/#maximum-line-length has a number of reasons, the strongest of which may be "The default wrapping in most tools disrupts the visual structure of the code, making it more difficult to understand. The limits are chosen to avoid wrapping in editors with the window width set to 80, even if the tool places a marker glyph in the final column when wrapping lines. Some web based tools may not offer dynamic line wrapping at all."

If the question is about the reasons I deliberately choose to work within an 80-column window, it is a combination of familiarity and ease of moving from device to device. Nearly every device I have ever used that is capable of ssh is also capable of 80-column output.

I will, however, document a few divergences from the established default of 80:

Flake8 here is configured to allow 120 char lines, but black is actually configured for 80 char lines.

Here black is actually configured for 79 columns.

Did you deliberately ignore the prior consensus at https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding when you configured these two repos or did we not sufficiently socialize the prior decision?

  • toolforge grid-webservices (using 100) -- what's this even?

Some tool that Taavi made which is not bound by any agreements of the Wikimedia Cloud Services team.

This is more stuff made by Taavi and not by the WMCS team.

(...I got bored and stopped searching)

I interpret the divergences as fatigue from the 80 limit.

Andrew claimed this task.

I'm learning lots of good lessons about coding (and especially about me + coding) watching Jenna's python skills develop. When she complains about weird coding style in projects at her work, I always say the same thing:

"The only right coding style is the style that the project is already using. When you start a new project then you can use whatever style you want."

I've thought a lot about that first sentence over the years (largely when encountering a huge, baffling diff in git history or suffering through a holy war that erupts from such a diff) but not a lot about the second. Today, I suddenly think the second statement is equally important. It's FUN to feel ownership over a project and make it just the way you want it. Despite all our concerns about lottery factor, we do tend to feel individual ownership over certain projects. It's fun, and satisfying, to feel ownership! Despite the inherent risks around individual ownership, it's also critical to not spoil anybody's fun.

So, I'm using my team-lead authority to impose the following coding rules on our team:

0 We continue to develop default internal coding styles, but they are defaults rather than universal rules. Where there is no existing consensus (or the cost of consensus is high, as today) there is no default specified.

1 History wins. For existing projects and existing repos, the existing style stays. If the project is not internally consistent, go with the majority (the majority of existing code that is, not the majority of people.) Retroactive total reformat of existing projects should happen rarely if ever.

2 New projects, or projects that are young enough to have no obvious existing conventions, adopt whatever style their owner/creator/primary contributor(s) want(s).

3 Any deviations from the rule 0 defaults (or specification of an undefined convention like line length) are declared in prose in a HACKING file at the top level of the repo*. HACKING should also ideally include whatever explicit rules and examples are needed to configure flake8, black, or whatever tools are used for validation. HACKING may also include advocacy about why these standards are the best, but only in a separate section from the actual description of the standards.*

4 When you're working in a repo with a coding style that strikes you as obviously wrong, you are encouraged to sigh loudly, express dismay, and even complain on IRC, but ONLY if you then take a moment to acknowledge that others are doing the same about your repos. If we're lucky, our names will live on as curses on the lips of future devs long after we're gone.

*This resembles the current OpenStack practice. I personally tried to unify coding standards across OpenStack projects 20 releases ago, failed, and am trying to make use of some lessons I learned then.