Page MenuHomePhabricator

Decision request - Update python team best practices
Closed, ResolvedPublic

Description

Problem

The current best practices page https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding does not reflect the practices used in the team maintained repositories (ex. https://gitlab.wikimedia.org/repos/cloud/).

It seems also way more prescriptive that we might want it to be (ex. not explicitly giving space for different configs/special cases/trials).

Those are two different but very related things, that's why they are bundled in one decision request, if we need more space to focus on one or the other, I'll split the request in two.

Constraints and risks

  • Tool sprawl
  • Overall readability
  • Cognitive overhead (all the points are related)
  • Style wars

Decision record

Option B5 chosen.

https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T361804_Update_python_team_best_practices

Options

The choices are a combination of A/B and 1/2/3/4/5/...

For example A3 means "Make it prescriptive" and "flake8 + black + isort + 80col + pre-commit...".

While B2 means "Make it explicitly non-prescriptive" and "flake8 + black + 80col"

Prescriptiveness

Option A

Leave it prescriptive, start modifying the current linting rules to match the ones specified in the wiki page.

Pros:

  • Homogeneous settings and tooling

Cons:

  • Less space for exceptions/special cases/trials/etc.
Option B

Make it explicitly non-prescriptive, and don't enforce/actively change our linting checks to match the wiki page.

This might be done by adding some note like "this is a guideline, not a strict rule, use common sense when applying it"

Pros:

  • More space for exceptions/special cases/trials/etc.

Cons:

  • Might lead to less homogeneous settings and tooling

Specific tools and settings used

Option 1

Don't change the linting rules, don't change the prescriptiveness (if that is a word), this option is to do absolutely nothing on any side.

Pros:

  • Nothing to do on the wiki

Cons:

  • Nothing improves and practices and tools continue diverging with time
Option 2

Keep the current mentioned linting rules (flake8 + black + 80 column limit).

Pros:

  • All tools synced (when choosing A)

Cons:

  • Missing some useful tools/checks
  • Using some custom configuration
Option 3

Replace all our check with what the page says (flake8 + black + 80 column limit) and add some extra checkers:

  • pre-commit: for the linting/style checks runner
  • mypy: static type checking
  • isort: import sorting
  • trailing-whitespace
  • end-of-file-fixer
  • check-toml
  • check-yaml
  • check-shebang-scripts-are-executable
  • check-executables-have-shebangs
  • check-merge-conflict

Pros:

  • All tools synced (when choosing A)
  • Nice extra checks
  • No custom configuration

Cons:

  • None
Option 4

Use the check the current page says, but remove the column limit (use the default for black/isort) and add the extra checks:

  • pre-commit: for the linting/style checks runner
  • mypy: static type checking
  • isort: import sorting
  • trailing-whitespace
  • end-of-file-fixer
  • check-toml
  • check-yaml
  • check-shebang-scripts-are-executable
  • check-executables-have-shebangs
  • check-merge-conflict

Pros:

  • All tools synced (when choosing A)
  • Nice extra checks
  • No custom configuration

Cons:

  • None
Option 5

Use a fully new set of checkers (using more modern checker ruff to replace isort, flake8 and black, and using default column limits). This means using:

  • pre-commit: for the linting/style checks runner
  • ruff: replaces isort, flake8 and black (and does some static type checking too, not a replacement for mypy)
  • mypy: static type checking
  • trailing-whitespace
  • end-of-file-fixer
  • check-toml
  • check-yaml
  • check-shebang-scripts-are-executable
  • check-executables-have-shebangs
  • check-merge-conflict

Pros:

  • All tools synced (when choosing A)
  • Nice extra checks
  • No custom configuration
  • Modern tooling

Cons:

  • None
Option N

Add more options here

Event Timeline

dcaro renamed this task from Decision request template - Update python team best practices to Decision request - Update python team best practices.Apr 4 2024, 9:34 AM
dcaro triaged this task as Medium priority.
dcaro updated the task description. (Show Details)

My preference would be to continue with what I perceive is the status quo, which I would describe as something like this:

We are not following any explicit best practices, but have been converging to some workflows such as using pre-commit with a specific set of hooks, using tox as a general task runner in Python repositories, automations like version bump scripts, etc. This has happened organically for the most part, and taken together with the migration to gitlab, lima-kilo, and better ci/cd, has lead to a (rather significant, if you ask me) increase in developer experience over the last year or so.

I don't personally feel like there is significant pushback against introducing new tools or workflows. I've been toying with the idea lately of dropping flakes8/isort/black in favor of ruff, and if I haven't proposed it yet it's more because there are too many other items higher up on my to-do list rather than perceiving the culture to be change resistant. It will always be the case that the burden of convincing others to try something new vs. continuing with the status quo will be on the instigator of the change, but I think that's more a general fact of life than of our team practices.

To sum it up, I don't believe option 1 leads to no improvements and practices and tools that continue diverging with time, because experience has shown otherwise. Maybe our repositories are not perfectly homogenous and there's a commit hook or something else missing here and there, but it would be enough to create a task about it and just fix it up, or submit ad-hoc improvements to pre-commit/utils/docs each time we see something missing while doing other work on the repos, which is also something I think we are already doing spontaneously most of the time.

I think this means I'm most aligned with options B & 1, because I believe this approach ultimately leads (close enough) to the outcome of option 5.

I think this means I'm most aligned with options B & 1, because I believe this approach ultimately leads (close enough) to the outcome of option 5.

So, do you want option 5 or 1?
What I understand from this is that you want B5, and to get it, you vote for B1. I would encourage you to vote what you really want to happen, not what you think you should vote to make what you want happen (if that makes sense).

I'm partial to B5, but B* would be good enough, as that's what we have been doing essentially (adding the B just makes it explicit that we are lax with the practices, something I think currently can be interpreted differently).

I want option 1 because it's the only one that is not prescriptive. To me it's a coincidence that option 5 aligns with what I think would happen by going with B1. That said, if the majority vote is B5, I'm fine with that.

I want option 1 because it's the only one that is not prescriptive. To me it's a coincidence that option 5 aligns with what I think would happen by going with B1. That said, if the majority vote is B5, I'm fine with that.

I think there might be something not clear in the options xd

Anything with 'B' in front is non-prescriptive (A -> prescriptive, B -> non-prescriptive), the number is just the checks to do (if paired with A then in a prescriptive manner, if paired with B in a non-prescriptive one).

Can you elaborate on why you see B1 as the only one non-prescriptive? (so I can modify the wording/etc.)

Note that the fact that 1 has less checks, does not mean it's not prescriptive, just that when strictly applied you would actually have less checks (ex. remove mypy from the tests run).

I think there might be something not clear in the options xd

Anything with 'B' in front is non-prescriptive (A -> prescriptive, B -> non-prescriptive), the number is just the checks to do (if paired with A then in a prescriptive manner, if paired with B in a non-prescriptive one).

Can you elaborate on why you see B1 as the only one non-prescriptive? (so I can modify the wording/etc.)

Indeed, I think that B is incompatible with any option that says "let's agree on doing this specific list of things". So (to me) B implies 1. In practice, this combination would mean continuing with the status quo and editing https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding to be descriptive of actual practices, and keep updating it as these evolve.

Indeed, I think that B is incompatible with any option that says "let's agree on doing this specific list of things". So (to me) B implies 1. In practice, this combination would mean continuing with the status quo and editing https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/Python_coding to be descriptive of actual practices, and keep updating it as these evolve.

I see, yes, that's not what I wanted to convey, the idea is:

  • Anything with B*: "let's write down the things we currently do, and put a note saying that that's the recommended setup", and then the number is "the things that we currently do".
  • Anything with A*: "let's write down the setup we want everywhere and start applying it everywhere", and then the number is "the setup we want".

So some examples would mean:

  • B1: update the current wiki page to say it's the 'recommended practices', not to be enforced, (and keeping that we use only black with 80 col limit)
  • B5: the above plus "mention that the current practices are using pre-commit with the following checks enabled ...<ruff+mypy+...>"
  • A1: don't update the current wiki page, and start changing our linting checks in the repos to only black + 80 col limit
  • A5: modify the wiki page to mention the pre-commit+extra checks and start changing our linting in the repos to use that exact list+options

I don't see that we have to be explicit in the wiki page about what we currently do, other than recommending "follow the conventions in whatever repository you are contributing to", "dare propose changes as you see fit", and if you start a new repository "use common sense and consult with whomever else will be the main contributors".

Other than the wiki style guide being out of sync with actual practices, I don't see any problem that needs solving.

I don't see that we have to be explicit in the wiki page about what we currently do, other than recommending "follow the conventions in whatever repository you are contributing to", "dare propose changes as you see fit", and if you start a new repository "use common sense and consult with whomever else will be the main contributors".

Other than the wiki style guide being out of sync with actual practices, I don't see any problem that needs solving.

Fair enough, I do think that it's good to keep the page somewhat updated, so if you are creating a new repo or similar you have a reference to look up for (as it is, I would probably re-invent whatever checks I used in some other random repo instead).

I do think that it's good to keep the page somewhat updated, so if you are creating a new repo or similar you have a reference to look up for (as it is, I would probably re-invent whatever checks I used in some other random repo instead).

Maybe we could designate one repo for each programming language (now that some are in Go) as "the canonical example" and point to it? Then, if we want to experiment with changes, do so in other repos until we are fairly sure we have reached as new status quo?

I do think that it's good to keep the page somewhat updated, so if you are creating a new repo or similar you have a reference to look up for (as it is, I would probably re-invent whatever checks I used in some other random repo instead).

Maybe we could designate one repo for each programming language (now that some are in Go) as "the canonical example" and point to it? Then, if we want to experiment with changes, do so in other repos until we are fairly sure we have reached as new status quo?

Sounds good to me (as in the page saying, here is the latest setup that we use for easy copy-paste -> link to repo).

I vote B5, with the addition of a link to a "canonical example". I would still write a short summary in the wiki of what is used in the "canonical example", but we could keep the wiki page relatively short, and point to the repo for the details.

dcaro claimed this task.

Unless I understood wrongly, we seem to agree on going with option B5 with an example repository, I'll do that and close this decision task (please reply here if I'm understanding wrongly).

Thanks everyone! \o/

I wrote https://wikitech.wikimedia.org/wiki/User:David_Caro/Coding_proposal as a generic coding page, as the pre-commit checks and such include golang, shell, yaml and others, any objections for me to move https://wikitech.wikimedia.org/w/index.php?title=Wikimedia_Cloud_Services_team/Python_coding -> https://wikitech.wikimedia.org/w/index.php?title=Wikimedia_Cloud_Services_team/Coding and using what's there?

Feel free to edit/add comments in the talk page, I'll do the above next week if nobody has any opinion.

I'm late to this, but I also agree with B5. Do we need another rule regarding what to do with existing code when applying new checks or standards and would result in a reformat?

I'm late to this, but I also agree with B5. Do we need another rule regarding what to do with existing code when applying new checks or standards and would result in a reformat?

I would say no, specially as this decision does not force anyone to change anything.

A rule of the thumb though would be to separate those changes in it's own MR (if the changes are big), or in a first commit inside the MR (if they are small).

I'm late to this, but I also agree with B5. Do we need another rule regarding what to do with existing code when applying new checks or standards and would result in a reformat?

I would say no, specially as this decision does not force anyone to change anything.

A rule of the thumb though would be to separate those changes in it's own MR (if the changes are big), or in a first commit inside the MR (if they are small).

That does not mean that we should not though, if you think it's something we should have a discussion about, feel free to open a new decision request

I'm late to this, but I also agree with B5. Do we need another rule regarding what to do with existing code when applying new checks or standards and would result in a reformat?

I would say no, specially as this decision does not force anyone to change anything.

A rule of the thumb though would be to separate those changes in it's own MR (if the changes are big), or in a first commit inside the MR (if they are small).

Yep, as long as we encourage that format changes are in their own (ideally repo-wide) patch we shouldn't need any other guidance.