Page MenuHomePhabricator

Stop using Differential for code review
Open, LowPublic

Description

IMHO the current state is the worst possible outcome. In addition to what we already had for most (gerrit) or some (github) of our repos, we now have a third place for some others (some tools on toolforge, Scap and a few more host their code on diffusion and do code review on differential).
Especially from a new contributors perspective, this is bad. Depending on which project I want to contribute to I now have to familiarise myself with github, gerrit and/or differential. Personally, when I started contributing here about a year ago, I didn't know about any of those and wouldn't have cared which one to learn. But by now, I had to dig into differential as well as gerrit, because I wanted to contribute to different projects over time.
This is an acceptable state for a (gerrit->differential) transition period. It isn't when the transition was declined. One can argue a lot about whether gerrit or differential fits our purpose better (which was done) - but I hardly can imagine how keeping both isn't much worse than having either of them.
tl;dr: The migration gerrit->differential being declined and gerrit being supposed to stay around, we should consider to shutdown differential and migrate existing projects from there to gerrit.

(Originally posted as T119908#4096033)

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
HappyDog added a comment.EditedApr 1 2018, 10:29 PM

I think this is a great idea. There should be clarity and consistency in how/where development takes place.

For new contributors it is difficult enough to navigate the repository fragmentation that exists in the MediaWiki ecosphere. Adding toolset fragmentation on top of that is verging on the Kafka-esque.

Differential is used by some projects.

See D1017 for a recent revision.

MediaWiki development should occur on a centralized place like Gerrit, but if other projects want to use Phabricator, there isn't really a lot of inconvenience.

Furthermore, Differential allows to create diffs not linked to a repository, so you can review a patch for an upstream project or for anything else.

Depends what you mean by 'different projects'. If they include core extensions, or tightly-coupled things like Parsoid, then it adds up to a lot of inconvenience!

(Actually, not so much inconvenience - more confusion, exsasperation and demotivation.)

If you mean internal WMF dev-ops stuff that only WMF employees care about, then perhaps that matters less (though it still seems like avoidable fragmentation which unnecessarily raises the barrier when moving from volunteer → employee).

I think this should be discussed in a wider scope of what WMF and MW should support, For example we currently have WMF teams using GitHub as their canonical source and gerrit as a mirror. I'm not against having support for multiple development environments, But we should have standards on "What should be in A compared to B" before we turn stuff off and/or on.

Dalba added a subscriber: Dalba.Apr 2 2018, 1:32 AM
Izno added a subscriber: Izno.Apr 2 2018, 3:01 AM

I agree. We should have development centralized in one place, at least for Wikimedia extensions, tools and dependencies. Current status is bad as EddieGP mentions.

demon added a subscriber: demon.Apr 2 2018, 3:07 PM

I'm fine with moving Scap back to Gerrit tbh.

mmodell added a subscriber: mmodell.Apr 2 2018, 6:26 PM
TerraCodes added a comment.EditedApr 3 2018, 7:52 AM

Wouldn't this break links like D1017 where they link to differential?

Ltrlg added a subscriber: Ltrlg.Apr 3 2018, 11:15 AM

I might have expressed myself badly by saying "disable". The proposal here is to stop reviewing code on differential. It isn't to technically disable differential by changing phabricators settings. Whether to keep differential technically enabled for legacy reasons is an detail I don't care about, at least not until a decision was made whether to continue or stop differential use at all.

greg changed the task status from Open to Stalled.Apr 3 2018, 6:17 PM
greg triaged this task as Low priority.
greg added subscribers: bd808, greg.

Brain dump of facts/where things are:

  • There is no push to get more people on Differential. That stopped long ago.
  • There is no plan to forcibly migrate users off of Differential to Gerrit.
  • Keeping Differential enabled and usable by the few people who use it is not a major maintenance burden for now. That may change as our CI infrastructure goes through evolutions.
  • There are a few projects (notably scap and blubber, which are owned by RelEng) planning to move to Gerrit in the short-term.
  • We need to take into account Toolforge users who want one-click creation of a repository for their tools. This is currently available for them via the Striker tool (cc @bd808) and creates a repo in Diffusion (with which they could do code-review via Differential). There are a number of such repos now but I'm unsure of how used the feature is.

As such, for now, we don't need to make a decision on this task with any urgency. That may change in the future (see CI above) but doing so will require a bit more thinking, planning, and actual work to do it right (see Toolforge users). That means it will need to be prioritized appropriately (and to be honest, right now RelEng already has an over-full plate of items to take on that we've already committed to for the next fiscal year annual plan).

bd808 added a comment.Apr 3 2018, 9:20 PM
  • We need to take into account Toolforge users who want one-click creation of a repository for their tools. This is currently available for them via the Striker tool (cc @bd808) and creates a repo in Diffusion (with which they could do code-review via Differential). There are a number of such repos now but I'm unsure of how used the feature is.

The backing database in Striker shows 145 repos that have been created for 138 tool accounts via the self-service tool. It would be theoretically possible to switch the app to generate Gerrit repos, but I do not have an estimate about how much work this would take or when it could be done.

We should have development centralized in one place, at least for Wikimedia extensions, tools and dependencies.

I'm not sure that I agree with this. In the Wikimedia movement, "tools" means a really broad range of things. Trying to fit all work by all people into a single workflow seems counterproductive to a goal of mine which is to make technical contributions easier. There is a lot to be said in favor of homogeneous systems, but I am not personally convinced that those benefits outweigh the loss of freedom that a single system implies. I would however love to see a better system for mirroring/aggregating git repos to a central place where they can be browsed and searched. That's an entirely different problem however.

I'm not sure that I agree with this. In the Wikimedia movement, "tools" means a really broad range of things. Trying to fit all work by all people into a single workflow seems counterproductive to a goal of mine which is to make technical contributions easier. There is a lot to be said in favor of homogeneous systems, but I am not personally convinced that those benefits outweigh the loss of freedom that a single system implies. I would however love to see a better system for mirroring/aggregating git repos to a central place where they can be browsed and searched. That's an entirely different problem however.

Well said. I too think that freedom should outweigh consistency. Strictly dictating the tools that developers use will only serve to alienate potential contributors rather than inviting them.

^^, we should keep differential open for users who want to use it.

Trying to fit all work by all people into a single workflow seems counterproductive to a goal of mine which is to make technical contributions easier

and

Strictly dictating the tools that developers use will only serve to alienate potential contributors rather than inviting them.

Interestingly, that's exactly the reason I was putting forward to using a single consistent tool-chain (lowering the barrier to entry). Having a single workflow to learn is surely easier than having several?

Do I take it from this that in practice most users work in silos, i.e. they work on one project, with an optimised tool-chain for that work, and tend not to get involved in other areas? If so, is that considered a good thing, or a bad thing? Do people stay in their silos because they don't want to have to learn another tool or is it that the nature of the work is so different that you wouldn't expect much cross-over anyway?

Not trying to be antagonsitic - just trying to understand what's going on, as it appears that we have the same aim and yet we have come to opposite conclusions!

One advantage of using all the same thing is that I don't have to try to figure out where something is because I know which site it's on.

In an ideal world, it'd be possible to contribute to scap through github, send patches for restbase to gerrit, and upload a puppet patch to differential (and exchange any of $project or $tool in that sentence with whatever you like best and have it still work). This isn't that world. If I want to contribute to puppet, I'm enforced to learn and use gerrit. If I want to upload something for scap, I'm enforced to learn and use differential. If I want to change restbase, I'm enforced to have a github account and learn githubs workflow.
So we are always enforcing tools upon developers, the difference being whether the maintainers or casual contributors are the ones who are forced. Right now, we tend to give the freedom of choice to the maintainers and force the toolchain upon the casual developers. To allow use of differential exactly fits that scheme: It gives another option to choose from to maintainers (good) and is yet another tool casual contributers might be forced to learn (bad).
I still see your point though. Completely dropping differential might be too drastic of a change in the direction of forcing the tools upon the maintainers.

I think the current state is demotivating for new contributors. But HappyDog had a very good question there: Maybe most new contributors want to work on a single project and stay with that one, so they will only have to get familar with whatever that project uses anyway.
This whole proposal is indeed based on my (unconscious at the time) assumption that a lot of new developers want try which projects they like best or work on different projects over time and thus experiece the differing toolsets used as a barrier. That's what I experienced, but maybe the vast majority of new developers doesn't share that experience (but then again, maybe they do - I'm really unsure about that one right now).

The optimal path forward really would be to find a system that allows to contribute to all projects through gerrit, without enforcing the main contributors of any project to abandon differential. I have a hard time coming up with a solution for that though. I thought about having everything mirrored to gerrit and having the maintainer review contributions of casual contributors there. When the change is ready for merge the maintainer would abandon it in gerrit and push it to the primary repo, whereever that is. However, this doesn't integrate with CI at all, still forces the maintainer to use gerrit to review casual contributors patches and creates additional work for the maintainer, so I don't really like that either.

bd808 added a comment.Apr 3 2018, 11:28 PM

Do I take it from this that in practice most users work in silos, i.e. they work on one project, with an optimised tool-chain for that work, and tend not to get involved in other areas? If so, is that considered a good thing, or a bad thing? Do people stay in their silos because they don't want to have to learn another tool or is it that the nature of the work is so different that you wouldn't expect much cross-over anyway?

Not trying to be antagonsitic - just trying to understand what's going on, as it appears that we have the same aim and yet we have come to opposite conclusions!

The space of "all the things" is really, really big.

This isn't even attempting to count all of the volunteer developed tools that live in other github orgs, bitbucket, and other version control systems. I am not at all against an attempt to reasonably consolidate "official" VCS hosting options. I am against a broad interpretation of the pull quote I took earlier from T191182#4096637: We should have development centralized in one place, at least for Wikimedia extensions, tools and dependencies.. I very much consider volunteer projects in the form of bots, webservices, and gadgets to be "Wikimedia tools". I do not think there is a good reason to try and force everything into Gerrit (or diffusion, or github, or gitlab, or ...).

There are a very small number of "production" software projects that are hosted in Diffusion as far as I know: scap, blubber, iegreview, maybe a few others. I have no objection to moving those to Gerrit if the primary maintainers are ok with that. Trying to force any project to move to Gerrit from a system that works for them seems like a less than useful endeavor.

demon added a comment.Apr 4 2018, 3:07 AM

Remember, Differential != Diffusion. There's a pretty obvious tail when it comes to Differential. I ran the numbers the other day and it was Scap, Blubber...then quickly drops off. Most repos have no differential revisions.

Re Striker stuff....If authors are just pushing their code directly, that's cool too. I see no reason to force people into workflows. We should decline this task outright.

The Github number is too high. There's a number of disabled repositories on Gerrit that we never cleaned up on Github I'd wager.

I think Scap and Blubber belong on Gerrit as they'd be more likely to receive causal contribution from other deployers there.

demon closed this task as Declined.Apr 4 2018, 3:07 AM

It's a huge PITA to maintain correct developer docs when an org has no canonical places. "Code review might happen on Gerrit but for some projects on GitHub but for some projects on Differential - good luck finding your way." I'd love to see less systems used for similar purposes instead of having added another one to the mix.

"Code review might happen on Gerrit but for some projects on GitHub but for some projects on Differential - good luck finding your way."

I very much agree with that. Freedom vs easing contributing. Your choice I guess, but you won't increase the contributing rate if you keep developers running across different code-review platforms. Wikimedia extensions and MediaWiki tools and dependencies should all be in one place (nb: tools != Toolforge; I don't care where tool people host their code).

(nb: tools != Toolforge; I don't care where tool people host their code).

Why are toolforge tools different in this respect? Only because they aren't mediawiki? Most of them wouldn't be very useful without mediawiki, they are a part of the ecosystem.

(Note that I'm just playing devil's advocate here, really)

MarcoAurelio added a comment.EditedApr 4 2018, 10:11 AM

Toolforge tools are accesories to MediaWiki and coding from non-maintainers is less frequent than mediawiki and its extensions. I also feel Toolforge tools are somewhat team managed. So if you want to join the team, you adapt to their procedures. But talking about MediaWiki and its extensions I think we will very much benefit from uniformity and standardization wrt code-review as much as possible, including using a single platform (be it Gerrit, GitHub or Differential; the point is using one place). Of course, if there is a strong reason why an specific thing is better managed elsewhere we should not enforce uniformity there; but current scenario looks bad to me (and to other developers as well apparently). That said, I am not extremelly concerned about this. My main area of contributing is MediaWiki and extensions and those all happen in one place so for me everything is okay. I was thinking on newcomers mainly, willing to contribute and not knowing where their contributions should go. Thanks.

WRT to numbers: Gerrit is our main repository, Diffusion repositories are mostly mirrors (query those not mirroring and the number would be lower I think) and as for GitHub, they're all mirrors except some of them, and the numbers are high because in the past IIRC everything was mirrored there (we should query which ones ain't mirrors either).

Regards.

HappyDog added a comment.EditedApr 4 2018, 4:27 PM

The way I see it, there are three areas of interest (which overlap slightly):

  1. You are interested in contributing to the MediaWiki software. This includes the core, the bundled or widely used extensions, skins, etc. and anything relating to the process of developing MediaWiki.
  2. You are interested in contributing to one or more of the WMF projects on a technical level. This includes the WMF-specific extensions, server ops, internal tools, etc. and may also cover certain aspects of #1.
  3. You are a user of MediaWiki and have developed tools, extensions, etc. to satisfy your own needs (or those of a client/friend/etc.) and are making those tools available to other people.

I don't think we care what tools are used for category #3. However, within category #1 and category #2, it would be helpful if you only had one suite of tools to learn. And as category #2 overlaps with category #1, it would be helpful if this suite were the same for both categories.

We should have development centralized in one place, at least for Wikimedia extensions, tools and dependencies.

I'm not sure that I agree with this. In the Wikimedia movement, "tools" means a really broad range of things.

and

Why are toolforge tools different in this respect? Only because they aren't mediawiki? Most of them wouldn't be very useful without mediawiki, they are a part of the ecosystem.

I hope this clarifies the distinction, in my mind at least, and clarifies why 'Wikimedia extensions, tools and dependencies' != 'wikimedia movement' (which to me covers all three categories).

Aklapper reopened this task as Open.Apr 6 2018, 1:01 PM

I explained in T191182#4103647 why I think this is wrong. So I am reopening this task. If Differential has no future in Wikimedia development then projects should get migrated and it should be turned off. Leaving yet another lingering system around just adds to technical debt, confusion, fragmentation.

I've received a pull request for a tool forge tool on GitHub. Honestly, I forgot this tool source code were there and I suspect it was there because Gerrit doesn't have by repo deploy keys. So I concur if fragmentation venues can be confusing for maintainers, it can really be annoying for contributors.

demon removed a subscriber: demon.Apr 11 2018, 6:46 PM

If you mean internal WMF dev-ops stuff that only WMF employees care about, then perhaps that matters less

Not really, it still matters for us as well. I also would like to have to use only 1 system and not multiple ones. If something is hosted on differential that makes it less likely for me to want to upload a quick typo-fix kind of patch.

I explained in T191182#4103647 why I think this is wrong. So I am reopening this task. If Differential has no future in Wikimedia development then projects should get migrated and it should be turned off. Leaving yet another lingering system around just adds to technical debt, confusion, fragmentation.

Would turning off Differential loose the content it already has and break links to it?

I explained in T191182#4103647 why I think this is wrong. So I am reopening this task. If Differential has no future in Wikimedia development then projects should get migrated and it should be turned off. Leaving yet another lingering system around just adds to technical debt, confusion, fragmentation.

Would turning off Differential loose the content it already has and break links to it?

Yes, "turning off" in the sense of clicking "Uninstall" in the settings would break links and hide the content (no more publicly accessable type of lost - but not deleted). I agree that doing so is a bad idea, and I assume we wouldn't do that. The proposal here is to stop using differential. There's probably other ways (maybe one can revoke the right of creating new diffs?) to achieve that without breaking what's already there.

HappyDog renamed this task from Consider disabling differential to Stop using Differential for code review.Jun 13 2018, 8:28 AM

The proposal here is to stop using differential.

This has come up a couple of times - I've updated the ticket title, to hopefully avoid further confusion.

The monthly Phab summary emails on wikitech-l@ show that we've had {20,23,19,22,19,28,29,30,19,26,24,24} Differential users in the last 12 months.
I'm clueless how to get a complete list of projects using Differential though.

I guess SELECT r.name FROM phabricator_repository.repository r JOIN phabricator_differential.differential_revision d WHERE r.phid = d.repositoryPHID AND FROM_UNIXTIME(d.dateModified,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m'); should give us a list of repos that used Differential in the last calendar month, if anyone feels like running that.

mysql:phuser@m3-slave.eqiad.wmnet [(none)]> SELECT r.name FROM phabricator_repository.repository r JOIN phabricator_differential.differential_revision d WHERE r.phid = d.repositoryPHID AND FROM_UNIXTIME(d.dateModified,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m');
+------------------------+
| name                   |
+------------------------+
| Differential Testing   |
| Scap                   |
| tool-quickstatements   |
| Scap                   |
| tool-quickstatements   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| Scap                   |
| phabricator-extensions |
| tool-quickstatements   |
| Scap                   |
| Blubber                |
| tool-quickstatements   |
| Blubber                |
| phabricator-extensions |
| Blubber                |
| Blubber                |
| Blubber                |
| Blubber                |
| Blubber                |
| Blubber                |
| wikimedia-iegreview    |
| Blubber                |
+------------------------+
34 rows in set (0.02 sec)

As for Scap there's T191373: Move Scap development from Differential to Gerrit.

As for the others, I haven't looked if there's a proposed migration task.

@Dzahn could you re-run the query to see if there are any other repos (excepting Toolforge ones) we've missed using Differential?

@Legoktm here you go

MariaDB [(none)]> SELECT r.name FROM phabricator_repository.repository r JOIN phabricator_differential.differential_revision d WHERE r.phid = d.repositoryPHID AND FROM_UNIXTIME(d.dateModified,'%Y%m')=date_format(NOW() - INTERVAL 1 MONTH,'%Y%m');
+--------------------------------+
| name                           |
+--------------------------------+
| MediaWiki Release Tools        |
| Scap                           |
| Scap                           |
| Metrolook                      |
| Scap                           |
| Phabricator                    |
| Scap                           |
| Scap                           |
| Scap                           |
| Scap                           |
| tool-ws-google-ocr             |
| Wikibase Data Model JavaScript |
| Scap                           |
| tool-texbot                    |
| wikimedia-iegreview            |
| wikimedia-iegreview            |
+--------------------------------+
16 rows in set (0.01 sec)

The way i got this: 2 terminal windows, in one ssh to phab1001.eqiad.wmnet ; mysql -u phuser -p -h m3-slave.eqiad.wmnet -P 3323 . and in the other: ssh puppetmaster1001.eqiad.wmnet ; view /srv/private/modules/passwords/manifests/init.pp and look for $app_pass in class passwords::mysql::phabricator and paste that password to login. Then run the query above.

hashar added a subscriber: hashar.Feb 7 2019, 9:23 AM

Apparently all repositories have been migrated. I guess we can now sunset Differential/disable it somehow?

We need to migrate phabricator source code to gerrit (including phabricator/deployment).

bd808 added a comment.Feb 7 2019, 6:19 PM

Apparently all repositories have been migrated. I guess we can now sunset Differential/disable it somehow?

Wikimedia Foundation production deployed software may no longer be dependent on Differential (or almost per T191182#4934766), but we do still have a use case for Differential/Diffusion in Toolforge. It is quite likely that Diffusion is the important part here with Differential being a little used add-on.

Today Striker (https://toolsadmin.wikimedia.org/) makes creating new Diffusion repos a 1-click process for any Toolforge tool. When that is the way that a tool chooses to version and publish its source then Differential is the only code review system available. I very deliberately did not choose to build the integration with Gerrit instead of Diffusion for 2 reasons. One of those reasons is now moot; namely that Diffusion is no longer intended as the global replacement for Gerrit. The other reason however is that the Gerrit/Zuul/Jenkins workflow is complex for new users to understand. It is a powerful system, but not something that we have invested in making easy. We instead invest in making it a good way to support the very specific needs of the movement's most sensitive software. This is not a bad thing, but I do not believe this is a one size fits all problem space. Fragmentation makes discovery and by extension reuse more difficult, but there are portions of our technical community, like Toolforge, where we are much earlier in the software development maturity cycle and added complexity in adoption of basic recommended practices will inhibit adoption.

The other reason however is that the Gerrit/Zuul/Jenkins workflow is complex for new users to understand. It is a powerful system, but not something that we have invested in making easy. We instead invest in making it a good way to support the very specific needs of the movement's most sensitive software. This is not a bad thing, but I do not believe this is a one size fits all problem space. Fragmentation makes discovery and by extension reuse more difficult, but there are portions of our technical community, like Toolforge, where we are much earlier in the software development maturity cycle and added complexity in adoption of basic recommended practices will inhibit adoption.

I think we can have toolforge Gerrit repos created by striker be open push by default, so they're just like any other git host where you just push. Users who want to do pre-merge code review with CI, etc. are free to do so, but otherwise I think we can get a very similar user experience that Diffusion currently has, but on Gerrit.

Probably this should be split to a separate task? 1) Allowing Striker to create Gerrit repos and 2) Migrating existing Striker repos to Gerrit.

As a tool author, I can confirm that the only reason I use Diffusion for hosting my tools’ source code and recommend it to others in cookiecutter-toolforge is how Striker makes it completely frictionless to create new repositories. On the other hand, personally I intensely dislike Differential for contributions both as a patch author and as a tool maintainer, and recently decided to mirror my tools to GitHub as well. If Striker was changed to create open-push Gerrit repositories instead of Diffusion repositories (well, I suppose at least for a while it would have to be “in addition to”, not “instead of”, to ease the transition?), I would gladly migrate to that.

Requesting a Gerrit repo is just a matter of asking at https://www.mediawiki.org/wiki/Gerrit/New_repositories/Requests. QChris and myself handle most of them in due time (more him than me). Perhaps Striker could be configured to create standard labs/tools/$toolname repos with standard settings though, but that'd also require creating a group (ie: tool-$toolname) gerrit group to grant them ownership of the repo mentioned before. A discussion about migrating the creation request page is happening at https://www.mediawiki.org/wiki/Talk:Gerrit/New_repositories/Requests.

Requesting a Gerrit repo is just a matter of asking at https://www.mediawiki.org/wiki/Gerrit/New_repositories/Requests. QChris and myself handle most of them in due time

I don’t doubt that (and thank you!), but to me that’s still a higher barrier than clicking a button in Striker :)

We could do something similar with Gerrit. Roughly:

  • create the git repository under a Gerrit namespace such as labs/tools/ or wmcs/toolforge or whatever name really
  • create a Gerrit group named upon the repository (eg labs-tools-Foobar-owner or wmcs-toolforge-Foobar-owner ) which contains the user that created the repository
  • add that group as the project owner which should grant all permissions on the repo, including push? If not we could enable push right to project owners on the parent repo and that right will be inherited to the other.

Of course, that requires some amount of work to happen on Striker side which is some decent amount of work.

@bd808 has a good point about the CI configuration. Our system is more geared toward supporting the production deployed software and protect production. It is not that self serve and far from being a one click configuration. That being, we made it slightly easier to configure, albeit it still require a patch to be send (or a task to be filled) + a manual deployment.