Create new GitHub mirrored repository with core + all extensions + all skins
Open, Needs Triage · Public

Description

We currently mirror a bunch (all?) of extension and skin repositories, plus MediaWiki core, from Gerrit to GitHub.

One nice thing about GitHub is its semantic code navigation capabilities (see the documentation).

That allows you to, for example, view https://github.com/wikimedia/mediawiki/blob/master/index.php#L28 and hover over wfEntryPointCheck() to see its usages and quickly navigate to where it is defined.

In the GitLab consultation, several people have said that they use GitHub for navigating code repositories, since Gerrit's Gitiles is hard to find and/or unintuitive to navigate.

My proposal would be to, in addition to our current mirroring, create a new repository on GitHub (e.g. wikimedia/mediawiki-all-skins-all-extensions) that contains a mirror of the latest core + all skins + all extensions, so that finding usages of a function call can be done from within the code browser of that repository. Even if we move to GitLab, we won't have the ability to do cross-repository searches due to limitations in the community edition of the software, so having this setup in GitHub seems like a useful thing to do.

Event Timeline

kostajh created this task. Sep 30 2020, 10:08 AM
Restricted Application added a subscriber: Aklapper. Sep 30 2020, 10:08 AM

There's https://github.com/wikimedia/mediawiki-extensions and https://github.com/wikimedia/mediawiki-skins, where we just check out the submodules. These are used by several tools to retrieve some information. I am not sure whether this is a good starting point.

Gerrit pushes updates to GitHub using the replication plugin. So we'll need to figure out how to get that enormous core+extensions+skins repo in Gerrit created and updating itself each time there's a commit to either core, extensions or skins, and then in theory the plugin will take care of replicating that to GitHub.

Hmm, yeah, I don't think the submodule approach would allow for code browsing/search.

Anyway, since we can't meaningfully combine the history from all these different sources, one idea would be to use a cron job (I assume GitHub Actions lets us do this) that iterates over a list of extensions/skins, copies the latest code into the repo, and then commits. It could run daily and would probably be a useful enough resource. With that approach, we wouldn't have to do anything on the Gerrit side; we would just need a (hopefully) relatively simple script that lives in the monolithic GitHub repo.
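To make the idea concrete, here is a rough sketch of what such a daily job could run. It is not the actual script: the function names (clone_url, sync_one, sync_all), the directory layout (extensions/<Name>, skins/<Name>) and the commit message are all made up for illustration, and it assumes the repositories can be cloned anonymously from Gerrit under https://gerrit.wikimedia.org/r/mediawiki/. Core is left out to keep the example short.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Dict, List

# Assumption: anonymous HTTPS clone URLs follow this layout on Gerrit.
GERRIT_BASE = "https://gerrit.wikimedia.org/r/mediawiki"


def clone_url(kind: str, name: str) -> str:
    """Build the clone URL for one repo; kind is 'extensions' or 'skins'."""
    return f"{GERRIT_BASE}/{kind}/{name}"


def sync_one(monorepo: Path, kind: str, name: str) -> None:
    """Replace <monorepo>/<kind>/<name> with a fresh snapshot without history."""
    target = monorepo / kind / name
    if target.exists():
        shutil.rmtree(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["git", "clone", "--depth", "1", clone_url(kind, name), str(target)],
        check=True,
    )
    # Remove the nested .git directory so the files are committed as plain
    # content rather than being recorded as a submodule-style gitlink.
    shutil.rmtree(target / ".git")


def sync_all(monorepo: Path, repos: Dict[str, List[str]]) -> None:
    """Snapshot every listed repo, then commit only if something changed."""
    for kind, names in repos.items():
        for name in names:
            sync_one(monorepo, kind, name)
    subprocess.run(["git", "add", "--all"], cwd=monorepo, check=True)
    # `git diff --cached --quiet` exits non-zero when there are staged changes.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=monorepo)
    if staged.returncode != 0:
        subprocess.run(
            ["git", "commit", "-m", "Daily snapshot of extensions and skins"],
            cwd=monorepo,
            check=True,
        )
```

A scheduled job would then call something like sync_all(Path("."), {"extensions": ["Echo"], "skins": ["Vector"]}) and push the result. Deleting each nested .git directory is what avoids the submodule problem mentioned above, at the cost of losing per-repo history.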

I made a proof of concept at https://github.com/kostajh/mediawiki-all-extensions-all-skins. It seems like GitHub still needs some time to index the 1.4 GB of files :) so code navigation doesn't seem to work yet. It will update every 24 hours with the latest code for all the extensions/skins defined at https://www.mediawiki.org/w/api.php?action=query&format=json&formatversion=2&list=extdistrepos (h/t to the codesearch tool for pointing to that). If people find this repo useful we could move it to the github.com/wikimedia group and publicize its existence.
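For anyone who wants to reuse that list, the following sketch shows one way to turn the API response into repo names. The response shape used here (query.extdistrepos with "extensions" and "skins" arrays) is an assumption that should be checked against the live output, and the function names and User-Agent string are invented for the example.

```python
import json
import urllib.request
from typing import Dict, List

API_URL = (
    "https://www.mediawiki.org/w/api.php"
    "?action=query&format=json&formatversion=2&list=extdistrepos"
)


def parse_extdistrepos(payload: dict) -> Dict[str, List[str]]:
    """Extract sorted extension and skin names from an API response.

    Assumed shape (verify against the live API):
    {"query": {"extdistrepos": {"extensions": [...], "skins": [...]}}}
    """
    repos = payload["query"]["extdistrepos"]
    return {
        "extensions": sorted(repos.get("extensions", [])),
        "skins": sorted(repos.get("skins", [])),
    }


def fetch_repo_names(url: str = API_URL) -> Dict[str, List[str]]:
    """Download the list and parse it; the User-Agent value is a placeholder."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "monorepo-sync-sketch/0.1"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_extdistrepos(json.load(response))
```

Keeping the parsing separate from the download means the parsing can be checked against a saved response without touching the network.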

Tgr awarded a token. Oct 2 2020, 12:22 AM
Tgr added a subscriber: Tgr.
Peachey88 moved this task from Backlog to make on the Wikimedia-GitHub board. Sun, Nov 22, 4:41 AM