There should be very few repos not mastered in gerrit, and for each one we should have a good, on-going reason why we do things "wrongly". Rather than let things bob along and hope we fix things, it'd be nice to have an exhaustive list of GitHub-mastered repos so e.g. we can fix all the npm audit issues appropriately.
|Stalled||None||T218529 Investigate indexing Issues and PRs of Github repositories canonical on wikimedia.biterg.io|
|Stalled||None||T218528 Sort out a process to add new Github repositories to wikimedia.biterg.io Git indexing|
|Open||mmodell||T235874 Create some automation for management of gerrit/phabricator/github mirrors|
|Open||None||T237470 Create and maintain somehow a list of repos mastered in GitHub (and in Phabricator Diffusion)|
- Mentioned In
  - T249703: Automatically close Pull Requests in repos mirrored on Github
  - T241659: Gather data how much code development activity takes place canonically in Phabricator Diffusion
- Mentioned Here
  - T109939: For mirrored GitHub repositories, actually give the canonical source Gerrit URL in the repo description
  - T114616: Review and update GitHub mirror repo descriptions with standard text
  - T136863: Should Wikimedia have standard policies for managing github mirror repos?
  - T218528: Sort out a process to add new Github repositories to wikimedia.biterg.io Git indexing
GitHub had a feature to declare a repository as being a mirror, but it can no longer be used. There is a hint at https://help.github.com/en/github/getting-started-with-github/finding-open-source-projects-on-github#open-source-projects-with-mirrors-on-github and an example is chromium/chromium:
mirrored from https://chromium.googlesource.com/chromium/src
Github search documentation mentions [[ https://help.github.com/en/github/searching-for-information-on-github/searching-for-repositories#search-based-on-whether-a-repository-is-a-mirror | search qualifier mirror:true ]], which potentially would let us differentiate between mirror and mastered repositories.
Unfortunately, there does not seem to be a way for us to flag a repository as being a mirror.
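For illustration only, here is a minimal sketch of how a search using that qualifier could be built against the GitHub REST search endpoint. The helper name `mirror_search_url` and the choice of the `wikimedia` organization as default are assumptions for the example; since our mirrors are not flagged, the query would not currently return them.

```python
from urllib.parse import urlencode

# GitHub REST endpoint for repository search.
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"


def mirror_search_url(org="wikimedia", mirror=True):
    """Build a search URL for repos in `org` filtered by the mirror qualifier.

    Hypothetical helper: it only builds the URL and performs no request.
    """
    flag = "true" if mirror else "false"
    query = f"org:{org} mirror:{flag}"
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(mirror_search_url())
```

Passing `mirror=False` builds the complementary query, which is the one that would list mastered repositories if the flag were ever set on our mirrors.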
We also do not quite keep the mirrored repositories in sync with what is in Gerrit. For the replication to occur, we need to manually create the project on GitHub, and when a project is deleted in Gerrit it is not necessarily deleted from GitHub, though a lot of deletions are now handled via the Projects-Cleanup workflow.
I guess we could have a script that lists all repositories on Gerrit and GitHub and does a diff. There are a few cases to deal with, off the top of my head:
- GitHub renames, such as mediawiki/core in Gerrit mapping to mediawiki-core in GitHub, which got renamed to simply mediawiki ( https://github.com/wikimedia/mediawiki )
- Repositories deleted in Gerrit but not in GitHub, though the project description on GitHub should have a line stating that it is a mirror of a Gerrit repository
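The diff idea above can be sketched roughly as follows. This is a minimal sketch, not an existing script: the function names and the `renames` mapping are hypothetical, the two name lists are assumed to have been fetched already (for example from the Gerrit and GitHub APIs), and the only naming rule assumed is the `/` to `-` replacement seen in the mediawiki/core example.

```python
def gerrit_to_github_name(gerrit_name, renames=None):
    """Map a Gerrit project name to the GitHub repo name we expect.

    `renames` is a hypothetical, hand-maintained dict of exceptions,
    e.g. {"mediawiki/core": "mediawiki"}.
    """
    renames = renames or {}
    if gerrit_name in renames:
        return renames[gerrit_name]
    # Assumed default rule: slashes become dashes (mediawiki/core -> mediawiki-core).
    return gerrit_name.replace("/", "-")


def diff_repos(gerrit_projects, github_repos, renames=None):
    """Compare Gerrit project names with GitHub repo names."""
    expected = {gerrit_to_github_name(name, renames) for name in gerrit_projects}
    github = set(github_repos)
    return {
        # Present on both sides: presumably mirrors.
        "mirrored": sorted(github & expected),
        # In Gerrit but with no GitHub repo: mirror never created.
        "missing_on_github": sorted(expected - github),
        # Only on GitHub: either mastered there, or a stale mirror
        # of a project deleted in Gerrit. Needs a manual look.
        "github_only": sorted(github - expected),
    }


if __name__ == "__main__":
    result = diff_repos(
        ["mediawiki/core", "operations/puppet"],
        ["mediawiki", "some-github-project"],
        renames={"mediawiki/core": "mediawiki"},
    )
    for category, names in result.items():
        print(category, names)
```

The `github_only` bucket is the candidate list of GitHub-mastered repositories; separating those from stale mirrors would still rely on the repository description or on manual review.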
Maybe we could instead replicate Gerrit repositories to a standalone GitHub organization dedicated to mirroring? Access would be restricted to the few people managing Gerrit / github-mirrors, with mastered repositories staying under the current wikimedia organization.
This is basically the inverted version of T109939: For mirrored GitHub repositories, actually give the canonical source Gerrit URL in the repo description, and is related to T114616: Review and update GitHub mirror repo descriptions with standard text and T136863: Should Wikimedia have standard policies for managing github mirror repos?
Solving this blocks T218528: Sort out a process to add new Github repositories to wikimedia.biterg.io Git indexing (which has some non-sustainable thoughts of mine).
It looks like it's not documented how/if it's programmatically possible for us to set our mirrored repos as mirrors on GitHub; maybe someone (Tyler?) could ask in an official capacity if it'd be possible for them to tell us what the undocumented API magic is (or… build it?). If so, we could adjust our mirroring script, and then this would be trivial.