Earlier today, the 600,000th commit was pushed to Wikimedia's Gerrit server. We thought we'd take this moment to reflect on the developer services we offer and our community of developers, be they Wikimedia staff, third-party workers, or volunteers.
At Wikimedia, we currently use a self-hosted installation of Gerrit to provide code review workflow management, and code hosting and browsing. We adopted this in 2011–12, replacing Apache Subversion.
Within Gerrit, we host several thousand repositories of code (2,441 as of today). This includes MediaWiki itself, plus the many hundreds of extensions and skins people have created for use with MediaWiki. Approximately 90% of the MediaWiki extensions we host are not used by Wikimedia, only by third parties. We also host key Wikimedia server configuration repositories like puppet or site config; build artefacts like vetted Docker images for production services or local .deb build repos for software we use, such as etherpad-lite; ancillary software like our special database exporting orchestration tool for dumps.wikimedia.org; and repositories for dozens of other uses.
Gerrit is not just (or even primarily) a code hosting service, but a code review workflow tool. Per the Wikimedia code review policy, all MediaWiki code heading to production should go through separate development and code review for security, performance, quality, and community reasons. Reviewers are required to use their "good judgement and careful action", which is a heavy burden, because "[m]erging a change to the MediaWiki core or an extension deployed by Wikimedia is a big deal". Gerrit helps them do this by providing clear views of what is changing; supporting itemised feedback and revision at the character, file, or commit level; allowing series of complex changes to be chained together across multiple repositories; and ensuring that forthcoming and merged changes are visible to product owners, development teams, and other interested parties.
Across all of our repositories, we average over 200 human commits a day, though activity levels vary widely. Some repositories have dozens of patches a week (MediaWiki itself gets almost 20 patches a day; puppet gets nearly 30), whereas others get a patch every few years. There are over 8,000 accounts registered with Gerrit, although activity is not distributed uniformly throughout that cohort.
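For anyone curious how per-repository figures like these could be tallied, here is a minimal sketch in Python. It assumes the list of changes comes from Gerrit's REST API, which prefixes its JSON responses with a `)]}'` line that has to be stripped before parsing. The helper names and the sample payload are made up for illustration; this is not the tooling we actually use.

```python
import json
from collections import Counter

# Gerrit's REST API prepends this line to JSON responses to guard against
# cross-site script inclusion; it must be removed before parsing.
XSSI_PREFIX = ")]}'"


def parse_gerrit_json(body: str):
    """Parse a Gerrit REST response body, dropping the XSSI prefix if present."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def changes_per_project(body: str) -> Counter:
    """Count changes per repository using each change's 'project' field."""
    return Counter(change["project"] for change in parse_gerrit_json(body))


# Hypothetical sample payload, shaped like a (heavily trimmed) change list.
# The counts are invented and do not reflect real activity.
SAMPLE_BODY = ")]}'\n" + json.dumps([
    {"project": "operations/puppet", "status": "MERGED"},
    {"project": "mediawiki/core", "status": "MERGED"},
    {"project": "operations/puppet", "status": "MERGED"},
    {"project": "mediawiki/extensions/GroupsSidebar", "status": "MERGED"},
    {"project": "mediawiki/core", "status": "MERGED"},
    {"project": "operations/puppet", "status": "MERGED"},
])

if __name__ == "__main__":
    # Print repositories from most to least active in the sample.
    for project, count in changes_per_project(SAMPLE_BODY).most_common():
        print(f"{project}: {count}")
```

Separating human commits from automated ones would need an extra filter on the change owner, which this sketch leaves out.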
To focus engineer time where it's needed, a fair amount of low-risk development work is automated. Automation covers both creating patches and, in some cases, merging them.
For example, for many years we have partnered with TranslateWiki.net's volunteer community to translate and maintain MediaWiki interfaces in hundreds of languages. Exports of translators' updates are pushed and merged automatically by one of the TWN team each day, helping our users keep a fresh, usable system whatever their preferred language.
Another key area is LibraryUpgrader, a custom tool to automatically upgrade the libraries we use for continuous integration across hundreds of repositories, allowing us to make improvements and increase standards without a single central breaking change. Indeed, the 600,000th commit was one of these automatic commits, upgrading the version of the mediawiki-codesniffer tool in the GroupsSidebar extension to the latest version, ensuring it is written following the latest Wikimedia coding conventions for PHP.
Right now, we're working on upgrading our installation of Gerrit, moving from our old version on the 2.x branch, through 2.16, to 3.1. This will mean a new user interface and other user-facing changes, as well as improvements behind the scenes. More on those changes will be coming in later posts.
Header image: A vehicle used to transport miners to and from the mine face by 'undergrounddarkride', used under CC-BY-2.0.