
Automate WMF wiki creation
Open, Needs Triage · Public

Description

Wiki creation is quite an involved process, documented on wikitech. I think, at least for certain common cases, the task could be almost completely automated.

For uncomplicated creation of new language editions under existing projects, with default configuration, the following tasks need to be done, none of which require complex human decision-making:

  • Reconfigure many services by pushing configuration changes to Gerrit, and deploy those commits
    • mediawiki-config: wikiversions, *.dblist
    • WikimediaMessages
    • DNS
    • RESTBase
    • Parsoid
    • Analytics refinery
    • cxserver
    • Labs dnsrecursor
  • Run addWiki.php. This script aims to automate all tasks which can be executed with the privileges of a MW maintenance script.
  • Run Wikidata's populateSitesTable.php. It should probably be incorporated into addWiki.php.
  • Run labsdb maintain-views
  • Update wikistats labs

So at a minimum, you need to write and deploy commits to eight different projects, run three scripts, and manually insert some rows into a DB in a labs instance.

Despite there being no human decision making in this process, the documentation requires that you involve people from approximately four different teams (services, ops, wikidata, analytics).

In my opinion, something is going wrong here in terms of development policy. The problem is getting progressively worse. In July 2004, I fully automated wiki creation and provided a web interface allowing people to create wikis. Now, it is unthinkable.

Obviously services are the main culprits. Is it possible for in-house services to follow pybal's example, by polling a central HTTP configuration service for their wiki lists? As with pybal, the service could just be a collection of static files on a webserver, or etcd. Even MediaWiki could profitably use such a central service for its dblists, with APC caching.
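For illustration only, here is a minimal sketch of the kind of poller such a service would imply. The endpoint URL, polling interval and callback are all made up for the example; a real consumer would presumably cache and handle errors more carefully:

  import time
  import urllib.request

  # Hypothetical central endpoint serving the canonical wiki list as plain text,
  # one wiki ID per line (not an existing WMF service).
  CONFIG_URL = "https://config.example.wmnet/dblists/all.dblist"
  POLL_INTERVAL = 60  # seconds

  def fetch_wiki_list(url=CONFIG_URL):
      """Fetch and parse the wiki list from the central config service."""
      with urllib.request.urlopen(url) as resp:
          return [line.strip() for line in resp.read().decode("utf-8").splitlines()
                  if line.strip()]

  def poll_forever(on_change):
      """Poll the config service and call on_change() whenever the list changes."""
      current = None
      while True:
          try:
              wikis = fetch_wiki_list()
              if wikis != current:
                  current = wikis
                  on_change(wikis)  # e.g. rewrite local config, reload workers
          except OSError:
              pass  # keep the last known list on transient fetch errors
          time.sleep(POLL_INTERVAL)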

So let's suppose we could get the procedure down to:

  1. Commit/review/deploy the DNS update
  2. Commit/review/deploy a configuration change to the new central config service.
  3. Run addWiki.php

Labs instances needing to know about the change would either poll the config service, or be notified by addWiki.php. WikimediaMessages could be updated in advance via translatewiki.net.

(Thanks to Milos Rancic for raising this issue with me.)

Event Timeline

Restricted Application added a subscriber: Aklapper. · Feb 22 2017, 3:51 AM
Reedy awarded a token. · Feb 22 2017, 9:17 AM
aude added a subscriber: aude. · Feb 22 2017, 10:38 AM

Change 339144 had a related patch set uploaded (by Reedy):
Run populateSitesTable.php on other wikidata client wikis

https://gerrit.wikimedia.org/r/339144

Ebe123 added a subscriber: Ebe123. · Feb 22 2017, 5:14 PM
Krinkle added a subscriber: Krinkle. · Mar 1 2017, 9:25 PM
daniel added a subscriber: daniel. · Mar 1 2017, 9:43 PM

Much of this should be doable with our regular config management system. With puppet, however, the deployment part is not that easy to control and automate. Kubernetes might improve the situation in that regard.

Paladox added a subscriber: Paladox. · Mar 2 2017, 8:55 AM

Probably better to use etcd than a separate web host, since that seems to be the current standard solution. See the related task above, and also T149617: Integrating MediaWiki (and other services) with dynamic configuration.

tstarling updated the task description. · Mar 3 2017, 4:13 AM
Joe added a subscriber: Joe. · Mar 3 2017, 6:58 AM

@tstarling I agree, dblists are one of the things that could be stored in etcd and read from there. On the other hand, it's such a simple and relatively stable list that we could also decide to maintain it as a simple configuration file, distributed across the cluster in a standard format, which every application is expected to read from disk.

Say we create /etc/wmf/dblists.yaml on every node (just a random name/format) containing all the info that each application needs; all apps then read it and autoconfigure themselves based on those values.
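To make that concrete, a rough sketch of the application side, assuming (purely for illustration) a top-level wikis: mapping keyed by dbname with a tags list per wiki; none of these names are an agreed format:

  import yaml  # PyYAML

  # Assumed illustrative layout of /etc/wmf/dblists.yaml:
  #   wikis:
  #     enwiki:  {domain: en.wikipedia.org, tags: [all, wikipedia]}
  #     muswiki: {domain: mus.wikipedia.org, tags: [all, wikipedia, small]}

  def load_wikis(path="/etc/wmf/dblists.yaml"):
      with open(path) as f:
          return yaml.safe_load(f)["wikis"]

  def wikis_with_tag(tag, path="/etc/wmf/dblists.yaml"):
      """Return the wiki IDs a service should care about, selected by tag."""
      return sorted(dbname for dbname, info in load_wikis(path).items()
                    if tag in info.get("tags", []))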

I think that a "rolling restart of applications to pick up the new config" is an acceptable step here (ops need to be involved anyways).

Legoktm added a subscriber: Legoktm. · Mar 7 2017, 2:31 AM

Aren't dblists already in a standard format (newline delimited plain text) that we distribute across the cluster via scap?

demon added a comment. · Mar 7 2017, 2:55 AM

> Aren't dblists already in a standard format (newline delimited plain text) that we distribute across the cluster via scap?

Yes, but scap only sends them to MW-related hosts. If we moved them to something like etcd or /etc/wmf/dblists.yaml as suggested above, every application would have this data. This could be useful for services that don't know or speak MediaWiki, and don't have its code, but want to know the list of all wikis they need to care about.

demon added a comment. · Mar 7 2017, 2:56 AM

Also: unless you have scap/multiversion on your system as well, the format for doing dblist math (all - something, etc.) isn't available to you and you have to replicate the logic. A standard distribution/format for this avoids that issue.
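For readers unfamiliar with "dblist math": some dblists are defined as set expressions over other dblists (roughly "all - closed - private"), and a consumer without scap/multiversion would have to reimplement that evaluation itself. A toy sketch of the kind of logic that would otherwise be duplicated; the syntax here is simplified and is not the exact wmf-config format:

  def read_dblist(name):
      """Read a plain dblist: one wiki ID per line, '#' starting a comment."""
      with open(name + ".dblist") as f:
          return {line.split("#")[0].strip() for line in f} - {""}

  def eval_dblist_expression(expr):
      """Evaluate a simplified expression such as 'all - closed - private'."""
      tokens = expr.split()
      result = read_dblist(tokens[0])
      for op, name in zip(tokens[1::2], tokens[2::2]):
          other = read_dblist(name)
          result = result - other if op == "-" else result & other
      return sorted(result)

  # e.g. eval_dblist_expression("all - closed - private")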

The point is that updating dblists via gerrit and running scap is one of the avoidable steps in the task description. I imagine etcd would have structured data about each wiki, and the canonical map from domain name to wiki ID. To figure out exactly what structured data should be in there, we need to survey all the services in my list above, but for mediawiki-config it is dblist membership (e.g. $wikiTags in CommonSettings.php line 165) and wikiversions.json.
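As a purely illustrative example of what such a per-wiki record might contain (the field names are invented for the example, not a proposed schema):

  # One hypothetical record per wiki, keyed by wiki ID (dbname).
  wiki_record = {
      "dbname": "muswiki",
      "domain": "mus.wikipedia.org",          # gives the domain name -> wiki ID map
      "tags": ["all", "wikipedia", "small"],  # replaces *.dblist membership / $wikiTags
      "version": "php-1.29.0-wmf.14",         # replaces the wikiversions.json entry
  }

  # Consumers could then derive today's flat artefacts from such records:
  def domain_to_dbname(records):
      return {r["domain"]: r["dbname"] for r in records}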

> I think that a "rolling restart of applications to pick up the new config" is an acceptable step here (ops need to be involved anyways).

I don't think there should be any intelligence involved in the technical process of creating a wiki. I'm not sure what you mean by "a rolling restart of all services" -- if you mean stopping each service and starting it again, then I suspect that would require a human to consider the consequences.

Looking at Parsoid as a case study, I see that it re-reads sitematrix.json on worker startup, and service-runner responds to SIGHUP by doing a rolling restart of local workers. So all we need is a way to replace sitematrix.json and send SIGHUP. Other service-runner users could be reconfigured similarly. If we can have a button labelled "send SIGHUP to all services", and a brainless server monkey is allowed to press it at any time, then I guess that would be a solution. But ideally, the brainless server monkey would be replaced by a line in a script.
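As a sketch of what that "line in a script" could look like (the config path and unit name are illustrative, and the real mechanism might equally be cumin or a systemd reload):

  import os
  import shutil
  import signal
  import subprocess

  def reload_service(new_sitematrix, target, unit="parsoid"):
      """Replace sitematrix.json atomically, then SIGHUP the service-runner
      master so it rolling-restarts its workers with the new config."""
      tmp = target + ".tmp"
      shutil.copyfile(new_sitematrix, tmp)
      os.replace(tmp, target)  # atomic rename on POSIX

      # Illustrative way to find the master PID; a pidfile or cumin would also work.
      out = subprocess.check_output(["systemctl", "show", "-p", "MainPID", unit])
      pid = int(out.decode().strip().split("=", 1)[1])
      os.kill(pid, signal.SIGHUP)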

Joe added a comment. · Mar 7 2017, 5:33 AM

> The point is that updating dblists via gerrit and running scap is one of the avoidable steps in the task description. I imagine etcd would have structured data about each wiki, and the canonical map from domain name to wiki ID. To figure out exactly what structured data should be in there, we need to survey all the services in my list above, but for mediawiki-config it is dblist membership (e.g. $wikiTags in CommonSettings.php line 165) and wikiversions.json.

So my idea was to transform the wiki list into this structured data, and store it on disk, via puppet, on every machine that might need it. As I said before, it should be easy enough to distribute a changed list that way.

My point about not storing this info in etcd is that we try to use etcd to manage dynamic state, not static configurations that will change a few times a year at most.

But either that, or a file distributed via puppet to every relevant machine, is OK anyway.

>> I think that a "rolling restart of applications to pick up the new config" is an acceptable step here (ops need to be involved anyways).

> I don't think there should be any intelligence involved in the technical process of creating a wiki. I'm not sure what you mean by "a rolling restart of all services" -- if you mean stopping each service and starting it again, then I suspect that would require a human to consider the consequences.

Well, human supervision is useful, but I'd expect the process to be as simple as doing a scap deploy. Ops are building a distributed execution framework (https://github.com/wikimedia/cumin) that seems like a perfect candidate for this role.

> Looking at Parsoid as a case study, I see that it re-reads sitematrix.json on worker startup, and service-runner responds to SIGHUP by doing a rolling restart of local workers. So all we need is a way to replace sitematrix.json and send SIGHUP. Other service-runner users could be reconfigured similarly. If we can have a button labelled "send SIGHUP to all services", and a brainless server monkey is allowed to press it at any time, then I guess that would be a solution. But ideally, the brainless server monkey would be replaced by a line in a script.

The idea would be "do a controlled rolling restart (or send a SIGHUP, depending on the software) of these services", and yes, it should be a line in a script.

jhsoby added a subscriber: jhsoby. · Jun 18 2017, 2:04 AM
Restricted Application added a subscriber: PokestarFan. · Aug 3 2017, 10:49 AM
Zache added a subscriber: Zache. · Apr 18 2019, 8:11 AM
Meno25 added a subscriber: Meno25. · Apr 30 2019, 11:56 AM
Yupik added a subscriber: Yupik. · May 17 2019, 7:44 PM