
Enable Wikidata, cxserver, and Parsoid on new wikis as early as possible
Open, Needs Triage, Public

Description

Currently, when new wikis are created, it takes a few more days until Wikidata, VisualEditor, and Content Translation become usable on them. (Content Translation is only relevant for new Wikipedias, but a lot of new wikis are Wikipedias.)

Is there really a reason for this? It would be much nicer to make all of these things immediately available when a new wiki is created instead of waiting for more deployments.

Event Timeline

> Is there really a reason for this? It would be much nicer to make all of these things immediately available when a new wiki is created instead of waiting for more deployments.

For VE and CX, it's mostly that the patches for cxserver and Parsoid are merged afterwards. I don't know if there's any specific reason they can't be merged before; you'd have to ask the service owners specifically.

For Wikidata, the answer lies in https://wikitech.wikimedia.org/wiki/Add_a_wiki#Wikidata

That script is known to be troublesome. You might want to ask Marius (hoo) or Amir Sarabadani (Amir1) to run it for you, or just create a ticket (that may be done any time after the wiki was created).
Beware: the script sometimes fails with a duplicate key conflict. In that case, go to the wiki's master database, empty the sites and site_identifiers tables, then run the script again. It's probably also wise to back up these tables from Wikidata and from at least one Wikipedia before running the script across the whole fleet. Breaking the sites and site_identifiers tables will break page rendering on many wikis!
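For illustration, a minimal sketch of the single-wiki run and the recovery path above. banwiki is a placeholder; direct mysql/mysqldump access to the wiki's master and the Wikibase script path are assumptions, so verify both against the wikitech page before doing anything like this in production:

```
# Back up first -- the tables the script rewrites are cheap to dump
# and expensive to break.
mysqldump banwiki sites site_identifiers > sites_backup_banwiki.sql

# Populate the sites table for the new wiki.
mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=banwiki

# Recovery path for the duplicate key failure: empty both tables on
# the wiki's master database, then run the script again.
mysql banwiki -e 'DELETE FROM sites; DELETE FROM site_identifiers;'
mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=banwiki
```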

See T158751: Make populateSitesTable.php more robust and its subtask, and the confusion on T171013: Clarify populateSitesTable.php instructions in Add a Wiki.

I'm getting tired of this being a major issue every time we create a new wiki (Wikidata gets enabled, but rarely works 100% correctly).

I can comment mostly on the Wikidata part; I have done it more than a dozen times. Honestly, the underlying problem is not populateSitesTable.php breaking sometimes; the problem is that you need to run it for all ~1,000 wikis separately, and it's very likely to break on at least one of them (usually two or three) because of a db lock, a network partition, you name it. Running it again fixes the issue most of the time (all of the time, for me), but if you don't rerun it, rendering will be completely broken for the affected wiki in less than 40 minutes, when the site lookup cache expires. I basically have to glue my eyes to the screen for twenty minutes to make sure the script's output is fine for every wiki.
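A rough sketch of what a fleet-wide run with an automatic retry could look like, to cover those transient failures. The mwscript wrapper exists in WMF production, but the dblist path and the Wikibase script location here are assumptions, not the actual documented process:

```
#!/bin/bash
SCRIPT=extensions/Wikibase/lib/maintenance/populateSitesTable.php
DBLIST=/srv/mediawiki/dblists/wikidataclient.dblist

while read -r wiki; do
    # Skip blank lines and comments in the dblist.
    case "$wiki" in ''|'#'*) continue ;; esac
    # Retry a few times: a rerun fixes the occasional db lock or
    # network blip that would otherwise break rendering for that wiki
    # once the site lookup cache expires (~40 minutes).
    for attempt in 1 2 3; do
        mwscript "$SCRIPT" --wiki="$wiki" && break
        echo "populateSitesTable failed on $wiki (attempt $attempt)" >&2
        sleep 5
    done
done < "$DBLIST"
```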

The proper fix is centralizing the site lookup (T113034: RFC: Overhaul Interwiki map, unify with Sites and WikiMap); that would also improve the integrity of the data, and the script would only need to run once.

cc @daniel (He probably has some ideas on how to proceed here)

It populates the sites table for the new wiki (let's say banwiki), but we need to run the exact same script on all other wikis (for a different reason: their site lookup needs to pick up banwiki and add it to their sites table rows).
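As a hypothetical spot check of that second part: after the fleet-wide rerun, an existing client wiki such as enwiki should have a banwiki row in its own sites table. The column names are from MediaWiki core's sites table; the mysql invocation is an assumption:

```
mysql enwiki -e "SELECT site_global_key, site_group, site_language
                 FROM sites WHERE site_global_key = 'banwiki';"
```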

> It populates the sites table for the new wiki (let's say banwiki), but we need to run the exact same script on all other wikis (for a different reason: their site lookup needs to pick up banwiki and add it to their sites table rows).

But we're still running foreachwikiindblist for the wikidata client list anyway, so it's going to happen for that wiki too. That's also done in the patch I linked, which "automates" it and makes it part of the process.
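That one-shot form would look something like this, assuming the dblist is named wikidataclient and the usual Wikibase script path:

```
foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php
```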