Automating wiki creation has been desired for a long time:
- T158730: Automate WMF wiki creation (in particular, see T158730#5527658)
- T228745: Allow creating an independent "incubator wiki" instead of hosting all new wikis in one Incubator wiki with prefixes
This has been discussed a bit at the Tech Conf in the T234641: Wikimedia Technical Conference 2019 Session: Continuous Delivery/Deployment in Wikimedia: The Future of the Deployment Pipeline session, but we need a follow-up.
One thing that we can resolve right in this unconference session is T238158: Identify which parts of the "Add a wiki" procedure can be integrated with the deployment pipeline. If we have time for anything more, it would be even better.
Wikimedia Technical Conference
Atlanta, GA USA
November 12 - 15, 2019
Notes: Tyler, AKosiaris
Session Name / Topic
New Wiki creation
- * T158730 -- Tim Starling created
- * T228745 -- Amir created
- * Neither of them is possible, instead we have https://wikitech.wikimedia.org/wiki/Add_a_wiki -- many of these steps are unstable -- every time we create a wiki this gets updated
- * 15 wikis were created this year -- so this is fairly regular -- Amir would like to move wikis out of the incubator as quickly as possible
- Amir Sarabadani recently fixed a number of bugs that made this process work again, but it is still a mess. Extremely unstable.
- Up to 20 wikis per year up to now, Amir would like up to dozens a year.
- How do we make this more stable?
- Tangentially related to the deployment pipeline
- Addshore: who here has added a wiki?
- *Reedy + Daniel raised hands.*
- Alex: how much do we want to automate this? A button?
- Greg: Tim's suggestion on the task was 3 or 4 steps, which I think is good
- Amir: if a language is eligible, then a wiki has been created -- 3 or 4 people who can write pages and it is a unique language -- this can be turned into a button, but not everyone should be able to click this button. The language committee should be able to click this button -- that'd be fine.
- Joe: Given that a lot of modifications in production need to happen, even if it's automated, I think that button should be clicked by someone who could follow these tasks. Every step in this task is basically adding configuration to a bunch of services. Some of this can be automated by the addWiki.php script. The script adds tables, etc. But we need to coordinate creation across different parts of the infra.
- Reedy: it shouldn't need *n* different stuff
- Problems with the ways we treat the configuration that needs to be updated to add a new wiki
- Timo: for cxserver and for restbase -- they just have a copy of the sitematrix -- that sounds like an api-call they can cache and pull every minute
- Joe: if an application needs to know which wikis exist and which don't that should come from mediawiki as the canonical service
- Daniel: that's the dns langlist for each wiki.
- Timo: I don't think DNS should be the source of truth
- Addshore: what I made for wmde doesn't touch parsoid, but from my point of view you shouldn't need a static list
- Timo: restbase uses the wiki you're interacting with as a pathname component -- since we don't want to use third-party hostnames
- And still internally does, even if the external API is <lang>.<wikiproject>.org
- G: this is about propagation of changes
- Timo: two categories (1) setup steps, actually needs to do something (2) cxserver/restbase -- list of valid hostnames -- not about validation or actually doing something which should be a pull
- R: addwiki is not tested or testable and only gets run every 6 months
- A: It works only in production
- R: in beta ... kind of
- Addshore: we have wikis that go into the world and have things broken
- G: this needs CI and it needs to be idempotent -- a series of stages, each of which is a script that does two things: 1. check the state of the world 2. if the state of the world has not been updated, update it -- we need to re-engineer that script. We need to wrap it in a structure that is more resilient
- maybe shouldn't be one script?
- R: prep it down, split it to various sections
- G: Does it need mediawiki libraries?
- R: Yes. CirrusSearch, for example, which calls into CirrusSearch libraries
- A: none of the wiki creation in my script is inside MW, it's all separate. Anything that the setup script needs to do with MW it does via private APIs to keep it separate
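The "check the state of the world, then update it if needed" structure that G describes above can be sketched roughly as below. This is a minimal illustration in Python, not the actual addWiki.php logic: `Stage`, `run_stages` and `demo_stages` are hypothetical names, and the in-memory set stands in for real infrastructure (databases, search indexes, and so on).

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Stage:
    """One step of wiki creation (hypothetical structure)."""
    name: str
    is_done: Callable[[], bool]  # 1. check the state of the world
    apply: Callable[[], None]    # 2. update it if it has not been updated


def run_stages(stages: List[Stage]) -> List[str]:
    """Run only the stages whose check reports pending work.

    Returns the names of the stages that were actually applied, so a
    second run over an already-created wiki returns an empty list.
    """
    applied = []
    for stage in stages:
        if stage.is_done():
            continue  # already in the desired state, nothing to do
        stage.apply()
        if not stage.is_done():
            # The check doubles as a post-condition for the step.
            raise RuntimeError(f"stage {stage.name!r} did not converge")
        applied.append(stage.name)
    return applied


def demo_stages(world: Set[str]) -> List[Stage]:
    """Build toy stages that record their work in an in-memory set.

    The stage names are illustrative assumptions, loosely following the
    script review in these notes.
    """
    def make(name: str) -> Stage:
        return Stage(name, lambda: name in world, lambda: world.add(name))

    return [make(n) for n in
            ("create_database", "create_main_page", "create_search_index")]
```

Because each stage carries its own check, the checks can be exercised in CI on their own, and a run that fails halfway can simply be restarted.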
Review of addWiki.php script
- Creates database
- Creates database for Echo
- Creates the main page (which should be idempotent) -- could be an api call
- Set fundraising link (fundraising tech hasn't used this for at least 7 years) -- adds a link to the sidebar (it's probably already there, and not really needed)
- Creates search index
- Populates global sites table -- which breaks frequently -- more than addWiki.php
- Cognate sites table
- G: if most of this is applying SQL or creating an index in Elasticsearch -- we have a framework to automate this in steps
- Timo: since this is a maintenance script, it needs a mediawiki instance running, but this script can be independent
- R: from the dbname it's creating it figures out the type of wiki that's being created
- A: there are another 5 steps before we run addWiki
- R: but that might be out of scope: do you want us automating dns deploys?
- G: if the problem is dns and apache
- R: for 90% of cases languagelist does a lot of the work
- D: this is for languages that don't have any wiki yet
- G: several things need to be updated and we have no reliable way to know if the script is broken or not
- A: Who owns this?
- No one currently, but in practice: Amir (Ladsgroup)+Sam+Urbanecm? (And language committee drives this from the users' perspective: Jan-Harald Søby, Amir Aharoni, etc.)
- A: Who *should* own this?
- G: Several teams, but someone should drive.
- T: it can be made robust/easier to maintain by making it do less -- services should use mediawiki as the source of truth
- G: improvements can be made incrementally
- D: the problem is all the different repositories involved
- G: split into steps, make sure those steps are testable
- R: should be easy and doable. Do we want to have a CI process that creates a wiki?
- T: the solution is not to automate it but to make it obsolete
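The "pull the list of wikis from MediaWiki and cache it" idea raised by Timo and Joe could look roughly like the sketch below. It is an illustration under stated assumptions, not how cxserver or restbase work today: `CachedWikiList` and `hostnames_from_sitematrix` are hypothetical names, the `fetch` callable is injected (it would wrap whatever API call a service chooses), and the payload shape is modeled on sitematrix-style output and should be checked against the real API before use.

```python
import time
from typing import Callable, Optional, Set
from urllib.parse import urlparse


def hostnames_from_sitematrix(payload: dict) -> Set[str]:
    """Collect wiki hostnames from a sitematrix-style payload.

    Assumed shape: {"sitematrix": {"count": N, "0": {"site": [...]},
    ..., "specials": [...]}} where each site entry has a "url" field.
    """
    hosts: Set[str] = set()
    for key, group in payload.get("sitematrix", {}).items():
        if key == "count":
            continue
        sites = group if key == "specials" else group.get("site", [])
        for site in sites:
            host = urlparse(site.get("url", "")).hostname
            if host:
                hosts.add(host)
    return hosts


class CachedWikiList:
    """Keeps a list of valid wiki hostnames, refreshed at most once per TTL."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._hosts: Set[str] = set()
        self._fetched_at: Optional[float] = None

    def hosts(self) -> Set[str]:
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            # Pull from the canonical source instead of shipping a static copy.
            self._hosts = hostnames_from_sitematrix(self._fetch())
            self._fetched_at = now
        return self._hosts

    def is_known(self, hostname: str) -> bool:
        return hostname in self.hosts()
```

With something like this in each consuming service, adding a wiki would not require a configuration change there; the new hostname would show up after the next refresh.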
everyone is shocked by wiki creation
- Greg: this seems like it's going to take a lot of cross-team work, which means that it's hard to get prioritized
- Joe: it's part of the strategy to reach the global south
- Reedy: it's not time consuming, but it's the people wrangling and fixing it when it breaks
- Greg: also bus factor
- [ACTION] Amir to make a list of reasons why this makes sense as a project
- related tickets:
- Make creating a new Language project easier: https://phabricator.wikimedia.org/T165585
- Automate wiki creation: https://phabricator.wikimedia.org/T158730
- WMTC 2019 session (this): https://phabricator.wikimedia.org/T235520
A list by @Krinkle