
Create parsoid mediawiki deployment and migrate parsoid-php.discovery.wmnet traffic to it
Closed, Resolved (Public)

Description

Create a parsoid deployment based on the mediawiki helm chart, in order to migrate the traffic that currently reaches the parsoid cluster to MW-on-K8s.

Status

The deployment has been created and successfully deployed to, and is now ready to start receiving traffic for parsoid-php.discovery.wmnet.

Plan is:

  • 1 RESTBase host per DC for a couple of days
  • 3 hosts per DC for a couple more days
  • 7 hosts per DC for a few more days
  • 8 hosts for a few hours (we are going to be pretty confident here)
  • Everything else, aka 100%
  • Declare it done. Cleanups tracked at T359387

Details

Repo                          Branch      Lines +/-
operations/puppet             production  +1 -1
operations/puppet             production  +15 -0
operations/deployment-charts  master      +2 -2
operations/puppet             production  +14 -0
operations/puppet             production  +8 -0
operations/deployment-charts  master      +2 -2
operations/puppet             production  +4 -0
operations/puppet             production  +2 -0
operations/mediawiki-config   master      +3 -0
operations/puppet             production  +10 -0
operations/puppet             production  +0 -49
operations/deployment-charts  master      +6 -0
operations/puppet             production  +49 -0
operations/puppet             production  +2 -2
operations/dns                master      +7 -1
operations/puppet             production  +1 -1
operations/puppet             production  +1 -0
operations/puppet             production  +36 -0
operations/puppet             production  +1 -0
operations/puppet             production  +6 -0
operations/deployment-charts  master      +100 -0
operations/deployment-charts  master      +24 -0
operations/puppet             production  +4 -0

Event Timeline


Seems fine, most parsoid traffic is probably happening on the main cluster these days anyway.
We just want to make sure that scandium doesn't get trampled on, since we still use this for pre-deployment testing.

Also worth keeping in mind: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/965608 -- in order to avoid inadvertently getting external non-WMF clients for the internal Parsoid REST API, we currently expose that API only in the Parsoid cluster. If/when we move away from the Parsoid cluster we'll need to use some other mechanism to protect those endpoints from external use.

Change 1004138 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/dns@master] Add mw-parsoid

https://gerrit.wikimedia.org/r/1004138

Change 1004149 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] deploy: Add mw-parsoid namespace stanzas

https://gerrit.wikimedia.org/r/1004149

Change 1004150 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] mw-parsoid: Have deployments happening

https://gerrit.wikimedia.org/r/1004150

Change 1004151 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] conftool: Add mw-parsoid stanzas

https://gerrit.wikimedia.org/r/1004151

Change 1004152 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] service::catalog: Add mw-parsoid service

https://gerrit.wikimedia.org/r/1004152

Change 1004153 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] mw-parsoid: Add LVS backends on wikikube servers

https://gerrit.wikimedia.org/r/1004153

Change 1004154 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] mw-parsoid: Switch to lvs_setup

https://gerrit.wikimedia.org/r/1004154

Change 1004155 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] mw-parsoid: Switch to production and have it page

https://gerrit.wikimedia.org/r/1004155

Change 1004157 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] mw-parsoid: Introduce it

https://gerrit.wikimedia.org/r/1004157

Change 1004739 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] admin_ng: mw-parsoid stanzas

https://gerrit.wikimedia.org/r/1004739

Change 1004149 merged by Alexandros Kosiaris:

[operations/puppet@production] deploy: Add mw-parsoid namespace stanzas

https://gerrit.wikimedia.org/r/1004149

Change 1004739 merged by jenkins-bot:

[operations/deployment-charts@master] admin_ng: mw-parsoid stanzas

https://gerrit.wikimedia.org/r/1004739

Change 1004157 merged by jenkins-bot:

[operations/deployment-charts@master] mw-parsoid: Introduce it

https://gerrit.wikimedia.org/r/1004157

Change 1004150 merged by Alexandros Kosiaris:

[operations/puppet@production] mw-parsoid: Have deployments happening

https://gerrit.wikimedia.org/r/1004150

Change 1004151 merged by Alexandros Kosiaris:

[operations/puppet@production] conftool: Add mw-parsoid stanzas

https://gerrit.wikimedia.org/r/1004151

Change 1004152 merged by Alexandros Kosiaris:

[operations/puppet@production] service::catalog: Add mw-parsoid service

https://gerrit.wikimedia.org/r/1004152

Change 1004153 merged by Alexandros Kosiaris:

[operations/puppet@production] mw-parsoid: Add LVS backends on wikikube servers

https://gerrit.wikimedia.org/r/1004153

Change 1004154 merged by Alexandros Kosiaris:

[operations/puppet@production] mw-parsoid: Switch to lvs_setup

https://gerrit.wikimedia.org/r/1004154

Mentioned in SAL (#wikimedia-operations) [2024-02-21T12:01:13Z] <akosiaris> restart pybal on lvs1020 to pickup mw-parsoid service. T357392

Mentioned in SAL (#wikimedia-operations) [2024-02-21T12:02:22Z] <akosiaris> restart pybal on lvs2014 to pickup mw-parsoid service. T357392

Mentioned in SAL (#wikimedia-operations) [2024-02-21T12:10:19Z] <akosiaris> restart pybal on lvs2013, lvs1019 to pickup mw-parsoid service. T357392

Change 1004138 merged by Alexandros Kosiaris:

[operations/dns@master] Add mw-parsoid

https://gerrit.wikimedia.org/r/1004138

Change 1004155 merged by Alexandros Kosiaris:

[operations/puppet@production] mw-parsoid: Switch to production and have it page

https://gerrit.wikimedia.org/r/1004155

akosiaris changed the task status from Open to In Progress. Feb 21 2024, 2:50 PM
akosiaris triaged this task as Medium priority.
akosiaris moved this task from Incoming 🐫 to Doing 😎 on the serviceops board.

Change 1005723 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/mediawiki-config@master] ClusterConfig: Add kube-wiki-parsoid test

https://gerrit.wikimedia.org/r/1005723

> Seems fine, most parsoid traffic is probably happening on the main cluster these days anyway.
> We just want to make sure that scandium doesn't get trampled on, since we still use this for pre-deployment testing.

We aren't touching scandium at all for this, so I think we are safe on that front.

> Also worth keeping in mind: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/965608 -- in order to avoid inadvertently getting external non-WMF clients for the internal Parsoid REST API, we currently expose that API only in the Parsoid cluster. If/when we move away from the Parsoid cluster we'll need to use some other mechanism to protect those endpoints from external use.

Good point. I've posted https://gerrit.wikimedia.org/r/1005723 to test for this, and it already passes. For this deployment, SERVERGROUP is kube-mw-parsoid, which matches both the parsoid and k8s traits.

I've also used curl -s -H "Host: en.wikipedia.org" https://mw-parsoid.discovery.wmnet:4452/wiki/Special:Version to check whether the Parsoid extension is installed and, as expected, it is.
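
For repeat checks, the same request can be scripted. The following is a minimal sketch, not the check that was actually run: the function names are hypothetical, and the substring test assumes that Special:Version mentions "Parsoid" somewhere in its HTML only when the extension is installed. The endpoint and Host header are the ones from the curl command above, and the endpoint is reachable only from inside the production network.

```python
import urllib.request

# Endpoint taken from the curl command above; internal-only.
MW_PARSOID = "https://mw-parsoid.discovery.wmnet:4452"


def build_version_request(wiki_host, endpoint=MW_PARSOID):
    """Build a Special:Version request that selects the wiki via the Host header."""
    url = f"{endpoint}/wiki/Special:Version"
    return urllib.request.Request(url, headers={"Host": wiki_host})


def mentions_parsoid(html):
    """Crude check (assumption): the page text contains the word 'Parsoid'."""
    return "parsoid" in html.lower()


def parsoid_installed(wiki_host="en.wikipedia.org", timeout=10.0):
    """Fetch Special:Version and report whether it mentions Parsoid."""
    request = build_version_request(wiki_host)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return mentions_parsoid(body)
```

Calling parsoid_installed() from a production host would perform the same lookup as the curl command; the two helper functions can be exercised without network access.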

Change 1005730 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] conftool: Add some kubernetes hosts to parsoid

https://gerrit.wikimedia.org/r/1005730

Change 1005730 merged by Alexandros Kosiaris:

[operations/puppet@production] conftool: Add some kubernetes hosts to parsoid

https://gerrit.wikimedia.org/r/1005730

Change 1005770 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] mw-parsoid: Add parsoid.discovery.wmnet in cert SANs

https://gerrit.wikimedia.org/r/1005770

Change 1005770 merged by jenkins-bot:

[operations/deployment-charts@master] mw-parsoid: Add parsoid.discovery.wmnet in cert SANs

https://gerrit.wikimedia.org/r/1005770

akosiaris renamed this task from Create parsoid mediawiki deployment to Create parsoid mediawiki deployment and migrate parsoid-php.discovery.wmnet traffic to it. Feb 22 2024, 3:09 PM
akosiaris updated the task description.

Mentioned in SAL (#wikimedia-operations) [2024-02-22T15:12:45Z] <akosiaris> Bump weight of old parsoid hosts from 10 to 110. This is a noop right now but will make calculations later spelled out in T357392 possible.

Mentioned in SAL (#wikimedia-operations) [2024-02-22T15:15:11Z] <akosiaris> T357392 pool 46 kubernetes hosts of parsoid-php with a weight of 1. Since the 42 parse hosts are at weight 110, that means 1% goes to mw-parsoid deployment, aka mw-on-k8s
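
The arithmetic behind the two SAL entries above can be spelled out. This is a sketch under one assumption: that the load balancer hands out requests in proportion to each pooled host's weight. The function name is made up for illustration. It also shows why the old hosts were bumped from 10 to 110 first: with the Kubernetes hosts at weight 1, the original weight of 10 would have sent roughly 10% of traffic to mw-parsoid rather than 1%.

```python
def k8s_traffic_share(k8s_hosts, k8s_weight, legacy_hosts, legacy_weight):
    """Fraction of traffic reaching the Kubernetes hosts.

    Assumes traffic is split proportionally to the sum of pooled weights.
    """
    k8s_total = k8s_hosts * k8s_weight
    legacy_total = legacy_hosts * legacy_weight
    return k8s_total / (k8s_total + legacy_total)


# 46 Kubernetes hosts at weight 1 against 42 parse hosts at the old weight of 10
print(f"{k8s_traffic_share(46, 1, 42, 10):.2%}")   # 9.87%

# Same pool after bumping the parse hosts to weight 110
print(f"{k8s_traffic_share(46, 1, 42, 110):.2%}")  # 0.99%
```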

Change 1006890 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Revert "conftool: Add some kubernetes hosts to parsoid"

https://gerrit.wikimedia.org/r/1006890

Change 1006892 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] services_proxy: Add mw-parsoid in the mesh

https://gerrit.wikimedia.org/r/1006892

Change 1006893 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Switch restbase1019, restbase2021 to mw-parsoid

https://gerrit.wikimedia.org/r/1006893

Change 1006894 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Switch restbase102[01], restbase202[23] to mw-parsoid

https://gerrit.wikimedia.org/r/1006894

Change 1006895 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Switch restbase102[2345], restbase202[4567] to mw-parsoid

https://gerrit.wikimedia.org/r/1006895

Change 1006896 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Switch restbase102[6789], restbase103[0123], restbase202[89], restbase203[01234] to mw-parsoid

https://gerrit.wikimedia.org/r/1006896
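
The bracketed host names in the change titles above read as a character-class shorthand, so restbase102[2345] stands for restbase1022 through restbase1025. A small sketch for expanding a batch into individual host names follows; expand_hosts is a hypothetical helper rather than anything from puppet or conftool, and it assumes at most one bracket pair holding single-digit alternatives.

```python
import re


def expand_hosts(pattern):
    """Expand e.g. 'restbase102[2345]' into one host name per bracketed digit."""
    match = re.fullmatch(r"(.*)\[(\d+)\](.*)", pattern)
    if match is None:
        # No bracket group: the pattern is already a single host name.
        return [pattern]
    prefix, digits, suffix = match.groups()
    return [f"{prefix}{digit}{suffix}" for digit in digits]


def expand_batch(title_fragment):
    """Expand a comma-separated list of patterns, as used in the change titles."""
    hosts = []
    for pattern in title_fragment.split(","):
        hosts.extend(expand_hosts(pattern.strip()))
    return hosts


batch = expand_batch(
    "restbase102[6789], restbase103[0123], restbase202[89], restbase203[01234]"
)
print(len(batch))  # 15
```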

Change 1006897 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Switch the remaining parsoid hosts to mw-parsoid

https://gerrit.wikimedia.org/r/1006897

Change 1006898 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] restbase: Switch the default to mw-parsoid

https://gerrit.wikimedia.org/r/1006898

Change 1006899 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] Clean up all the RESTBase hosts's parsoid uri changes

https://gerrit.wikimedia.org/r/1006899

Change 1006900 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/puppet@production] services_proxy: Remove parsoid-php, parsoid-async

https://gerrit.wikimedia.org/r/1006900

The LVS traffic approach was doomed to fail, since scap uses the same data structure to figure out which hosts to deploy to. I've re-run the numbers on services_proxy and the parsoid cluster to make sure I'm not missing anything, and it appears that the only direct clients are indeed RESTBase and monitoring/healthchecks. So the services_proxy approach should work fine. I've updated the plan in the task and I'll start executing it.

Change 1006890 merged by Alexandros Kosiaris:

[operations/puppet@production] Revert "conftool: Add some kubernetes hosts to parsoid"

https://gerrit.wikimedia.org/r/1006890

Change 1006892 merged by Alexandros Kosiaris:

[operations/puppet@production] services_proxy: Add mw-parsoid in the mesh

https://gerrit.wikimedia.org/r/1006892

Change 1005723 merged by jenkins-bot:

[operations/mediawiki-config@master] ClusterConfig: Add kube-wiki-parsoid test

https://gerrit.wikimedia.org/r/1005723

Change 1006893 merged by Alexandros Kosiaris:

[operations/puppet@production] Switch restbase1019, restbase2021 to mw-parsoid

https://gerrit.wikimedia.org/r/1006893

Migration started; we are on batch 1 for the next few days.

Change 1006894 merged by Alexandros Kosiaris:

[operations/puppet@production] Switch restbase102[01], restbase202[23] to mw-parsoid

https://gerrit.wikimedia.org/r/1006894

Change 1008828 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] mw-parsoid: Bump replicas

https://gerrit.wikimedia.org/r/1008828

Change 1008828 merged by jenkins-bot:

[operations/deployment-charts@master] mw-parsoid: Bump replicas

https://gerrit.wikimedia.org/r/1008828

Change 1006895 merged by Alexandros Kosiaris:

[operations/puppet@production] Switch restbase102[2345], restbase202[4567] to mw-parsoid

https://gerrit.wikimedia.org/r/1006895

Change 1006897 merged by Alexandros Kosiaris:

[operations/puppet@production] Switch more eqiad parsoid hosts to mw-parsoid

https://gerrit.wikimedia.org/r/1006897

Change 1009227 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/deployment-charts@master] mw-parsoid: replicas x2 for hopefully the last time

https://gerrit.wikimedia.org/r/1009227

Change 1009227 merged by jenkins-bot:

[operations/deployment-charts@master] mw-parsoid: replicas x2 for hopefully the last time

https://gerrit.wikimedia.org/r/1009227

Change 1006896 merged by Alexandros Kosiaris:

[operations/puppet@production] Switch restbase1026-1033, restbase20289-2034 to mw-parsoid

https://gerrit.wikimedia.org/r/1006896

Change 1006898 merged by Alexandros Kosiaris:

[operations/puppet@production] restbase: Switch the default to mw-parsoid

https://gerrit.wikimedia.org/r/1006898