Most of us are used to (and rely on) testing MediaWiki itself, as well as the components surrounding it (e.g. memcached, envoy), using the mwdebug servers. We want to replicate this functionality when we move MediaWiki to Kubernetes.
Update: this service, mw-experimental, is live and ready for testing in eqiad only.
What?
Create a separate kubernetes service or services
Requirements
- have its own metrics and logging, where engineers can look when testing changes there
  - probably by using its own servergroup
- let engineers deploy experimental code easily (e.g. by directly editing or copying files)
- not alert on errors, etc.
- stay up to date with the MediaWiki images running in production
- route traffic through XWD (X-Wikimedia-Debug)
Proposal
Host /srv/mediawiki on predefined Kubernetes nodes, and have the MediaWiki pods running on those nodes mount it directly via hostPath. Users interested in testing or editing files manually can do so by simply ssh-ing to those hosts and editing files as usual. Using XWD, they can test their code by selecting the host they are working on.
For the sake of simplicity, let's call this service mw-experimental.
Kubernetes parts
Deployment and Service
mw-experimental will be a new deployment and a new service. Differences from mw-debug (and other mw-* deployments):
- hostPath: /srv/mediawiki will be mounted via hostPath, overriding what is in the images
- NodePort: This will be a NodePort service
- internalTrafficPolicy: Setting it to Local will ensure that the mw-experimental pod running on a host will be the one to serve any requests reaching that host
- Affinity and/or Tolerations: specific nodes will be allowed to host a mw-experimental pod
- Code Updates
  - No scap: the freshness of the MediaWiki code will depend on the latest mediawiki-multiversion image found on the host
  - Systemd timer: a timer (cron-style job) could take care of that and react (i.e. copy files from the image)
  - No need to make those hosts scap targets
- One Pod: Each eligible host will run exactly one mw-experimental pod
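The differences listed above can be sketched as a pair of manifests, built here as plain Python dicts so they can be dumped to JSON (which kubectl also accepts). This is only an illustration under stated assumptions: the node label, the taint key, the container name, the image reference and the replica count are all hypothetical, and the port reuses the 4888 example that appears later in this proposal. The anti-affinity rule is one possible way to get at most one pod per host with a Deployment; it is not confirmed as the chosen mechanism.

```python
import json

# All names below (node label, taint key, container name) are illustrative
# assumptions, not values taken from the real deployment.
APP = "mw-experimental"
NODE_LABEL = {"example.org/mw-experimental": "true"}  # hypothetical node label


def build_deployment(image: str, replicas: int) -> dict:
    """Deployment whose pods mount the host's /srv/mediawiki over the image copy."""
    pod_spec = {
        # Only nodes carrying the (hypothetical) label may host the pod.
        "nodeSelector": NODE_LABEL,
        # Tolerate a (hypothetical) taint keeping other workloads off these nodes.
        "tolerations": [{
            "key": "dedicated", "operator": "Equal",
            "value": APP, "effect": "NoSchedule",
        }],
        # At most one pod per host. With replicas equal to the number of
        # eligible nodes, this yields exactly one pod on each of them.
        "affinity": {"podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [{
                "labelSelector": {"matchLabels": {"app": APP}},
                "topologyKey": "kubernetes.io/hostname",
            }],
        }},
        "containers": [{
            "name": "mediawiki",
            "image": image,
            # The hostPath volume shadows /srv/mediawiki shipped in the image.
            "volumeMounts": [{"name": "srv-mediawiki", "mountPath": "/srv/mediawiki"}],
        }],
        "volumes": [{
            "name": "srv-mediawiki",
            "hostPath": {"path": "/srv/mediawiki", "type": "Directory"},
        }],
    }
    return {
        "apiVersion": "apps/v1", "kind": "Deployment",
        "metadata": {"name": APP},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": APP}},
            "template": {"metadata": {"labels": {"app": APP}}, "spec": pod_spec},
        },
    }


def build_service(port: int, node_port: int) -> dict:
    """NodePort Service that keeps traffic on the node that received it."""
    return {
        "apiVersion": "v1", "kind": "Service",
        "metadata": {"name": APP},
        "spec": {
            "type": "NodePort",
            "selector": {"app": APP},
            # As proposed in the text. For requests entering through the
            # NodePort from outside the cluster, Kubernetes consults
            # externalTrafficPolicy instead; whether that field also needs
            # to be set to Local here is an open assumption.
            "internalTrafficPolicy": "Local",
            "ports": [{"port": port, "targetPort": port, "nodePort": node_port}],
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference and replica count (2 VMs per DC).
    print(json.dumps(build_deployment("example/mediawiki-multiversion:latest", 2), indent=2))
    print(json.dumps(build_service(4888, 4888), indent=2))
```

Keeping the manifests as data makes it easy to see which fields differ from a regular mw-* deployment: the hostPath volume, the node restrictions, and the Service type and traffic policy.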
Application Configuration
- opcache.validate_timestamps: php-fpm should have it enabled, so that manual edits to files are picked up
- SERVERGROUP: distinct servergroup
- TBA?
Puppet/Infra
- VMs: As we have already done with kask, these nodes could be just 2 VMs per DC
- Puppet Profile: We can have a puppet profile where we define the snowflakey stuff, and use a hiera on/off switch
- Users/Deployers/Permissions: We could have a designated user group, and manage the permissions of /srv/mediawiki accordingly
- ATS/XWD
  - x-wikimedia-debug-routing: will be edited accordingly, forwarding, for example, to wikikube-worker1001.eqiad.wmnet:4888
  - We can go as far as creating a CNAME pointing to the hostnames of the k8s workers hosting mw-experimental, eventually making this fully transparent to users
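To make the routing idea concrete, here is a minimal sketch of how a backend chosen through XWD could be mapped to a worker's NodePort. It does not reproduce the actual ATS configuration. It assumes the X-Wikimedia-Debug header carries semicolon-separated attributes including `backend=<name>`, and the `ROUTES` table, the alias entry and the function names are hypothetical. Only the wikikube-worker1001.eqiad.wmnet:4888 target comes from the example above.

```python
from typing import Optional

# Hypothetical routing table: backend name chosen by the user -> host:port.
# The first target is the example from the text. The second entry is a
# made-up alias illustrating the CNAME idea.
ROUTES = {
    "wikikube-worker1001.eqiad.wmnet": "wikikube-worker1001.eqiad.wmnet:4888",
    "mw-experimental-a.example": "wikikube-worker1001.eqiad.wmnet:4888",
}


def parse_backend(header_value: str) -> Optional[str]:
    """Extract the backend attribute from a header like 'backend=host; log'."""
    for attr in header_value.split(";"):
        key, sep, value = attr.strip().partition("=")
        if key == "backend" and sep and value.strip():
            return value.strip()
    return None


def route(header_value: Optional[str]) -> Optional[str]:
    """Return the debug target for a request, or None for normal routing."""
    if not header_value:
        return None
    backend = parse_backend(header_value)
    if backend is None:
        return None
    # Unknown backends fall through to normal routing as well.
    return ROUTES.get(backend)
```

With an alias in the table, users would not need to know which worker currently hosts the pod, which is the transparency the CNAME suggestion is after.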
Other info
Wow there, that sounds like another snowflake!
Well, this is no more of a snowflake than the mwdebug servers used to be. It is also a snowflake that is OK to have down or not functional.