Define a mediawiki "version"
Open, Medium, Public

Description

Currently we have no way of knowing whether a MediaWiki server is running the code it should. Moreover, if a server is unavailable during a scap run, that server will need a manual deployment. What makes this problem so complex is that when we roll out our changes (the weekly MediaWiki train), we pull from ~190 different repositories. It becomes even more complex once we take security hotfixes into account.

The main goal is to find ways to know, at any given time, which servers are on version x and which are on version x+1. Useful information:

  • On each mw server, scap creates a directory called /srv/mediawiki/php/cache/gitinfo where it holds information from every repo we pull from.

Event Timeline

jijiki triaged this task as Medium priority. Mar 15 2019, 4:48 PM
jijiki created this task.
Restricted Application added a subscriber: Aklapper. Mar 15 2019, 4:48 PM
Dzahn added a subscriber: Dzahn. Mar 15 2019, 4:48 PM
Dzahn added a comment. Mar 15 2019, 4:58 PM

How about this:

We calculate the sha1 sum of each file in /srv/mediawiki/php/cache/gitinfo and then the sha1 sum of the sum of these, like so:

sha1sum /srv/mediawiki/php/cache/gitinfo/* | sha1sum

I tried this on mwdebug* via cumin and got:

[cumin1001:~] $ sudo cumin 'mwdebug*' 'sha1sum /srv/mediawiki/php/cache/gitinfo/* | sha1sum'
4 hosts will be targeted:
mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(3) mwdebug[2001-2002].codfw.wmnet,mwdebug1001.eqiad.wmnet
----- OUTPUT of 'sha1sum /srv/med...info/* | sha1sum' -----
4081e6c73bbddae3e21824e99007c9b219890b66  -
===== NODE GROUP =====
(1) mwdebug1002.eqiad.wmnet
----- OUTPUT of 'sha1sum /srv/med...info/* | sha1sum' -----
25df3a0f0fb9395a30ee3eb6c596641da63270bf  -

From this we see that mwdebug1001, 2001 and 2002 are at identical versions while 1002 is different.

We could run this on mw* next and see.
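
For reference, the hash-of-hashes idea can be wrapped in a small function. This is only a sketch: dir_fingerprint is a hypothetical helper name, not something scap provides. Unlike the one-liner above, it changes into the directory first, so the file names that go into the hash are relative and the result will not equal the values cumin printed. It also pins the locale so the glob is meant to expand in the same order on every host.

```shell
#!/bin/bash
# Sketch only: dir_fingerprint is a hypothetical helper, not part of scap.
# Prints one SHA-1 fingerprint for all files in the given directory.
dir_fingerprint() (
    # The function body is a subshell, so cd and LC_ALL do not leak out.
    cd "$1" || exit 1
    # Pin the locale so the glob below sorts the same way everywhere.
    LC_ALL=C
    export LC_ALL
    # Hash every file, then hash the resulting "sum  name" list.
    sha1sum -- * | sha1sum | awk '{print $1}'
)
```

Calling dir_fingerprint /srv/mediawiki/php/cache/gitinfo on each host would give one value per host to compare. Any change to any file, or to the set of file names, yields a different fingerprint.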

Joe moved this task from Backlog to Externally Blocked on the serviceops board.

It would be nice if the version information could be broken down in some way so that a small change to one file didn't radically alter the version. A hash of all file contents is unfortunately very chaotic and hard to reason about. I'm thinking of something like what you get from git describe.

Dzahn added a comment. Feb 10 2020, 8:03 PM

The problem I see with that is: how do you define what a "small" change is? It feels like when we call things "trivial" but then there are unexpected changes anyway. The very nature of them being unexpected means they would all be called small or trivial before the fact.

I think it would actually be desirable that 2 different versions of code can't be called the same MediaWiki version, no matter how small the change. Different is different. Making it a matter of "use good judgement" seems to negate part of the point of having a definitive version.

Kind of have a hard time imagining a situation where we actually want one or multiple appservers to serve something that is "just a little" different from all others but not completely. Not sure I understand why it's considered chaotic to have a single hash. Isn't it more chaotic to have subtle differences?

How about this:

We calculate the sha1 sum of each file in /srv/mediawiki/php/cache/gitinfo and then the sha1 sum of the sum of these, like so:

The gitinfo cache is calculated on each sync and is publicly disclosed, so it doesn't accurately represent the currently deployed state when there are security patches or uncommitted changes. For example, when we roll forward *only* testwiki on Tuesday, that's usually a local-only change.

It would be nice if the version information could be broken down in some way so that a small change to one file didn't radically alter the version. A hash of all file contents is unfortunately very chaotic and hard to reason about. I'm thinking of something like what you get from git describe.

Seems like wikiversions could probably act as a marker (like a tag in git describe), e.g., wmf.19-wmf.18-[checksum]

Here's a proposal:

  1. Find all wikiversions in use.
  2. Join them with '_'.
  3. Checksum everything under /srv/mediawiki-staging.
  4. Output that "version" to /srv/mediawiki-staging/VERSION before sync.
#!/bin/bash

WV=/srv/mediawiki-staging/wikiversions

# Use the labs wikiversions file if we're in labs
if [[ $(cat /etc/wikimedia-cluster) == 'labs' ]]; then
        WV="${WV}-labs"
fi

# Find all unique wikiversions in wikiversions(-labs).json.
# jq -r prints one value per line, so tr leaves a '_' after every version,
# including the last one; that trailing '_' separates versions from checksum.
VERSION=$(jq -r '. | to_entries | unique_by(.value)[] | .value' "${WV}.json" | tr '\n' '_')

# Every directory under /srv/mediawiki-staging should be in a git repo.
# Not all of the repos are submodules of one another, so the top-level SHA1
# isn't enough information; we need to find all .git directories.
# There are other .git files here which hold the submodule state and should be
# represented by the SHA1 of the parent repo.
CHECKSUM=$(while read -r gitdir; do
        dir="$(dirname "$gitdir")"

        # Get the SHA1 of HEAD + the diff for uncommitted changes.
        # Both substitutions are quoted so each one fills exactly one %s;
        # %s (not %b) keeps backslashes in the diff output literal.
        printf '%s\n%s\n' "$(git -C "$dir" rev-parse --verify HEAD)" "$(git -C "$dir" diff --raw --unified=0)"
done < <(find /srv/mediawiki-staging/ -type d -name '*.git') | shasum -a 256 | awk '{print $1}')

# All the wikiversions in use + the first 8 digits of the sha256 of the repo states
printf '%s%s\n' "$VERSION" "${CHECKSUM:0:8}"

Output on beta is like: php-master_c8b58da0.

The idea would be to put something like the above into scap as a step before a sync, so that we have a version number that changes for every deployment. Checking that a server is up to date would then be [ -z "$(diff <(curl deployment.host/mediawiki/mediawiki/VERSION) /srv/mediawiki/VERSION)" ], similar to @hnowlan's change to check wikiversions.json (https://gerrit.wikimedia.org/r/566708/).
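
One caveat with the diff one-liner: diff also prints nothing on stdout when it fails outright (for example, when the local VERSION file is missing), so the -z test would pass. A slightly more defensive sketch is below. version_matches is a hypothetical helper name, and the three return codes are my own convention rather than anything scap defines.

```shell
#!/bin/bash
# Sketch only: version_matches is a hypothetical helper, not part of scap.
# Usage: version_matches EXPECTED_STRING LOCAL_VERSION_FILE
# Returns 0 on match, 1 on mismatch (or empty EXPECTED), 2 if the file is unreadable.
version_matches() (
    expected="$1"
    local_file="$2"
    # A missing or unreadable file is reported separately from a mismatch.
    [ -r "$local_file" ] || exit 2
    actual="$(cat "$local_file")"
    # An empty expected value (e.g. a failed download) never counts as a match.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
)

# Possible use on an appserver, with the placeholder URL from the comment above:
#   expected="$(curl -fsS deployment.host/mediawiki/mediawiki/VERSION)"
#   version_matches "$expected" /srv/mediawiki/VERSION || echo "out of date"
```

Passing the expected version as a string keeps the download step separate, so a failed curl shows up as an empty argument instead of being folded into the comparison.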

Thoughts?

Kind of have a hard time imagining a situation where we actually want one or multiple appservers to serve something that is "just a little" different from all others but not completely. Not sure I understand why it's considered chaotic to have a single hash. Isn't it more chaotic to have subtle differences?

I wasn't suggesting that we have subtly different versions in prod on purpose, just that it'd be nice if I could look at two versions and tell how different they are and what changed. I know that's asking for a lot especially if the version string is meant to be compact.