Deploy wikidiff2 1.14.1
Closed, Resolved (Public)

Description

I have tagged a new release of wikidiff2, made a tarball, and uploaded it to releases.wikimedia.org.

Would serviceops please assist in building a Debian package and deploying it to Wikimedia servers?

No user-visible changes are expected. The most likely failure mode is an increase in the number of segfaults from php-fpm. The package should be deployed to a small subset of MW appservers and monitored for segfaults before deploying it to the remaining servers. If there are any segfaults, it would be useful if a core dump could be collected before the deployment is reverted.
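
For the monitoring step, a minimal sketch of counting segfault lines per process and library could look like this. It assumes the usual Linux kernel log format for userspace segfaults. The sample lines, PIDs and addresses are made up for illustration, and `count_segfaults` is a hypothetical helper, not part of any existing deployment tooling.

```python
import re
from collections import Counter

# Assumed kernel log shape (adjust the pattern to the real log source):
#   php-fpm7.4[4242]: segfault at 10 ip ... sp ... error 4 in wikidiff2.so[...]
SEGFAULT_RE = re.compile(
    r"(?P<proc>[\w.\-]+)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+"
    r"(?:.* in (?P<lib>[^\[\s]+))?"
)


def count_segfaults(lines):
    """Count segfault log lines, keyed by (process name, faulting library).

    The library is None when the log line does not name one.
    """
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            counts[(match.group("proc"), match.group("lib"))] += 1
    return counts


# Illustrative (fabricated) log lines, not taken from any real host.
SAMPLE_LINES = [
    "kernel: [12345.678] php-fpm7.4[4242]: segfault at 10 ip 00007f1c2a3b4c5d "
    "sp 00007ffd12345678 error 4 in wikidiff2.so[7f1c2a3b0000+20000]",
    "php-fpm7.4[4243]: segfault at 0 ip 0000000000000000 "
    "sp 00007ffd00000000 error 14",
    "systemd[1]: Started The PHP 7.4 FastCGI Process Manager.",
]

segfault_counts = count_segfaults(SAMPLE_LINES)
for (proc, lib), n in sorted(segfault_counts.items(), key=str):
    print(f"{proc} in {lib or 'unknown'}: {n}")
```

Comparing these counts on the canaries before and after the upgrade would show whether the new package increases the segfault rate.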

Event Timeline

I've built a package and for now copied it to https://people.wikimedia.org/~jmm/wikidiff/

If it passes basic functional tests (maybe in deployment-prep?), the next step could be to roll this out to the canaries.

One note: https://github.com/wikimedia/mediawiki-php-wikidiff2/commit/c11002017454cec638517a26af0032202a0c0868 mentions the removal of wikidiff2.ini. It's in fact a required file for dh-php, the Debhelper build integration, so I've re-added it as a Debian-specific patch under debian/patches. I have no idea whether other distros also need or use it.

Right, I see. Mark added it in c016fb0dacf250a7049835ce321eced70d3a9d6b , at which time we had a debian/ directory in the master branch. The debian/ directory was deleted in master in 47372f83285460b52f74433d7997a454ba3c1a66 by @Legoktm, and I assumed wikidiff2.ini was accidentally left there. It's more obvious what it's for and what it does if it's a Debian-specific patch.

Remi's RPM .spec creates its own ini file: https://git.remirepo.net/cgit/rpms/php/pecl/php-pecl-excimer.git/tree/php-pecl-excimer.spec?id=1767d7b88549a2f261d2590c8999801516649b08#n77 so yeah, it's effectively Debian-specific.

Here's how @taavi pretty trivially updated the Debian packaging: https://salsa.debian.org/mediawiki-team/wikidiff2/-/commit/d5b78c7b51abac02beeb334374cb97aeed359086

Of course, I've copied this over to all the other PHP extension repos, so let me submit patches removing it everywhere else...

TheresNoTime renamed this task from Deploy wikidiff2 1.14.0 to Deploy wikidiff2 1.14.1.Jul 10 2023, 2:10 PM

JFTR, since I'm away for two weeks: When the tests are complete, the 1.14.1 packages can be imported in reprepro from /home/jmm/import on apt1001.wikimedia.org:

reprepro -C component/php74 include buster-wikimedia wikidiff2_1.14.1-0+wmf1+buster1_amd64.changes

TheresNoTime changed the task status from Open to Stalled.Jul 12 2023, 11:22 AM

Thank you :-) we're fairly confident in the state of 1.14.1 on beta, but we're just waiting on the go-ahead from our QA — will update here once we're ready!

TheresNoTime changed the task status from Stalled to Open.Jul 12 2023, 11:42 AM

QA OK'd, we can proceed with the next steps of deployment (afaik, this is deploying 1.14.1 to some canary servers?)

akosiaris subscribed.

Looking quickly at mw-canaries and mwdebug, they all have 1.13.0-1+wmf1+buster1

akosiaris@cumin1001:~$ sudo cumin 'A:mw-canary' 'apt policy php7.4-wikidiff2'
21 hosts will be targeted:
mw[2271-2272,2374,2376].codfw.wmnet,mw[1414-1418,1447-1450].eqiad.wmnet,mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet,parse[2001-2002].codfw.wmnet,parse[1001,1003].eqiad.wmnet
OK to proceed on 21 hosts? Enter the number of affected hosts to confirm or "q" to quit: 21
===== NODE GROUP =====
(21) mw[2271-2272,2374,2376].codfw.wmnet,mw[1414-1418,1447-1450].eqiad.wmnet,mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet,parse[2001-2002].codfw.wmnet,parse[1001,1003].eqiad.wmnet
----- OUTPUT of 'apt policy php7.4-wikidiff2' -----

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

php7.4-wikidiff2:
  Installed: 1.13.0-1+wmf1+buster1
  Candidate: 1.14.1-0+wmf1+buster1
  Version table:
     1.14.1-0+wmf1+buster1 1001
       1001 http://apt.wikimedia.org/wikimedia buster-wikimedia/component/php74 amd64 Packages
 *** 1.13.0-1+wmf1+buster1 100
        100 /var/lib/dpkg/status
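
The check above can also be done programmatically. As a rough sketch, the following pulls the Installed and Candidate versions out of `apt policy` output like the block above. The function names are hypothetical, and since apt itself warns that its CLI output is not stable, this is only suitable for ad-hoc checks.

```python
import re

# Output copied from the `apt policy php7.4-wikidiff2` run above.
SAMPLE_POLICY = """\
php7.4-wikidiff2:
  Installed: 1.13.0-1+wmf1+buster1
  Candidate: 1.14.1-0+wmf1+buster1
  Version table:
     1.14.1-0+wmf1+buster1 1001
       1001 http://apt.wikimedia.org/wikimedia buster-wikimedia/component/php74 amd64 Packages
 *** 1.13.0-1+wmf1+buster1 100
        100 /var/lib/dpkg/status
"""


def parse_apt_policy(output):
    """Return {'installed': ..., 'candidate': ...} from `apt policy` output."""
    versions = {}
    for line in output.splitlines():
        match = re.match(r"\s*(Installed|Candidate):\s*(\S+)", line)
        if match:
            versions[match.group(1).lower()] = match.group(2)
    return versions


def needs_upgrade(output):
    """True when the installed version differs from the candidate version."""
    versions = parse_apt_policy(output)
    return versions.get("installed") != versions.get("candidate")


policy = parse_apt_policy(SAMPLE_POLICY)
print(policy["installed"], "->", policy["candidate"])
print("upgrade pending:", needs_upgrade(SAMPLE_POLICY))
```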

I've crafted the debdeploy spec and ...

sudo debdeploy deploy -u 2023-07-12-wikidiff2.yaml -s mw-canary
Rolling out wikidiff2:
Library update, several services might need to be restarted

php7.4-wikidiff2 was updated: 1.13.0-1+wmf1+buster1 -> 1.14.1-0+wmf1+buster1
  mw[2271-2272,2374,2376].codfw.wmnet,mw[1414-1418,1447-1450].eqiad.wmnet,mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet,parse[2001-2002].codfw.wmnet,parse[1001,1003].eqiad.wmnet (21 hosts)
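
The host expression in the output above uses a bracketed range syntax. As an illustration only, here is a small sketch that expands it into individual hostnames and confirms the count of 21. It handles just the simple subset seen in this task (one bracket group per name, numeric ranges); the helper names are hypothetical, and a dedicated library such as ClusterShell's NodeSet would be the more robust choice in practice.

```python
import re


def split_top_level(expr):
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, []
    for ch in expr:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def expand_group(group):
    """Expand e.g. 'mw[1414-1416,1447].eqiad.wmnet' into hostnames."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\]]+)\]([^\[\]]*)", group)
    if not match:
        return [group]  # plain hostname, nothing to expand
    prefix, ranges, suffix = match.groups()
    hosts = []
    for item in ranges.split(","):
        start, _, end = item.partition("-")
        end = end or start
        width = len(start)  # preserve zero padding, if any
        for n in range(int(start), int(end) + 1):
            hosts.append(f"{prefix}{n:0{width}d}{suffix}")
    return hosts


def expand_hosts(expr):
    return [h for group in split_top_level(expr) for h in expand_group(group)]


# Host expression copied from the cumin/debdeploy output above.
EXPR = (
    "mw[2271-2272,2374,2376].codfw.wmnet,mw[1414-1418,1447-1450].eqiad.wmnet,"
    "mwdebug[2001-2002].codfw.wmnet,mwdebug[1001-1002].eqiad.wmnet,"
    "parse[2001-2002].codfw.wmnet,parse[1001,1003].eqiad.wmnet"
)

hosts = expand_hosts(EXPR)
print(len(hosts))
```

Note that parse1002.eqiad.wmnet is not part of this expression, since the parse group lists only 1001 and 1003.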

I've checked: php-fpm has also been restarted and should be using the new version of wikidiff2. Let's let it sit for a few days now.

@TheresNoTime let me know when we should proceed with the next step of the deployment, which should be across the whole fleet, unless you have a different idea.

Mentioned in SAL (#wikimedia-operations) [2023-07-12T12:30:21Z] <akosiaris> upgrade wikidiff2 1.13.0-1+wmf1+buster1 -> 1.14.1-0+wmf1+buster1 on mw-canary hosts T340087

Mentioned in SAL (#wikimedia-operations) [2023-07-12T12:52:26Z] <moritzm> imported wikidiff2 1.14.1-0+wmf1+buster1+icu67u1 to component/icu67 T340087 T329491

JFTR: I've also rebuilt/uploaded wikidiff2 1.14.1 for component/icu67 (so that we don't regress when the ICU67 migration starts)

Ah, perfect, thanks!

Just to check, does the deployment to the whole fleet require a deployment window? It's sat on the canaries with no issues for about a week now, so we could probably start thinking about that full deployment..

We never did that in the past as far as I can remember. It won't hurt of course if we do.

+1 on the full deployment. What would be your preferred date?

This week would be ideal, any date/time that works for y'all :-)

Mentioned in SAL (#wikimedia-operations) [2023-07-25T11:24:17Z] <akosiaris> T340087 starting wikidiff2 1.41.1 rollout to codfw

Mentioned in SAL (#wikimedia-operations) [2023-07-25T11:25:08Z] <akosiaris> T340087 keep a copy php-wikidiff2_1.13.0-1_amd64.deb in apt1001:/home/akosiaris/wd/ in case of emergency

Mentioned in SAL (#wikimedia-operations) [2023-07-25T11:29:59Z] <akosiaris> T340087 starting wikidiff2 1.41.1 rollout to eqiad. codfw already done.

Mentioned in SAL (#wikimedia-operations) [2023-07-25T11:32:17Z] <akosiaris> T340087 wikidiff2 rollout done. 1 host is unreachable and will need to be reimaged or upgraded manually to pick this up, parse1002.eqiad.wmnet

Clement_Goubert subscribed.

parse1002 is currently broken, see T339340: hw troubleshooting: CPU machine check failure for parse1002.eqiad.wmnet

Change 941761 had a related patch set uploaded (by Alexandros Kosiaris; author: Alexandros Kosiaris):

[operations/docker-images/production-images@master] Rebuild for T340087, aka wikidiff2 1.14.1 deployment

https://gerrit.wikimedia.org/r/941761

Change 941761 merged by Alexandros Kosiaris:

[operations/docker-images/production-images@master] Rebuild for T340087, aka wikidiff2 1.14.1 deployment

https://gerrit.wikimedia.org/r/941761

php7.4-fpm-multiversion-base has been rebuilt as well and should make it out to mw-on-k8s in the next deployments. I think we can resolve this now. Feel free to reopen.

There were some hosts still on 1.13 (cloudweb, mwmaint, deployment servers, scandium, snapshot), plus parse1002 (which was down during the initial deployment). I've upgraded them as well now.
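
Catching leftover hosts like these amounts to comparing each host's installed version against the target. A minimal sketch, assuming the installed versions have already been collected into a dict (for example from `apt policy` output); the hostnames below are placeholders, not real hosts, and `find_stragglers` is a hypothetical helper.

```python
TARGET_VERSION = "1.14.1-0+wmf1+buster1"

# Placeholder inventory: hostnames are hypothetical, versions are the two
# package versions mentioned in this task.
installed_versions = {
    "host-a.example": "1.14.1-0+wmf1+buster1",
    "host-b.example": "1.13.0-1+wmf1+buster1",
    "host-c.example": "1.14.1-0+wmf1+buster1",
    "host-d.example": "1.13.0-1+wmf1+buster1",
}


def find_stragglers(installed, target):
    """Return the sorted list of hosts whose installed version is not the target."""
    return sorted(host for host, version in installed.items() if version != target)


stragglers = find_stragglers(installed_versions, TARGET_VERSION)
print(stragglers)
```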

Thanks for catching it, I completely forgot to update wikidiff on parse1002 when putting it back in the pool.