1.39.0-wmf.28 deployment blockers
Closed, Resolved | Public | 5 Estimated Story Points | Release

Details

Backup Train Conductor
dduvall
Release Version
1.39.0-wmf.28
Release Date
Sep 5 2022, 12:00 AM

2022 week 36 1.39-wmf.28 Changes wmf/1.39.0-wmf.28

This MediaWiki Train Deployment is scheduled for the week of Monday, September 5th:

  • Monday, September 5th: Backports only.
  • Tuesday, September 6th: Branch wmf.28 and deploy to Group 0 Wikis.
  • Wednesday, September 7th: Deploy wmf.28 to Group 1 Wikis.
  • Thursday, September 8th: Deploy wmf.28 to all Wikis.
  • Friday: No deployments on Fridays.

How this works

  • Any serious bugs affecting wmf.28 should be added as subtasks beneath this one.
  • Any open subtask(s) block the train from moving forward. This means no further deployments until the blockers are resolved.
  • If something is serious enough to warrant a rollback, bring it to the attention of deployers on the #wikimedia-operations IRC channel.
  • If you have a risky change in this week's train, add a comment to this task using the Risky patch template.
  • For more info about deployment blockers, see Holding the train.
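
The blocking rule above can be sketched in a few lines. This is only an illustration of the rule as stated (any open subtask halts the train); the `Subtask` type, its fields, and the task IDs are hypothetical and not part of any Phabricator or scap API.

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    """Hypothetical stand-in for a subtask filed beneath the train task."""
    task_id: str   # placeholder ID, not a real task number
    is_open: bool  # True while the blocker is unresolved


def open_blockers(subtasks):
    """Return the IDs of subtasks that are still open."""
    return [t.task_id for t in subtasks if t.is_open]


def train_can_proceed(subtasks):
    """The train may move forward only when no subtask is open."""
    return not open_blockers(subtasks)


# Example: one resolved subtask and one still open.
example = [Subtask("T-example-1", is_open=False),
           Subtask("T-example-2", is_open=True)]
print(train_can_proceed(example), open_blockers(example))
```

An empty list of subtasks counts as "no blockers", which matches the rule that only open subtasks hold the train.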

Related Links

Other Deployments

Previous: 1.39.0-wmf.27
Next: 1.40.0-wmf.1

Event Timeline

thcipriani triaged this task as Medium priority.
thcipriani updated Other Assignee, added: dduvall.
thcipriani set the point value for this task to 5.
Risky Patch! 🚂🔥
all of them
  • IRC contact: ebernhardson, dcausse
  • UBN Task Projects/tags: Discovery-Search
  • Would you like to backport this change rather than ride the train?: No
Jdlrobson subscribed.

Probably worth considering the above as a blocker since there's a fix up.

Change 829875 had a related patch set uploaded (by TrainBranchBot; author: trainbranchbot):

[mediawiki/core@wmf/1.39.0-wmf.28] Branch commit for wmf/1.39.0-wmf.28

https://gerrit.wikimedia.org/r/829875

Change 829875 merged by jenkins-bot:

[mediawiki/core@wmf/1.39.0-wmf.28] Branch commit for wmf/1.39.0-wmf.28

https://gerrit.wikimedia.org/r/829875

Change 829879 had a related patch set uploaded (by TrainBranchBot; author: MediaWiki PreSync):

[operations/mediawiki-config@master] testwikis wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/829879

Change 829879 merged by jenkins-bot:

[operations/mediawiki-config@master] testwikis wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/829879

Mentioned in SAL (#wikimedia-operations) [2022-09-06T03:02:32Z] <mwpresync@deploy1002> Started scap: testwikis wikis to 1.39.0-wmf.28 refs T314189

Mentioned in SAL (#wikimedia-operations) [2022-09-06T03:38:49Z] <mwpresync@deploy1002> Finished scap: testwikis wikis to 1.39.0-wmf.28 refs T314189 (duration: 36m 17s)

Change 830091 had a related patch set uploaded (by TrainBranchBot; author: Jaime Nuche):

[operations/mediawiki-config@master] group0 wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830091

Change 830091 merged by jenkins-bot:

[operations/mediawiki-config@master] group0 wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830091

Mentioned in SAL (#wikimedia-operations) [2022-09-06T08:09:25Z] <jnuche@deploy1002> rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.28 refs T314189

We are getting 1.3M/day warnings for bad use of table name in building queries in RC. https://logstash.wikimedia.org/goto/e0e96107196477c7c3c1ca5854ee585e

It's probably something introduced earlier, or turned on this week, but it needs fixing.

Change 830585 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] rdbms: Remove useless warning

https://gerrit.wikimedia.org/r/830585

I don't know if you'd consider T317187: GrowthExperiments Special:Homepage: investigate performance regression since September 6 2022 to be a deployment blocker. It would be helpful if other teams could say whether they're observing performance degradation for features they're monitoring, to understand if it's related to wmf.28 changes, or PHP 7.4 rollout, or multi-DC rollout work. The ElasticSearch patches (T314189#8200929) are responsible for part of the increased response times for Special:Homepage, as noted in T317187#8217380, and I don't think there's any way around that.

There is a general increase in every performance metric it seems: https://performance.wikimedia.org/#!/week

image.png (660×941 px, 53 KB)

We probably need to look at flame graphs. PHP 7.4 usually shouldn't cause this; it's demonstrably faster.

Blocking the train until we have some answers on this

Some options I can think of:

  • roll back group0 and group1 wikis to wmf.27, to help determine whether code changes in core and/or extensions/skins are contributing. The ElasticSearch changes are surely a contributor to what we see on Special:Homepage but shouldn't be responsible for the overall increase in performance metrics, AIUI
  • roll back to PHP 7.2 for some period of time to observe what happens with https://performance.wikimedia.org/#!/week and other charts
  • roll back to stage 3 of T279664: Progressive Multi-DC roll out, in order to rule that out as a contributing factor

Just for some extra information, group1 wikis haven't been rolled forward yet and are still at wmf.27

Definitely due to -fpsmax being passed to FFmpeg by TimedMediaHandler while we run an older version. I don't think the issue should block this week's train; we can most probably roll back the patch and deploy that to wmf.28, or get a hotfix.

This afternoon I have investigated the bump in DomInteractive shown by that graph. Looking at navigation timing it seems it was between 7:40 and 10:00 on Sept 7th. That would match:

07:46 	<topranks> 	Depool eqsin from user traffic in advance of core router upgrades - T295690
09:48 	<topranks> 	Re-pooling eqsin for user traffic after successful core router upgrades - T295690

My theory is that when we depooled eqsin, the traffic shifted to Europe / the US West Coast, which increased latency for people in and around Asia. If we have latency graphs per AS / region somewhere, I would expect Asia to show higher latency and everything else to be unchanged. So I think it can be dismissed.
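
As a rough cross-check of the timing argument, the sketch below computes how much of the eqsin depool window falls inside the observed bump window. It assumes the two log entries are from September 7th and that all four times share one time zone (UTC is assumed here); the helper names are illustrative only.

```python
from datetime import datetime, timezone


def parse(ts):
    """Parse 'YYYY-MM-DD HH:MM' as a UTC timestamp (UTC is an assumption)."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)


def overlap_minutes(a_start, a_end, b_start, b_end):
    """Length in minutes of the intersection of two time windows."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return max(0.0, (end - start).total_seconds() / 60)


# Bump in DomInteractive, as read from navigation timing.
bump = (parse("2022-09-07 07:40"), parse("2022-09-07 10:00"))
# eqsin depool and repool times from the two log entries above.
depool = (parse("2022-09-07 07:46"), parse("2022-09-07 09:48"))

overlap = overlap_minutes(*bump, *depool)
depool_length = (depool[1] - depool[0]).total_seconds() / 60

print(f"overlap: {overlap:.0f} min of a {depool_length:.0f} min depool")
```

Under these assumptions the depool window (122 minutes) lies entirely inside the bump window, which is consistent with the theory but does not by itself prove causation.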

A view for loadEventEnd at that time:

{F35513591 size=full}

Change 830926 had a related patch set uploaded (by TrainBranchBot; author: Jeena Huneidi):

[operations/mediawiki-config@master] group1 wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830926

Change 830926 merged by jenkins-bot:

[operations/mediawiki-config@master] group1 wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830926

Mentioned in SAL (#wikimedia-operations) [2022-09-08T19:11:25Z] <jhuneidi@deploy1002> rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.28 refs T314189

Mentioned in SAL (#wikimedia-operations) [2022-09-08T19:15:05Z] <jhuneidi@deploy1002> Synchronized php: group1 wikis to 1.39.0-wmf.28 refs T314189 (duration: 03m 39s)

Change 830929 had a related patch set uploaded (by TrainBranchBot; author: Jeena Huneidi):

[operations/mediawiki-config@master] all wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830929

Change 830929 merged by jenkins-bot:

[operations/mediawiki-config@master] all wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830929

Mentioned in SAL (#wikimedia-operations) [2022-09-08T19:36:24Z] <jhuneidi@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28 refs T314189

Change 830935 had a related patch set uploaded (by TrainBranchBot; author: Jeena Huneidi):

[operations/mediawiki-config@master] group2 wikis to 1.39.0-wmf.27

https://gerrit.wikimedia.org/r/830935

Change 830935 merged by jenkins-bot:

[operations/mediawiki-config@master] group2 wikis to 1.39.0-wmf.27

https://gerrit.wikimedia.org/r/830935

Mentioned in SAL (#wikimedia-operations) [2022-09-08T19:56:19Z] <jhuneidi@deploy1002> rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.27 refs T314189

Change 830944 had a related patch set uploaded (by TrainBranchBot; author: Jeena Huneidi):

[operations/mediawiki-config@master] all wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830944

Change 830944 merged by jenkins-bot:

[operations/mediawiki-config@master] all wikis to 1.39.0-wmf.28

https://gerrit.wikimedia.org/r/830944

Mentioned in SAL (#wikimedia-operations) [2022-09-08T21:08:46Z] <jhuneidi@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28 refs T314189

Although wmf.28 is already out, the above-linked task is for reverting / fixing a change in this train that left the "@stable to override" IndexPager::makeLink unused without proper deprecation. This broke Special:CheckUser's ability to page through the results and will have broken any other code that overrides IndexPager::makeLink.
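
To illustrate why this kind of change breaks subclasses, here is a minimal Python analogue. It is not MediaWiki code: the class names, method names, and the extra `token` parameter are all hypothetical, and the real IndexPager is PHP. The point is only that when a base class stops calling an overridable method, existing overrides silently stop having any effect.

```python
class PagerV1:
    """Base class that routes link rendering through an overridable hook."""

    def make_link(self, text, query):
        return f"<a href='?{query}'>{text}</a>"

    def navigation(self):
        # The hook is called, so subclass overrides take effect.
        return self.make_link("next", "offset=50")


class PagerV2(PagerV1):
    """Refactored base: builds the link inline and never calls make_link."""

    def navigation(self):
        return "<a href='?offset=50'>next</a>"


class TokenLinkMixin:
    """A downstream override that adds a (hypothetical) extra parameter."""

    def make_link(self, text, query):
        return super().make_link(text, query + "&token=abc")


class CustomOnV1(TokenLinkMixin, PagerV1):
    pass


class CustomOnV2(TokenLinkMixin, PagerV2):
    pass


print(CustomOnV1().navigation())  # override applied
print(CustomOnV2().navigation())  # override silently ignored
```

No error is raised in the second case, which is why such a regression tends to surface only as broken behaviour downstream rather than as a failing call.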

Change 831082 had a related patch set uploaded (by Krinkle; author: Amir Sarabadani):

[mediawiki/core@master] rdbms: Allow SubQuery objects in SelectQueryBuilder as table

https://gerrit.wikimedia.org/r/831082

Change 831082 merged by jenkins-bot:

[mediawiki/core@master] rdbms: Allow SubQuery objects in SelectQueryBuilder as table

https://gerrit.wikimedia.org/r/831082

Change 832322 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@wmf/1.40.0-wmf.1] rdbms: Allow SubQuery objects in SelectQueryBuilder as table

https://gerrit.wikimedia.org/r/832322

Change 830585 abandoned by Ladsgroup:

[mediawiki/core@master] rdbms: Remove useless warning

Reason:

Done in favor of Id9d6c1d0d4c40f43ab0d2

https://gerrit.wikimedia.org/r/830585

Change 832322 abandoned by Ladsgroup:

[mediawiki/core@wmf/1.40.0-wmf.1] rdbms: Allow SubQuery objects in SelectQueryBuilder as table

Reason:

Not needed

https://gerrit.wikimedia.org/r/832322