
1.38.0-wmf.13 deployment blockers
Closed, Resolved | Public | 5 Estimated Story Points | Release

Details

Backup Train Conductor
dancy
Release Version
1.38.0-wmf.13
Release Date
Dec 13 2021, 12:00 AM

2021 week 50 1.38-wmf.13 Changes wmf/1.38.0-wmf.13

This MediaWiki Train Deployment is scheduled for the week of Monday, December 13th:

  • Monday, December 13th: Backports only.
  • Tuesday, December 14th: Branch wmf.13 and deploy to Group 0 Wikis.
  • Wednesday, December 15th: Deploy wmf.13 to Group 1 Wikis.
  • Thursday, December 16th: Deploy wmf.13 to all Wikis.
  • Friday: No deployments on Fridays.

How this works

  • Any serious bugs affecting wmf.13 should be added as subtasks beneath this one.
  • Any open subtask(s) block the train from moving forward. This means no further deployments until the blockers are resolved.
  • If something is serious enough to warrant a rollback then you should bring it to the attention of deployers on the #wikimedia-operations IRC channel.
  • If you have a risky change in this week's train, add a comment to this task using the Risky patch template.
  • For more info about deployment blockers, see Holding the train.
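
The blocking rule above can be sketched in a few lines. This is only an illustration, assuming a hypothetical `Subtask` record with a `status` field; the task IDs are placeholders, and this is not how Phabricator or the train tooling actually models subtasks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtask:
    """Hypothetical stand-in for a subtask of the train blocker task."""
    task_id: str
    status: str  # assumed values: "open" or "resolved"


def open_blockers(subtasks: List[Subtask]) -> List[str]:
    """Return the IDs of subtasks that are still open."""
    return [t.task_id for t in subtasks if t.status == "open"]


def train_can_proceed(subtasks: List[Subtask]) -> bool:
    """The train moves forward only when no subtask is open."""
    return not open_blockers(subtasks)


# Placeholder task IDs, not real tasks.
subtasks = [Subtask("T-example-1", "resolved"), Subtask("T-example-2", "open")]
print(open_blockers(subtasks))
print(train_can_proceed(subtasks))
```

With one placeholder subtask still open, the sketch reports that subtask as a blocker and the train as held.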

Related Links

Other Deployments

Previous: 1.38.0-wmf.12
Next: 1.38.0-wmf.14

Event Timeline

thcipriani changed Release Date from Dec 7 2020, 12:00 AM to Dec 7 2021, 12:00 AM.Oct 21 2021, 12:31 AM
thcipriani changed Release Date from Dec 7 2021, 12:00 AM to Dec 13 2021, 12:00 AM.Oct 21 2021, 12:41 AM
thcipriani triaged this task as Medium priority.
thcipriani updated Other Assignee, added: dancy.
thcipriani set the point value for this task to 5.
Risky Patch! 🚂🔥
  • Change: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikiEditor/+/587228 "Tag WikiEditor edits with a hidden tag" (T249038)
  • Summary:
    • A lot of edits are going to be tagged as a result of this change
    • Good old WikiEditor is probably how most edits are made (we don't know for sure how many, because we do not have any tagging for it :) )
  • Test plan:
    • We're hoping someone will notice and complain if this puts unreasonable load on the database servers, or significantly slows down edit saving, or something
    • On https://en.wikipedia.org/wiki/Special:Tags (and similar pages on other wikis), a "wikieditor" tag will be listed, along with the number of tagged changes
  • Places to monitor: ?
  • Revert plan: Revert patch
  • Affected wikis: all
  • IRC contact: MatmaRex, edsanders
  • UBN Task Projects/tags: Editing-team
Risky Patch! 🚂🔥
  • Change: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/666434/ "Upgrade Vue to the migration build of Vue 3"
  • Summary:
    • Upgrades Vue from version 2.6.11 to version 3.2.21
    • This comes with lots of breaking changes, but we're using the migration build of Vue 3, which provides backwards compatibility for most Vue 2 code
    • Most code expecting Vue 2 should continue to work, but the backwards compatibility isn't perfect, so something could break
    • We've tested all known Vue code with this patch, but we might have missed something
  • Test plan: Manual testing of the affected projects (the ContentTranslation, MachineVision, MediaSearch, NearbyPages, QuickSurveys and Wikibase extensions, and the search feature in the Vector skin)
  • Places to monitor:
  • Revert plan: Revert patch
  • Affected wikis: All
  • IRC contact: RoanKattouw
  • UBN Task Projects/tags: Design-Systems-team (Design Systems Team FY2021-22 Kanban Board) Vue.js

Change 746932 had a related patch set uploaded (by Ahmon Dancy; author: Ahmon Dancy):

[operations/mediawiki-config@master] all wikis to 1.38.0-wmf.12 refs T293954

https://gerrit.wikimedia.org/r/746932

Change 746932 merged by jenkins-bot:

[operations/mediawiki-config@master] all wikis to 1.38.0-wmf.12 refs T293954

https://gerrit.wikimedia.org/r/746932

Mentioned in SAL (#wikimedia-operations) [2021-12-13T18:25:18Z] <dancy@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.12 refs T293954

Risky Patch! 🚂🔥
  • Change: https://gerrit.wikimedia.org/r/c/mediawiki/vendor/+/746936 - Rollout of patches to implement T261181
  • Summary:
    • See the summary at T293952#7534903. The main difference is that, in the last train, we rolled out a patch to make the code forwards compatible, in case this train needs reverting. This means that RESTBase will not need purging should the train be reverted.
  • Test plan:
    • On rollout to group 0 wikis, @ihurbain will test this extensively on testwiki.
  • Places to monitor:
  • Revert plan: Rollback train or revert patch.
  • Affected wikis: mediawikiwiki, metawiki, commonswiki
  • IRC contact: subbu[m], ihurbain (WMF staff can find us on Slack in #content-transformers, where there is a bigger set of eyes)
  • UBN Task Projects/tags: Parsoid

I haven't promoted testwikis to wmf.13 since there is an unsolved blocker from last week: T297517. I have asked on that task whether it is fine to push wmf.13 regardless.

Risky Patch! 🚂🔥

...

  • Revert plan: Rollback train or revert patch.

To clarify Arlo's "revert patch" note there, reverting Parsoid's vendor patch to move Parsoid back to v0.15.0-a12 and deploying vendor/ would do the trick. I don't know if vendor changes can be rolled out outside of a train deploy. So, if that is not possible, train rollback would be necessary.

I ran the stage-train but aborted before promoting the testwikis. I then ran scap sync-world to push all the wmf.13 code and warm it up on the machines. All wikis are still at wmf.12 though.

I am not available for the rest of the evening, but anyone should be able to promote testwikis to wmf.13.

I am going to promote testwikis to wmf.13:

  • there are two blockers which seem fixed on beta cluster (master branch) and have their patches included in wmf.13. One can then confirm the fix via test.wikipedia.org

The real blocker is the memory leak; Tyler wrote a nice summary at T297517#7570257. Switching group 0 to wmf.13 is not going to cause a drop of traffic that would hide the leak, so it sounds safe to move forward today.

edit: testwikis -> group 0

Change 747200 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/mediawiki-config@master] group0 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747200

Change 747200 merged by jenkins-bot:

[operations/mediawiki-config@master] group0 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747200

Mentioned in SAL (#wikimedia-operations) [2021-12-14T20:16:50Z] <hashar@deploy1002> rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.13 refs T293954

Mentioned in SAL (#wikimedia-operations) [2021-12-14T20:28:31Z] <hashar> group0 wikis (eg mediawiki.org) are unavailable due to a deployment issue. We are working on it # T293954

I broke group 0 by not bumping testwiki to wmf.13 when I ran scap sync-world earlier today. That means the l10n cache for wmf.13 did not get built, and when I updated wikiversions for group0 the wikis got switched but without any l10n cache. Lesson learned: I should follow the process to the letter and thus should have promoted testwikis.
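
The failure described above (wikis switched to a version whose l10n cache was never built) is the kind of thing a pre-flight check could catch before wikiversions is synced. What follows is a minimal sketch, not scap's actual behavior. It assumes that wikiversions.json maps each wiki database name to a version directory name such as `php-1.38.0-wmf.13`, and that the built cache lives under `<deploy root>/<version dir>/cache/l10n/`. The function name, the directory layout and the file names are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def versions_missing_l10n(wikiversions_file: Path, deploy_root: Path) -> List[str]:
    """Return versions referenced by wikiversions that have no l10n cache files."""
    # Assumed format: {"<dbname>": "php-<version>", ...}
    wikiversions = json.loads(wikiversions_file.read_text())
    missing = []
    for version_dir in sorted(set(wikiversions.values())):
        # Assumed layout: <deploy_root>/<version_dir>/cache/l10n/
        cache_dir = deploy_root / version_dir / "cache" / "l10n"
        if not cache_dir.is_dir() or not any(cache_dir.iterdir()):
            missing.append(version_dir)
    return missing


# Demo on a throwaway directory tree instead of a real deployment host.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # wmf.12 has a (fake) cache file; wmf.13 has none.
    built = root / "php-1.38.0-wmf.12" / "cache" / "l10n"
    built.mkdir(parents=True)
    (built / "l10n_cache-en.cdb").write_text("placeholder")
    (root / "php-1.38.0-wmf.13").mkdir()

    wikiversions_file = root / "wikiversions.json"
    wikiversions_file.write_text(json.dumps({
        "enwiki": "php-1.38.0-wmf.12",
        "mediawikiwiki": "php-1.38.0-wmf.13",
    }))

    print(versions_missing_l10n(wikiversions_file, root))
```

A non-empty result would be a reason to stop before switching any wiki to the listed versions.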

I could not roll back because scap commands failed due to the lack of wmf.13 messages. @Urbanecm fixed it by using:

sudo -u mwdeploy cp /srv/mediawiki-staging/wikiversions.json /srv/mediawiki/wikiversions.json
scap wikiversions-compile
cp /srv/mediawiki/wikiversions.php /srv/mediawiki-staging/wikiversions.php
scap sync-file --force wikiversions.php 'rollback group0'

Change 747204 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/mediawiki-config@master] group0 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747204

Change 747204 merged by jenkins-bot:

[operations/mediawiki-config@master] group0 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747204

Mentioned in SAL (#wikimedia-operations) [2021-12-14T21:18:14Z] <hashar@deploy1002> rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.13 refs T293954

The two small blockers T297529 and T297421 are solved.

T297517 is actively being tracked. The issue was made more prominent last week (T297669) and got worked around. A patch to PHP seems to fix the leak and we can proceed with group 1 in a few hours as planned.

Change 747601 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/mediawiki-config@master] group1 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747601

Change 747601 merged by jenkins-bot:

[operations/mediawiki-config@master] group1 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747601

Mentioned in SAL (#wikimedia-operations) [2021-12-15T20:03:16Z] <hashar@deploy1002> rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.13 refs T293954

Mentioned in SAL (#wikimedia-operations) [2021-12-15T20:04:22Z] <hashar@deploy1002> Synchronized php: group1 wikis to 1.38.0-wmf.13 refs T293954 (duration: 01m 05s)

Change 747606 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/mediawiki-config@master] Revert "group1 wikis to 1.38.0-wmf.13 refs T293954"

https://gerrit.wikimedia.org/r/747606

Mentioned in SAL (#wikimedia-operations) [2021-12-15T20:46:11Z] <hashar@deploy1002> rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.38.0-wmf.13 refs T293954

Change 747606 merged by jenkins-bot:

[operations/mediawiki-config@master] Revert "group1 wikis to 1.38.0-wmf.13 refs T293954"

https://gerrit.wikimedia.org/r/747606

MediaWiki has been rolled back on group 1 wikis since newcomers were unable to log in on cawiki: T297827

Change 747887 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/mediawiki-config@master] group1 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747887

Change 747887 merged by jenkins-bot:

[operations/mediawiki-config@master] group1 wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747887

Mentioned in SAL (#wikimedia-operations) [2021-12-16T18:03:58Z] <hashar@deploy1002> rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.13 refs T293954

Mentioned in SAL (#wikimedia-operations) [2021-12-16T18:05:05Z] <hashar@deploy1002> Synchronized php: group1 wikis to 1.38.0-wmf.13 refs T293954 (duration: 01m 05s)

Change 747905 had a related patch set uploaded (by Hashar; author: Hashar):

[operations/mediawiki-config@master] all wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747905

Change 747905 merged by jenkins-bot:

[operations/mediawiki-config@master] all wikis to 1.38.0-wmf.13 refs T293954

https://gerrit.wikimedia.org/r/747905

Mentioned in SAL (#wikimedia-operations) [2021-12-16T20:19:35Z] <hashar@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.13 refs T293954

Looks like a success. For the remaining blockers:

  • T297828 is fixed, pending verification on production
  • T297517 is more or less related to code pushed last week, which surfaced a bug in PHP; that one is being worked on
Risky Patch! 🚂🔥

This has resulted in slow queries on enwiki (T298225); it took a while before the problems started. It's mitigated for now; see that task for details.