1.37.0-wmf.23 deployment blockers
Closed, Resolved | Public | 5 Estimated Story Points | Release

Details

Backup Train Conductor
hashar
Release Version
1.37.0-wmf.23
Release Date
Sep 13 2021, 12:00 AM

2021 week 37 1.37-wmf.23 Changes wmf/1.37.0-wmf.23

This MediaWiki Train Deployment is scheduled for the week of Monday, September 13th:

  • Monday, September 13th: Backports only.
  • Tuesday, September 14th: Branch wmf.23 and deploy to Group 0 Wikis.
  • Wednesday, September 15th: Deploy wmf.23 to Group 1 Wikis.
  • Thursday, September 16th: Deploy wmf.23 to all Wikis.
  • Friday: No deployments on Fridays.

How this works

  • Any serious bugs affecting wmf.23 should be added as subtasks beneath this one.
  • Any open subtask(s) block the train from moving forward. This means no further deployments until the blockers are resolved.
  • If something is serious enough to warrant a rollback then you should bring it to the attention of deployers on the #wikimedia-operations IRC channel.
  • If you have a risky change in this week's train, add a comment to this task using the Risky patch template.
  • For more info about deployment blockers, see Holding the train.
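The blocking rule above can be sketched in a few lines. This is only an illustration of the process as described on this task, not the actual train tooling: the `Subtask` type, the `next_stage` helper, and the task ids are hypothetical names, and the stage order is taken from the weekly schedule.

```python
from dataclasses import dataclass
from typing import List, Optional

# Rollout order from the weekly schedule; the constant name is illustrative.
STAGES = ["group0", "group1", "all"]


@dataclass
class Subtask:
    task_id: str   # hypothetical id, e.g. "T000001"
    is_open: bool  # True while the blocker is unresolved


def next_stage(current: Optional[str], subtasks: List[Subtask]) -> Optional[str]:
    """Return the next stage to promote to, or None if the train must hold."""
    if any(task.is_open for task in subtasks):
        # Any open subtask blocks all further deployments.
        return None
    if current is None:
        return STAGES[0]
    idx = STAGES.index(current)
    # After the final stage there is nothing left to promote.
    return STAGES[idx + 1] if idx + 1 < len(STAGES) else None
```

For example, `next_stage("group0", [Subtask("T000001", True)])` returns `None` until the subtask is closed, after which the same call returns `"group1"`.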

Related Links

Other Deployments

Previous: 1.37.0-wmf.22
Next: 1.38.0-wmf.1

Related Objects

Event Timeline

thcipriani triaged this task as Medium priority.
thcipriani updated Other Assignee, added: mmodell.
thcipriani set the point value for this task to 5.
  • Change: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/549909
  • Summary:
    • This moves the code that handles page protection out of the Title class. It's a complex refactoring of critical functionality, and the code is hit on pretty much every request in order to perform permission checks.
  • Test plan:
    • We improved phpunit test coverage of the old code and ensured good coverage of the new code. We also wrote extensive end-to-end tests to ensure that page protection still works as expected.
  • Places to monitor:
    • It is hard to foresee what kind of issue would be caused by this. We were very careful not to break things, but we may still see errors due to e.g. type hints that have become more strict, or performance issues if we got the caching wrong somewhere. We may also see unintended changes in user-facing behavior. All of these seem unlikely, but you never know.
    • Logstash: mediawiki-errors
    • Grafana: mediawiki-errors
  • Revert plan:
    • Identify and fix specific issue
    • Revert patch. This does not touch many files, though we might see some conflicts in Title. Note that the relevant code reads and writes to WanObjectCache. The structure of cache entries should be forward and backwards compatible.
    • If all else fails, roll back the train.
  • Affected wikis:
    • all
  • IRC contact: Duesen, Pchelolo. But better find Daniel or Petr on the platform-engineering slack channel.
  • UBN Task Projects/tags: Platform Team Workboards (MW Expedition) Platform Engineering
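To illustrate what "forward and backwards compatible" cache entries mean for the revert plan above, here is a minimal sketch. It does not show the real cache format used by MediaWiki core: the JSON layout, the field names (`restrictions`, `cascade`, `expiry`), and both helper functions are assumptions made up for this example.

```python
import json

# Hypothetical entry layout; NOT the actual structure stored by MediaWiki.


def encode_entry(restrictions: dict, cascade: bool) -> str:
    """Serialize an entry the way the (hypothetical) new code would."""
    return json.dumps({"restrictions": restrictions, "cascade": cascade})


def decode_entry(raw: str) -> dict:
    """Read an entry written by older or newer code.

    Backward compatible: fields missing from old entries get defaults.
    Forward compatible: unknown fields written by newer code are ignored.
    """
    data = json.loads(raw)
    return {
        "restrictions": data.get("restrictions", {}),
        "cascade": data.get("cascade", False),
    }


# Entry written before "cascade" existed (old code).
old_entry = '{"restrictions": {"edit": ["sysop"]}}'
# Entry written by newer code with an extra field this reader does not know.
new_entry = '{"restrictions": {}, "cascade": true, "expiry": "infinity"}'
```

With a reader like this, a revert can leave entries written by the newer code in the cache, and a re-deploy can still read entries written by the older code, so neither direction requires a cache purge.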

FYI Italian should be a group 1 wiki as part of this train for the first time (joining Catalan and Hebrew). This could lead to a higher volume of potential errors than we see normally, by design. See T286664 for more details.

Risky Patch! 🚂🔥
  • Change: https://gerrit.wikimedia.org/r/716756
  • Summary:
    • SyntaxHighlight now shells out to pygments through the Shellbox BoxedCommand system (still on individual appservers, not in the shellbox service yet).
  • Test plan:
    • Special:Version will now show the version of pygments. Edit/purge a page with <syntaxhighlight> tags (most pages on mediawiki.org!)
  • Places to monitor:
  • Revert plan: Revert this patch and its child patch
  • Affected wikis: all, theoretically.
  • IRC contact: legoktm
  • UBN Task Projects/tags: SyntaxHighlight

I am the one running the train this week :-]

thcipriani updated Other Assignee, added: hashar; removed: mmodell.
thcipriani added a subscriber: hashar.

Flipping conductor and backup for group0 deployment. Let's swap back after group0.

Group0 would normally happen this week during the DC Switchover deployment freeze: https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20210914T1400

@daniel and I have a weekly check in on Tuesday morning so we will be able to talk about the change to page protection. I guess we might catch a few low hanging fruits on testwiki (if not already caught via beta).

@Legoktm I guess SyntaxHighlight / pygments going through Shellbox already got tested via the beta cluster, didn't it? Regardless, there is already an example page on testwiki: https://test.wikipedia.org/w/index.php?title=SyntaxHighlight_GeSHi&action=purge so I can use that to verify.

Thank you for raising those patches.

Risky Extension! 🚂🔥
  • Extension: CentralAuth
  • Summary:
    • CentralAuth is in charge of user authentication and parts of authorization. A significant amount of internal refactoring has been done in the last two weeks.
  • Test plan:
    • We've tried to increase test coverage during that refactoring.
    • Lots of manual testing.
    • Note that due to the design of the extension some actions are only available on metawiki and issues on them can't be found on group0.
  • Places to monitor: standard error dashboards + user reports for any weirdness
  • Revert plan: Rollback train (!), fix issue.
  • Affected wikis: all public ones
  • IRC Contact: majavah (urbanecm, DannyS712, Zabe can likely help too)
  • UBN Task Projects/tags: MediaWiki-extensions-CentralAuth

FYI: Title::getBacklinkCache has been hard deprecated while there were still multiple usages left. So there are going to be a few new deprecation warnings this week, all of them (related to Title::getBacklinkCache) should be covered by T290871, T290909 and T290914.

@Zabe: Can I kindly ask that fixes do get backported to 1.37 as this will overlap releases and I don't want to see too many new warnings.

Mentioned in SAL (#wikimedia-operations) [2021-09-14T08:24:11Z] <hashar> train: applied security patches for 1.37.0-wmf.23 # T281164

1.37.0-wmf.23 is now deployed to https://test.wikipedia.org/ . group0 wikis will be promoted later today at 19:00 UTC / 21:00 CEST

@hashar I guess I can take over from here since I'm assigned. Looks like it's clear for deployment...

I made a mistake when promoting: I ran deploy-promote all, which I immediately aborted. But when I then ran deploy-promote group1, it actually deployed the commit forged by the first command, which resulted in all wikis being promoted. Filed as T291130. It is not a train blocker though.

Zabe added a subscriber: Zabe.

Are we expecting group1 wikis to have been on 1.37.0-wmf.23 since yesterday? They don't seem to be (or at least commons and wikidata don't).

@Cparle we had to roll back due to the various blockers we had, such as AbuseFilter or some unserialization from cache (T291124).

I will promote group 1 wikis (which includes commons and wikidata) to 1.37.0-wmf.23 one hour from now (13:00 UTC - 15:00 CEST).
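For anyone double-checking the announced times, the UTC to CEST conversion can be sketched as below. CEST is modelled as a fixed UTC+2 offset, which holds in September; the helper name is hypothetical, and the calendar date is only a placeholder taken from the train week because the comment above does not state the day.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CEST as a fixed UTC+2 offset (valid during summer time only).
CEST = timezone(timedelta(hours=2), "CEST")


def utc_to_cest(year: int, month: int, day: int, hour: int, minute: int = 0) -> datetime:
    """Convert a UTC wall-clock time to CEST."""
    utc_time = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return utc_time.astimezone(CEST)


# Placeholder date from the train week; only the hour matters here.
promotion = utc_to_cest(2021, 9, 13, 13)
print(promotion.strftime("%H:%M %Z"))  # 15:00 CEST
```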

I have promoted group1 wikis to 1.37.0-wmf.23 and there are no new errors \o/

It is all quiet: SUCCESS!