
Make sure ResourceLoader works with MediaWiki read-only mode
Closed, Resolved (Public)

Description

This is related to T113916, but covers the short term: the Eqiad/Codfw switch-over.

We need to make sure that when the module_deps table doesn't accept writes, we fail gracefully, both for the immediate request being handled and for the overall integrity of the client-side cache and module version numbers (no pollution with wrong data).

The immediate request fails gracefully, presumably with a stale or less-functional version of a stylesheet. But I'm not sure how this affects the larger model of version hashes and how these requests end up populating Varnish caches for those URLs.

Event Timeline

Krinkle raised the priority of this task from to Needs Triage.
Krinkle updated the task description.
Krinkle subscribed.
faidon triaged this task as High priority. Feb 24 2016, 5:05 PM
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.
Krinkle moved this task from Backlog to Assigned on the MediaWiki-ResourceLoader board.
Krinkle added a subscriber: aaron.

There's only one method left in ResourceLoader that requires a master connection. It's ResourceLoaderModule::saveFileDependencies(), called only from ResourceLoaderFileModule::getStyles(). While it is the only call remaining, it is also about the worst place for one: it happens on GET requests to load.php, by design.

It's part of the module_deps system, which is up for refactoring to not use the database at all (T113916). However, that won't happen before the codfw rollout.

  • The write query only happens on cache-miss. The response is cached well by Varnish.
  • The write query only happens if the data in the database is stale or absent. It generally happens within 5 minutes of a deployment. The code path is not typically hit when we don't deploy code. However, the data is lazily populated, so when a wiki+skin combination hasn't had any views since the last deployment, it can happen at any time.
  • The write query is also guarded by try/catch.
  • The write query is not essential to the web response of that request. The stylesheet will be generated fine and is up-to-date, with or without that query succeeding.
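The guard described in the list above can be modelled roughly as follows. This is a minimal Python sketch, not the actual PHP code: `DepsStore`, `ReadOnlyError` and `get_styles` are hypothetical names standing in for the module_deps table, the database's read-only failure, and the stylesheet code path.

```python
class ReadOnlyError(Exception):
    """Raised by the hypothetical store when the database rejects writes."""


class DepsStore:
    """Toy stand-in for the module_deps table (not the MediaWiki API)."""

    def __init__(self, read_only=False):
        self.read_only = read_only
        self.rows = {}

    def get(self, key):
        return self.rows.get(key)

    def save(self, key, deps):
        if self.read_only:
            raise ReadOnlyError("database is in read-only mode")
        self.rows[key] = deps


def get_styles(store, key, css, discovered_deps):
    """Return the stylesheet; recording its dependencies is only a side effect."""
    if store.get(key) != discovered_deps:  # write only when stale or absent
        try:
            store.save(key, discovered_deps)
        except ReadOnlyError:
            pass  # non-essential: the response below is still complete
    return css


# Usage: the response is identical whether or not the write succeeds.
store = DepsStore(read_only=True)
css = get_styles(store, ("vector", "site.styles"), ".a{color:red}", ["a.png"])
```

In this model a read-only store leaves the stylesheet response untouched; the only thing lost is the saved dependency record.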

The write query's success is only important to ensure the startup module reflects a change to a module. The stylesheet web response collects information about the stylesheet, saves it to the database, and subsequent requests to the startup metadata will reflect those changes.

If the stylesheet response is unable to save its discoveries to the database, the startup module should simply continue to broadcast the previous version hash of the module until a future stylesheet request is able to save the information, at which point things will become consistent again within a few minutes.
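A simplified way to picture this is sketched below in Python. It assumes the advertised version hash depends only on what was last saved; the real hash has more inputs, and `version_hash`, `stylesheet_request` and `startup_version` are hypothetical names, not ResourceLoader functions.

```python
import hashlib


def version_hash(saved_deps):
    """Hypothetical version hash derived only from the saved dependencies."""
    payload = "\n".join(sorted(saved_deps)).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:8]


def stylesheet_request(store, module, discovered, read_only):
    """Persist newly discovered dependencies; skip the write when read-only."""
    if not read_only and store.get(module) != discovered:
        store[module] = discovered


def startup_version(store, module):
    """The version the startup module would advertise for `module`."""
    return version_hash(store.get(module, []))


# Usage: the advertised version stays at its old value during read-only
# mode and catches up on the first stylesheet request after writes resume.
store = {"skin.styles": ["old.png"]}
before = startup_version(store, "skin.styles")
stylesheet_request(store, "skin.styles", ["new.png"], read_only=True)
during = startup_version(store, "skin.styles")
stylesheet_request(store, "skin.styles", ["new.png"], read_only=False)
after = startup_version(store, "skin.styles")
```

Here `during` equals `before`, and `after` differs once a later request manages to save.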

Worst case scenario, users of a skin+wiki combination not often visited will continue to see the version of the stylesheet they last saw instead of whatever version was deployed since then. And all of this is increasingly unlikely if we don't deploy changes in the minutes leading up to the activation of read-only mode.

I've verified this locally using wgReadOnly. I'll verify it tomorrow in production on mw1017 as well.