
Roll out RESTBase to production
Closed, Resolved, Public

Description

Incrementally roll out Mobile-Content-Service and other RESTBase endpoint usage by the Android Production app. (See T118965 for beta app roll-out.)

This will be done via the MobileApp extension, which provides a remote config mechanism.
We will add a new key, restbaseProdPercent, and raise the percentage of production app installations eligible to use the new endpoints in stages. The config file is located at https://meta.wikimedia.org/static/current/extensions/MobileApp/config/android.json.

  • Stage 1: 3/24/2016: 2%
  • Stage 2: 3/28/2016: 25%
  • Stage 3: 4/4/2016: 50%
  • Stage 4: 4/11/2016: 100%
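
A staged percentage rollout like the one above can be sketched as follows. This is only an illustration, not the app's actual logic: the task does not describe how an installation decides whether it is eligible, so the hashing scheme and the names `rollout_bucket`, `is_eligible`, and `install_id` are assumptions. The sketch maps each installation to a stable bucket from 0 to 99 and compares it with the configured percentage, so an installation that is eligible at 2% remains eligible at 25%, 50%, and 100%.

```python
import hashlib


def rollout_bucket(install_id: str) -> int:
    """Map an installation ID to a stable bucket in the range 0..99.

    Hypothetical helper: the real app may derive its bucket differently.
    """
    digest = hashlib.sha256(install_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_eligible(install_id: str, restbase_prod_percent: int) -> bool:
    """Return True if this installation falls inside the rollout percentage.

    restbase_prod_percent stands in for the value of the restbaseProdPercent
    key in the remote config (assumed here to be an integer from 0 to 100).
    """
    return rollout_bucket(install_id) < restbase_prod_percent


if __name__ == "__main__":
    # Show how one hypothetical installation fares at each planned stage.
    for percent in (2, 25, 50, 100):
        print(percent, is_eligible("example-install-id", percent))
```

Because the bucket depends only on the installation ID, raising the percentage only ever adds installations to the eligible set.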

After deployment, check https://meta.wikimedia.org/static/current/extensions/MobileApp/config/android.json.
The deployment also needs a cache purge of https://en.wikipedia.org/static/current/extensions/MobileApp/config/android.json (note the en instead of meta):

echo "https://en.wikipedia.org/static/current/extensions/MobileApp/config/android.json" | mwscript purgeList.php
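
For the post-deployment check, a small script can confirm which percentage the served config actually contains. The sketch below is hedged: it assumes restbaseProdPercent is a top-level integer key in android.json, which this task does not confirm, and the helper names and the `cb` query parameter are made up for illustration. It deliberately leaves the HTTP fetch to whatever client is at hand and only covers building a cache-busting URL and reading the value from the downloaded text.

```python
import json
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

CONFIG_URL = (
    "https://meta.wikimedia.org/static/current/extensions/"
    "MobileApp/config/android.json"
)


def with_cache_buster(url, token=None):
    """Append a throwaway query parameter so caches treat the URL as new.

    The parameter name "cb" is arbitrary (an assumption for this sketch).
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("cb", token if token is not None else str(int(time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_rollout_percent(config_text, key="restbaseProdPercent"):
    """Parse the config JSON text and return the rollout percentage.

    Assumes the key sits at the top level of the JSON object.
    """
    config = json.loads(config_text)
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object at the top level")
    value = config.get(key)
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("missing or non-integer value for " + key)
    if not 0 <= value <= 100:
        raise ValueError("percentage out of range: " + str(value))
    return value


if __name__ == "__main__":
    print(with_cache_buster(CONFIG_URL))
    # Sample text standing in for the downloaded file.
    print(read_rollout_percent('{"restbaseProdPercent": 25}'))
```

A cache-busting parameter only bypasses caches keyed on the full URL; it does not replace the purge command above.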

Event Timeline

Dbrant set the priority of this task to Needs Triage.
Dbrant updated the task description.
Dbrant moved this task to Tracking on the Wikipedia-Android-App-Backlog board.
Dbrant subscribed.

Change 274848 had a related patch set uploaded (by BearND):
Change RB remote config key

https://gerrit.wikimedia.org/r/274848

Change 274874 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 25%

https://gerrit.wikimedia.org/r/274874

Change 274848 merged by jenkins-bot:
Change RB remote config key for production app

https://gerrit.wikimedia.org/r/274848

@bearND, we are planning to switch RESTBase traffic from Eqiad to Codfw next week: T127974. While this is fairly straightforward, we should time it so that it doesn't coincide with the app ramp-up.

@GWicke Good point. When is it done? Is moving everything a week later ok?

+1 to waiting until the RESTBase move to codfw is complete before beginning the rollout to the stable Android app. That'll give us plenty of time to get our outstanding work merged/deployed/tested without rushing to meet a rollout deadline.

https://gerrit.wikimedia.org/r/#/c/275702/, for example, is a pretty important fix I'd like to see get in before we start the rollout.

> When is it done? Is moving everything a week later ok?

Switching the public traffic will be a single puppet change to update the varnish config (RB and Parsoid are active in both eqiad and codfw), but there are a few other services (including mobileapps) which we'll then need to reconfigure to use the codfw RB as well. It should not take more than an hour or so total, with no user-visible impact.

The exact time will be finalized tomorrow. If we switched on Tuesday, then technically the roll-out could go ahead on Thursday. I just don't want both to happen at the same time.

OK, we could do a SWAT deploy a couple of days after the switchover is complete if we want more control over when this gets out.

Update on the DC switch-over: This is scheduled for three hours tomorrow morning European time. After that, we'll switch back to eqiad. No further switching is planned until later in April, when MediaWiki should be ready. See T127974 for details.

We are going to implement a solution for T118306 today on the RESTBase side. This will then likely make it to production on Monday.

Change 278949 had a related patch set uploaded (by Mholloway):
Roll out RESTBase usage to Android production app: 2%

https://gerrit.wikimedia.org/r/278949

^ 1.27.0-wmf.18 cherry-pick patch scheduled for 3/24 morning SWAT

Change 274874 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 2%

https://gerrit.wikimedia.org/r/274874

Change 278949 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 2%

https://gerrit.wikimedia.org/r/278949

2% patch deployed, cache purged. Note that we now need to use the en.wikipedia.org domain for the purge command (see T130904).

Change 279977 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 25%

https://gerrit.wikimedia.org/r/279977

Change 279977 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 25%

https://gerrit.wikimedia.org/r/279977

Change 279980 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 25%

https://gerrit.wikimedia.org/r/279980

Change 279980 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 25%

https://gerrit.wikimedia.org/r/279980

25% rollout initiated, verified in prod app.

Change 280952 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 50%

https://gerrit.wikimedia.org/r/280952

Change 280952 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 50%

https://gerrit.wikimedia.org/r/280952

Change 280957 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 50%

https://gerrit.wikimedia.org/r/280957

Change 280957 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 50%

https://gerrit.wikimedia.org/r/280957

50% rollout initiated, verified in prod app.

Change 282389 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 100%

https://gerrit.wikimedia.org/r/282389

Change 282434 had a related patch set uploaded (by BearND):
Roll out RESTBase usage to Android production app: 100%

https://gerrit.wikimedia.org/r/282434

^ 1.27.0-wmf.20 cherry-pick patch scheduled for 4/11 morning SWAT.

Change 282434 merged by jenkins-bot:
Roll out RESTBase usage to Android production app: 100%

https://gerrit.wikimedia.org/r/282434

It's merged and synced, but for some reason the change is not showing up yet, even with cache-busting query parameters and after purging the cache. We might need to give it some more time.

bearND updated the task description.

It took some time but the config was updated a couple of days ago.

The mobileapps dashboard concurs:

[Screenshot of the mobileapps dashboard]

It also shows that clients are still in the process of picking up the config change.

Congratulations to both teams! It has been a great collaboration so far, and I'm looking forward to the next round this quarter.