Make maps active / active
Closed, Resolved · Public

Description

Traffic is ready for active / active applications and Maps is ready to be active / active, so we should do it.

Some validation of the codfw cluster is needed before sending user traffic there.

Event Timeline

Gehel created this task. Apr 6 2017, 2:07 PM
Restricted Application added a project: Discovery. Apr 6 2017, 2:07 PM
Restricted Application added a subscriber: Aklapper.
Gehel added a comment. Apr 6 2017, 2:11 PM

Looking at Tasmania on the maps / codfw cluster, it looks like we did not regenerate all tiles after the T159631 incident. This is now in progress.

It also looks like the rendering is slightly different, with different location names being shown. For example, compare this production tile (eqiad) with its codfw equivalent (requires an SSH tunnel to one of the codfw maps servers). You can see that "Hamilton" is shown on the codfw cluster but not on eqiad. This could cause minor issues for users switching between DCs.
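One way to spot such differences in bulk, once the same tiles have been fetched from both clusters, is to compare their digests. This is only a sketch: the function names and the tile paths are hypothetical, and how the bytes are fetched (for example through the SSH tunnel) is left out. A byte-level mismatch only says that two tiles differ, not which labels changed.

```python
import hashlib


def tile_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a tile's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def differing_tiles(eqiad: dict[str, bytes], codfw: dict[str, bytes]) -> list[str]:
    """Return the paths present in both mappings whose tile bytes differ.

    Both arguments map a tile path (hypothetical "z/x/y.png" keys) to the
    bytes served for that path by one datacenter.
    """
    shared = eqiad.keys() & codfw.keys()
    return sorted(
        path for path in shared
        if tile_digest(eqiad[path]) != tile_digest(codfw[path])
    )


# Placeholder payloads standing in for real tile images.
eqiad_tiles = {"10/1/1.png": b"tile-a", "10/1/2.png": b"tile-b"}
codfw_tiles = {"10/1/1.png": b"tile-a", "10/1/2.png": b"tile-b-with-hamilton"}
mismatches = differing_tiles(eqiad_tiles, codfw_tiles)
```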

ema moved this task from Triage to Watching on the Traffic board. Apr 11 2017, 12:31 PM
fgiunchedi triaged this task as Normal priority. Apr 12 2017, 7:58 AM

Unless you take special measures, two tile servers with the same style and data may render labels differently. This is usually caused by queries that don't fully order their results, and it is not normally regarded as a problem so long as the ordering on any one server is consistent.
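As a rough illustration of the effect (not the actual style queries), the sketch below models two servers that return the same rows in different physical order. The `Place` type, the place names other than "Hamilton", and the population figures are all made up for the example. Python's `sorted()` is stable, so rows that tie on the sort key keep their input order, much like a query whose ORDER BY does not break ties.

```python
from typing import Callable, NamedTuple


class Place(NamedTuple):
    osm_id: int
    name: str
    population: int  # made-up values, for illustration only


def pick_label(rows: list[Place], key: Callable[[Place], object]) -> str:
    """Return the name of the row that sorts first (highest key wins).

    sorted() is stable even with reverse=True, so rows with equal keys stay
    in their input order and the winner depends on how the rows arrived.
    """
    return sorted(rows, key=key, reverse=True)[0].name


hamilton = Place(1, "Hamilton", 300)
neighbour = Place(2, "Neighbour", 300)  # hypothetical place, same population

# Same data, different physical row order on each server.
server_a = [hamilton, neighbour]
server_b = [neighbour, hamilton]


def partial_order(place: Place) -> int:
    # Ties on population are left unresolved.
    return place.population


def total_order(place: Place) -> tuple[int, int]:
    # Ties on population are broken by the lowest osm_id.
    return (place.population, -place.osm_id)


label_a_partial = pick_label(server_a, partial_order)  # "Hamilton"
label_b_partial = pick_label(server_b, partial_order)  # "Neighbour"
label_a_total = pick_label(server_a, total_order)      # "Hamilton"
label_b_total = pick_label(server_b, total_order)      # "Hamilton"
```

With the partial ordering each server is self-consistent but the two disagree; adding a unique tie-breaker makes both pick the same label.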

debt added a subscriber: debt.

Moving off the sprint board - the Discovery team won't be able to do this work at this time.

Gehel added a comment. Sep 14 2017, 9:45 AM

@Pnorman could you have a look at the codfw servers and see if we are ready to move on this?

For reference, the puppet change to apply: https://gerrit.wikimedia.org/r/#/c/345591/

Pnorman moved this task from Backlog to In progress on the Maps-Sprint board. Sep 14 2017, 6:51 PM

I used an SSH tunnel to check maps2001.codfw.wmnet and it's serving tiles fine. One problem I noticed is that what it renders is at least two months out of date. The database is up to date, so this is probably from T175123: tileshell does not honor redis configuration in /etc/tilerator/config.yaml.

This is also present on production, so it's no barrier to going active/active.

Gehel claimed this task. Sep 19 2017, 7:05 PM

Change 379530 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/puppet@production] maps: active/active public interface

https://gerrit.wikimedia.org/r/379530

Gehel added subscribers: ema, BBlack. Sep 21 2017, 1:39 PM

We are ready to make maps active / active. Patch https://gerrit.wikimedia.org/r/#/c/379530/ is ready to be merged, but I'll let the traffic team (@ema / @BBlack) merge it, as they understand the traffic side of it much better than I do!

Change 379530 merged by BBlack:
[operations/puppet@production] maps: active/active public interface

https://gerrit.wikimedia.org/r/379530

Gehel moved this task from In progress to Done on the Maps-Sprint board. Sep 21 2017, 5:16 PM
debt closed this task as Resolved. Sep 22 2017, 1:48 PM

Thanks @BBlack and @Gehel !