Services team goals April - June 2016 (Q4 2015/16)
Closed, Resolved · Public

Description

Early draft:

Core: REST API build-out, documentation & distribution

  • Support apps and the API-driven frontend effort with cacheable high-traffic entry points and clean content interfaces.
  • Expand mediawiki-containers environment to fully support RESTBase, and improve support for development use cases.

Focus: EventBus & change propagation

  • Introduce retry queues
  • Complete and test multi-DC support for EventBus & change propagation: T127718
  • Migrate several jobs (in particular RestbaseUpdateJobs) to EventBus / change propagation service.

Dependencies: SRE, Analytics

Event Timeline

I added the Focus (2) goal. Now we need to decide which goal to drop.

For the service discovery goal, we would need to clarify

a) which service discovery solution we are shooting for, and
b) which part of the overall task we are planning to take on in services.

Setting up etcd keys for a few service IPs is something we can contribute to, but ultimately something ops will need to own. Similarly, setting up front-ends for etcd (such as SkyDNS) would be an ops task.

The answer to b) depends a bit on which interface we plan to use. If the answer is 'DNS', then we would have to change approximately nothing in existing node services. If we plan to set up etcd watches or polling, then there would indeed be some work to do.
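To make the difference in effort concrete, here is a minimal sketch of the two interfaces from a service's point of view. All names (`resolve_via_dns`, `PollingWatcher`, the `fetch` callable) are hypothetical and for illustration only; the sketch deliberately does not use any real etcd client API, and our services are Node-based, so this Python is only meant to show the shape of the work, not an implementation.

```python
import socket
from typing import Callable, List, Optional


def resolve_via_dns(hostname: str, port: int) -> List[str]:
    """DNS interface: the service keeps using an ordinary resolver call.

    Nothing discovery-specific lives in the service; whatever publishes
    the records (for example a DNS front-end to etcd) is outside of it.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


class PollingWatcher:
    """Polling interface: the service itself has to track changes.

    `fetch` is a hypothetical callable returning the current endpoint
    value from the key-value store; `on_change` is called with the new
    value whenever it differs from the last one seen.
    """

    def __init__(self, fetch: Callable[[], str],
                 on_change: Callable[[str], None]) -> None:
        self._fetch = fetch
        self._on_change = on_change
        self._last: Optional[str] = None

    def poll_once(self) -> bool:
        """Run one polling step; return True if the value changed."""
        current = self._fetch()
        if current == self._last:
            return False
        self._last = current
        self._on_change(current)
        return True
```

With DNS, the service only ever calls something like `resolve_via_dns`. With watches or polling, every service would need the equivalent of `PollingWatcher` plus a scheduling loop, reconnect logic for its clients, and error handling for the store being unavailable, which is where the additional work would come from.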

The integration of etcd information in config management / deployment systems is again something releng and ops prefer to own.

Overall, my feeling is that so far there is not very much substance that actually falls into services ownership. There are also questions about how this all fits with the move to containers and Kubernetes.

That said, I definitely agree that we need to continue work on fail-over support in EventBus & other services. If we can more clearly identify deliverables that we can actually achieve as a team, then I could see it work as a goal.

We discussed this in today's team meeting and decided to reduce the number of goals to two, focusing for one more quarter on

  1. API build-out & service infrastructure, and
  2. change propagation build-out, including multi-DC EventBus.

Updated summary per team meeting. Also created an on-wiki variant at https://www.mediawiki.org/wiki/Wikimedia_Services/2015-16_Q4_Goals.

mobrovac claimed this task.