AQS 2.0 Device analytics Routing production traffic
Closed, Resolved (Public)

Description

This task has two major components:

  • Deploying the latest image to production
  • Shifting traffic to the newly deployed service

Event Timeline

Hi @hnowlan, as discussed over Slack, the device-analytics Gerrit repo has the latest working codebase. Please refer to https://gerrit.wikimedia.org/r/plugins/gitiles/generated-data-platform/aqs/device-analytics/+/7b64477fa85782c2e796598f3c601d2b0e139fd0. This needs to be deployed to production.
Also, please update this ticket on how we are going to bleed traffic over to this service once that has been discussed with the traffic team.

Latest image has been deployed to production.

Some notes on how we might do the changeover:

  • We can match on the URL in Varnish and use a random number to route a percentage of requests over to the new service (basically checking whether random(100) < PERCENTAGE_LIMIT).
  • We will rewrite the request in Varnish so that it queries the device-analytics.discovery.wmnet service.
  • We will need to be careful about how we split this traffic, as we're introducing a service that could have some edge behaviours we haven't caught yet and might pollute the cache. That means we'll need to split the cache, so that responses from the new service are not served to requests routed to the existing one.
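
To make the split concrete, here is a rough Python sketch of the routing decision described in the notes above. It is not Varnish configuration and not what was deployed. `PERCENTAGE_LIMIT` comes from the notes, but its value, the `OLD_BACKEND` name, the URL prefix and both function names are assumptions made up for illustration. Only device-analytics.discovery.wmnet is taken from this task.

```python
import random

# Assumed value: share (0-100) of matching requests sent to the new service.
PERCENTAGE_LIMIT = 10

# The new service name is from this task; the old backend name is a placeholder.
NEW_BACKEND = "device-analytics.discovery.wmnet"
OLD_BACKEND = "existing-aqs-backend"

# Assumed URL prefix used to decide whether a request is a candidate at all.
DEVICE_ANALYTICS_PREFIX = "/api/rest_v1/metrics/unique-devices/"


def choose_backend(path, percentage_limit=PERCENTAGE_LIMIT, rng=random):
    """Pick a backend for one request: match on the URL, then roll random(100)."""
    if not path.startswith(DEVICE_ANALYTICS_PREFIX):
        # Requests for other endpoints are never sent to the new service.
        return OLD_BACKEND
    if rng.randrange(100) < percentage_limit:
        return NEW_BACKEND
    return OLD_BACKEND


def cache_key(path, backend):
    """Include the chosen backend in the key so the two services never share
    cached objects (one possible reading of "split the cache")."""
    return f"{backend}|{path}"


if __name__ == "__main__":
    path = DEVICE_ANALYTICS_PREFIX + "en.wikipedia.org/all-sites/daily/20230101/20230201"
    backend = choose_backend(path)
    print(backend, cache_key(path, backend))
```

Keying the cache on the backend means a bad response from the new service can only be replayed to requests that were themselves routed to the new service, at the cost of a lower hit rate while both backends are live.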

We are still blocked on the decision of whether to put a gateway in front of the service or route requests to it directly.

@hnowlan let's put the gateway in front of the service. It makes sense for any traffic management we may want to do in the future, and for consistency with future services.

Change 930214 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: add device-analytics service

https://gerrit.wikimedia.org/r/930214

Change 930216 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/puppet@production] trafficserver: add route for device-analytics service

https://gerrit.wikimedia.org/r/930216

Change 930214 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: add device-analytics service

https://gerrit.wikimedia.org/r/930214

Change 935457 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: add native AQS1-style routes for AQS services

https://gerrit.wikimedia.org/r/935457

Change 935457 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: add native AQS1-style routes for AQS services

https://gerrit.wikimedia.org/r/935457

Change 936765 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: emit no-cache unless otherwise asked

https://gerrit.wikimedia.org/r/936765

Change 937061 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/puppet@production] cache: set api.wikimedia.org to normal caching

https://gerrit.wikimedia.org/r/937061

Change 936765 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: emit no-cache unless otherwise asked

https://gerrit.wikimedia.org/r/936765

Change 937061 merged by Hnowlan:

[operations/puppet@production] cache: set api.wikimedia.org to normal caching

https://gerrit.wikimedia.org/r/937061

Change 930216 abandoned by Hnowlan:

[operations/puppet@production] trafficserver: add route for device-analytics service

Reason:

Already done in another CR using the routing script.

https://gerrit.wikimedia.org/r/930216