Expose cache host that served the response via Server Timing and collect it with navtiming daemon
Closed, Resolved, Public

Description

Currently the server-timing header only exposes the cache status (hit, miss, pass, etc.).

To restore the ability to collect per-host data in the navtiming daemon, the client code needs a way to know which DC/host served its request.

This can be achieved by extending the contents of the server-timing header to also include the name of the cache host that served the request.

For example, instead of this:

server-timing: cache;desc="pass"

We could serve this:

server-timing: cache;desc="pass", host;desc="cp3056"
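To illustrate how a consumer might read the extended header, here is a minimal Python sketch that parses a Server-Timing value into per-metric parameters. The function name `parse_server_timing` and the regular expression are hypothetical and are not taken from any of the patches on this task; the sketch also assumes that quoted values contain no commas or semicolons, which holds for the examples above but not for every valid header.

```python
import re

# Hypothetical helper pattern: matches one parameter such as desc="pass" or dur=12.5.
_PARAM = re.compile(r'\s*([A-Za-z0-9_-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]*))')


def parse_server_timing(header_value):
    """Return {metric_name: {param_name: value}} for a Server-Timing value.

    Assumption: quoted values contain no ',' or ';', so naive splitting is safe.
    """
    metrics = {}
    for entry in header_value.split(","):
        parts = entry.strip().split(";")
        name = parts[0].strip()
        if not name:
            continue  # skip empty entries, e.g. from an empty header
        params = {}
        for raw in parts[1:]:
            match = _PARAM.match(raw)
            if match:
                quoted, bare = match.group(2), match.group(3)
                params[match.group(1).lower()] = quoted if quoted is not None else bare
        metrics[name] = params
    return metrics


timing = parse_server_timing('cache;desc="pass", host;desc="cp3056"')
cache_status = timing.get("cache", {}).get("desc")
cache_host = timing.get("host", {}).get("desc")
```

With the old header format, `timing.get("host", {}).get("desc")` simply yields `None`, so a consumer written this way tolerates responses that do not carry the host entry.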

We used to get this information for free in the old EventLogging pipeline, because event.gif was served from the same edge cache host as the rest of the requests, but that's not the case anymore with EventGate.

Event Timeline

Gilles triaged this task as Medium priority. Mar 18 2021, 3:48 PM

Change 673295 had a related patch set uploaded (by Gilles; owner: Gilles):
[operations/puppet@production] Add edge cache hostname to Server-Timing header

https://gerrit.wikimedia.org/r/673295

Change 674265 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] VCL: test Server-Timing response header

https://gerrit.wikimedia.org/r/674265

Change 674265 merged by Ema:
[operations/puppet@production] VCL: test Server-Timing response header

https://gerrit.wikimedia.org/r/674265

Change 673295 merged by Ema:
[operations/puppet@production] Add edge cache hostname to Server-Timing header

https://gerrit.wikimedia.org/r/673295

Change 675105 had a related patch set uploaded (by Gilles; author: Gilles):
[schemas/event/secondary@master] Add cacheHost field to NavigationTiming schema

https://gerrit.wikimedia.org/r/675105

Change 675107 had a related patch set uploaded (by Gilles; author: Gilles):
[mediawiki/extensions/NavigationTiming@master] Collect cache host from Server-Timing header

https://gerrit.wikimedia.org/r/675107

Gilles renamed this task from "Expose cache host that served the response via Server Timing" to "Expose cache host that served the response via Server Timing and collect it with navtiming daemon". Mar 31 2021, 9:17 AM

Change 676006 had a related patch set uploaded (by Gilles; author: Gilles):

[performance/navtiming@master] Collect host information from NavigationTiming schema

https://gerrit.wikimedia.org/r/676006

Change 675105 merged by jenkins-bot:

[schemas/event/secondary@master] Add cacheHost field to NavigationTiming schema

https://gerrit.wikimedia.org/r/675105

Change 675107 merged by jenkins-bot:

[mediawiki/extensions/NavigationTiming@master] Collect cache host from Server-Timing header

https://gerrit.wikimedia.org/r/675107

Change 676006 merged by jenkins-bot:

[performance/navtiming@master] Collect host information from NavigationTiming schema

https://gerrit.wikimedia.org/r/676006

The data is confirmed to make it to the NavigationTiming schema for browsers that support Server-Timing:

0: jdbc:hive2://analytics-hive.eqiad.wmnet:10> SELECT event.cacheHost FROM event.navigationtiming WHERE year = 2021 AND month = 4 AND day = 26 LIMIT 10;
going to print operations logs
printed operations logs
Getting log thread is interrupted, since query is done!
cachehost
NULL
cp2029
cp4029
cp1083
cp4031
cp2041
cp2037
cp1079
cp3056
cp3062
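As a rough sketch of how values like these could be aggregated per DC as well as per host, the Python snippet below tallies the sample rows by the first digit of the host number. It assumes, without confirmation from this task, that this digit identifies the site; the function name `count_by_site_digit` and the `"unknown"` bucket are hypothetical, and this is not the navtiming daemon's actual logic.

```python
import re
from collections import Counter

# Assumption: hosts look like "cp" + digits, and the first digit identifies the site.
HOST_RE = re.compile(r"^cp(\d)\d+$")


def count_by_site_digit(hosts):
    """Count hosts per leading digit; NULL or unrecognized values go to 'unknown'."""
    counts = Counter()
    for host in hosts:
        match = HOST_RE.match(host) if host is not None else None
        counts[match.group(1) if match else "unknown"] += 1
    return counts


# The ten rows returned by the query above (NULL represented as None).
sample = [None, "cp2029", "cp4029", "cp1083", "cp4031",
          "cp2041", "cp2037", "cp1079", "cp3056", "cp3062"]
counts = count_by_site_digit(sample)
```

The NULL row lands in the `"unknown"` bucket, which matches the note above that only browsers supporting Server-Timing report a cache host.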