Consider collecting more timestamp milestones from ATS-TLS
Open, Medium, Public

Description

Currently this time delta is being collected by atskafka via the analytics named pipe:

TS_MILESTONE_UA_BEGIN_WRITE - TS_MILESTONE_SM_START

TS_MILESTONE_UA_BEGIN_WRITE is emitted by ATS-TLS just before it writes the data to the client socket. Beyond that point, different layers of buffering can happen that may slow down the actual delivery of that data to the client.

I think it would be interesting to collect the remaining communication time between ATS-TLS and the client, using either of these:

TS_MILESTONE_UA_CLOSE - TS_MILESTONE_SM_START
TS_MILESTONE_UA_CLOSE - TS_MILESTONE_UA_BEGIN_WRITE

This would give us telemetry on how long it takes to ship the data to the client and have it acknowledged.
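
As a rough illustration of the two candidate deltas, here is a minimal Python sketch. It assumes the milestones are available as a mapping from milestone name to a timestamp in nanoseconds, and it treats a missing or zero value as "milestone never reached". That input format, the sample values and the helper name `milestone_delta_ms` are all hypothetical; this is not how atskafka receives or emits the data.

```python
from typing import Mapping, Optional


def milestone_delta_ms(
    milestones: Mapping[str, int], end: str, start: str
) -> Optional[float]:
    """Return (end - start) in milliseconds, or None if either milestone is unset.

    Assumption: timestamps are in nanoseconds and 0 / missing means "unset".
    """
    t_end = milestones.get(end, 0)
    t_start = milestones.get(start, 0)
    if t_end <= 0 or t_start <= 0:
        return None
    return (t_end - t_start) / 1_000_000


# Hypothetical sample values, in nanoseconds.
sample = {
    "TS_MILESTONE_SM_START": 1_000_000_000,
    "TS_MILESTONE_UA_BEGIN_WRITE": 1_012_000_000,
    "TS_MILESTONE_UA_CLOSE": 1_047_000_000,
}

# Whole transaction as seen by ATS-TLS, up to the client connection closing.
total = milestone_delta_ms(sample, "TS_MILESTONE_UA_CLOSE", "TS_MILESTONE_SM_START")

# Only the part that happens after ATS-TLS starts writing to the client socket.
after_write = milestone_delta_ms(
    sample, "TS_MILESTONE_UA_CLOSE", "TS_MILESTONE_UA_BEGIN_WRITE"
)
```

The second delta isolates the time spent after the first write, while the first one is directly comparable to the delta that is already collected, since both start at TS_MILESTONE_SM_START.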

Likewise, we could collect the following to sanity-check that Varnish takes as long as it thinks it does to deliver data to ATS-TLS:

TS_MILESTONE_SERVER_CLOSE - TS_MILESTONE_SERVER_CONNECT

Both of these would give us a more complete picture of where time is spent.
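
The Varnish sanity check could look roughly like the following Python sketch. It assumes we have, for the same request, the two server milestones in nanoseconds and a duration reported by Varnish in milliseconds. The function name `backend_time_mismatch`, the 5 ms tolerance and the sample values are hypothetical and only illustrate the comparison.

```python
def backend_time_mismatch(
    server_connect_ns: int,
    server_close_ns: int,
    varnish_reported_ms: float,
    tolerance_ms: float = 5.0,  # arbitrary threshold, chosen for illustration
) -> bool:
    """Return True when the ATS-TLS view of backend time disagrees with Varnish's.

    The ATS-TLS view is TS_MILESTONE_SERVER_CLOSE - TS_MILESTONE_SERVER_CONNECT,
    converted from nanoseconds (assumed unit) to milliseconds.
    """
    ats_backend_ms = (server_close_ns - server_connect_ns) / 1_000_000
    return abs(ats_backend_ms - varnish_reported_ms) > tolerance_ms


# Hypothetical request: ATS-TLS measured 30 ms between connect and close.
connect_ns = 2_000_000_000
close_ns = 2_030_000_000

suspicious = backend_time_mismatch(connect_ns, close_ns, varnish_reported_ms=12.5)
consistent = backend_time_mismatch(connect_ns, close_ns, varnish_reported_ms=28.0)
```

In practice a comparison like this would more likely be done on aggregated metrics than per request, but the idea is the same: flag cases where the two layers disagree by more than some tolerance.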

Event Timeline

Gilles created this task. Oct 19 2020, 7:52 AM
ema moved this task from Triage to Caching on the Traffic board. Oct 19 2020, 2:32 PM
ema triaged this task as Medium priority. Oct 20 2020, 9:26 AM

Change 635276 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: add metric trafficserver_tls_client_total_time

https://gerrit.wikimedia.org/r/635276

Change 635276 merged by Ema:
[operations/puppet@production] ATS: add metric trafficserver_tls_client_total_time

https://gerrit.wikimedia.org/r/635276