Send X-Analytics information from Varnish to Hadoop with VCL_Log
Closed, Resolved · Public

Description

Currently X-Analytics data is passed from Varnish to Hadoop via HTTP headers. Those headers are then passed on to clients, so end users receive data that is of no use to them.

Instead of relying on headers, this data could be passed by Varnish via VCL_Log, which lets it pass data to the backend without exposing headers to the client.

This would reduce our outbound bandwidth and the amount of useless data in the first TCP packet the client receives, leaving more room for actual content actionable by the web browser.

Details

Related Gerrit Patches:

Event Timeline

mforns created this task.Jun 6 2018, 3:01 PM
Restricted Application added a subscriber: Aklapper.Jun 6 2018, 3:01 PM

See the investigation results at T188807 and the recently updated documentation at https://wikitech.wikimedia.org/wiki/X-Analytics .

elukey added a subscriber: elukey.Jun 6 2018, 4:17 PM
mforns assigned this task to Nuria.Jun 6 2018, 4:17 PM
mforns added a project: Analytics-Kanban.
Nuria added a comment.Jun 6 2018, 5:06 PM

@Krinkle: see the documentation and the ticket linked from wikitech. While the sheer majority of requests are indeed under https, there are some that are not. I think a simplified way to put it is that when we serve content we do so under https, but it might very well be that the initial request (met with a 301 or other) was under http. Hopefully this makes sense. Closing the ticket, but reopen it if you feel the explanation is not sufficient.

Nuria closed this task as Declined.Jun 6 2018, 5:06 PM
Krinkle reopened this task as Open.EditedJun 6 2018, 5:54 PM

I'll re-open this for a slightly different purpose, which is to find a way to avoid sending X-Analytics as data to the end user from Varnish. Instead, we can explore other options to communicate between Varnish VCL and the consumer of this field (which is varnishkafka > webrequests table in Hive/Hadoop).

That was actually the reason I pinged @mforns earlier today when this task was filed.

@ema Could we use std.log (VCL_Log) to report X-Analytics data and stop the header from reaching the final user?

Krinkle renamed this task from Study whether we can default to https for all webrequests and remove https=1 header from x-Analytics to Evaluate alternate means to send X-Analytics information from Varnish to Hadoop..Jun 6 2018, 5:54 PM
fdans triaged this task as Low priority.Jun 11 2018, 4:07 PM
fdans moved this task from Incoming to Operational Excellence on the Analytics board.
Vvjjkkii renamed this task from Evaluate alternate means to send X-Analytics information from Varnish to Hadoop. to djbaaaaaaa.Jul 1 2018, 1:05 AM
Vvjjkkii removed Nuria as the assignee of this task.
Vvjjkkii raised the priority of this task from Low to High.
Vvjjkkii updated the task description.
Vvjjkkii edited subscribers, added: Nuria; removed: Aklapper.
Community_Tech_bot renamed this task from djbaaaaaaa to Evaluate alternate means to send X-Analytics information from Varnish to Hadoop..Jul 1 2018, 6:23 AM
Community_Tech_bot assigned this task to Nuria.
Community_Tech_bot updated the task description.
Community_Tech_bot edited subscribers, added: Aklapper; removed: Nuria.
Tbayer lowered the priority of this task from High to Low.Jul 1 2018, 2:39 PM
Nuria raised the priority of this task from Low to Needs Triage.Sep 26 2018, 7:24 PM
Nuria moved this task from Operational Excellence to Backlog (Later) on the Analytics board.
Krinkle triaged this task as Medium priority.Sep 15 2019, 2:05 AM
Krinkle moved this task from Watching to Perf recommendation on the Performance-Team (Radar) board.
Krinkle updated the task description.Dec 9 2019, 9:03 PM
Nuria added a comment.Dec 9 2019, 10:12 PM

I think we need more info here of why is this important and what you would like to see.

Gilles renamed this task from Evaluate alternate means to send X-Analytics information from Varnish to Hadoop. to Send X-Analytics information from Varnish to Hadoop with VCL_Log.Dec 10 2019, 2:12 PM
Gilles updated the task description.
Gilles added a subscriber: Gilles.

@ema what's the current state or plan for this in ATS?

I think we need more info here of why is this important and what you would like to see.

The objective is to not send X-Analytics in HTTP response headers to end-users. This seems mostly wasted bandwidth.

If I understand correctly, this header is currently used for two things internally:

  1. To transport internal state information about the request during VCL execution to varnishlog, where it can be picked up by Hadoop etc.
  2. To transport internal state information from MediaWiki to the VCL execution.

For case 1, I believe there are other means to transport this information within Varnish that don't require adding it as a public property to resp.headers, for example using std.log (VCL_Log) in Varnish. The previous comments and the Traffic team have more detail on this.

For case 2, headers are unavoidable, but that's only between the app server and the Varnish frontend. Once they have been read by the Analytics VCL code and merged into VCL_Log, the header can be unset.
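To make the two cases concrete, here is a minimal sketch in Python (not VCL, and not the actual Varnish or Analytics code) of the idea: read X-Analytics from the app server's response, route its fields to the logging side, and drop the header before the response goes to the client. The function names are hypothetical, and the semicolon-separated key=value format is an assumption; only `https=1` appears in this task, and `example_key` is made up for illustration.

```python
from typing import Dict, Tuple


def parse_x_analytics(value: str) -> Dict[str, str]:
    """Parse 'k1=v1;k2=v2' into a dict (assumed format), skipping malformed pairs."""
    pairs: Dict[str, str] = {}
    for item in value.split(";"):
        key, sep, val = item.strip().partition("=")
        if key and sep:
            pairs[key] = val
    return pairs


def split_for_client(
    backend_headers: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers to send to the client, analytics fields to log).

    Stands in for "merge into VCL_Log, then unset the header".
    """
    analytics: Dict[str, str] = {}
    client_headers: Dict[str, str] = {}
    for name, value in backend_headers.items():
        if name.lower() == "x-analytics":
            # Case 2: the header arrives from the app server and is consumed here.
            analytics.update(parse_x_analytics(value))
        else:
            client_headers[name] = value
    return client_headers, analytics


if __name__ == "__main__":
    # Hypothetical response headers from the app server.
    headers = {
        "Content-Type": "text/html",
        "X-Analytics": "https=1;example_key=example_value",
    }
    client, logged = split_for_client(headers)
    print("to client:", client)
    print("to log:", logged)
```

In this sketch the client never sees X-Analytics, while the logging side still gets every field, which is the separation being asked for in case 1.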

In the immediate future there is not yet a schedule to remove use of Varnish frontend (nginx-tls becomes ats-frontend, and varnish-backend becomes ats-backend. But Varnish frontend for caching and web request logging remains in the middle as today, for now).

But Varnish frontend for caching and web request logging remains in the middle as today, for now).

I think next quarter we are starting to migrate the Varnish frontend to ATS? In which case it seems a good time to take a second look at this workflow. cc Traffic and @ema so they can clarify whether the Varnish frontend migration is happening come Q3.
ema added a comment.Dec 19 2019, 10:12 AM

@ema what's the current state or plan for this in ATS?

In ATS there is no direct equivalent of VCL_Log + reading things out of VSM, but we can investigate alternative routes to achieve the same goal in Q3.

But Varnish frontend for caching and web request logging remains in the middle as today, for now).

I think next quarter we are starting to migrate the Varnish frontend to ATS? In which case it seems a good time to take a second look at this workflow. cc Traffic and @ema so they can clarify whether the Varnish frontend migration is happening come Q3.

The plan for Q3 is to (begin? it's a non-negligible amount of work) the migration of all analytics outputs from varnish-fe to ats-tls.

Since the migration work is starting in Q3, it sounds like this task would be obsolete at some point in 2020, right? I guess as part of the migration it would be nice to make it a requirement to get rid of those headers being sent to end users, granted that ATS is capable of doing that.

Change 559711 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: remove X-Analytics from responses sent to users

https://gerrit.wikimedia.org/r/559711

ema added a comment.Dec 20 2019, 9:48 AM

Since the migration work is starting in Q3, it sounds like this task would be obsolete at some point in 2020, right? I guess as part of the migration it would be nice to make it a requirement to get rid of those headers being sent to end users, granted that ATS is capable of doing that.

I propose we stop sending the header to clients early in Q3, removing it at the ats-tls layer: https://gerrit.wikimedia.org/r/559711. The ats-tls analytics implementation will have to work in spite of this, obviously, making it something of an implicit requirement.

Restricted Application added a project: Operations.Dec 20 2019, 9:50 AM
ema moved this task from Triage to Caching on the Traffic board.Dec 20 2019, 12:33 PM

Change 559711 merged by Ema:
[operations/puppet@production] ATS: remove X-Analytics from responses sent to users

https://gerrit.wikimedia.org/r/559711

Mentioned in SAL (#wikimedia-operations) [2020-01-15T15:37:51Z] <vgutierrez> rolling restart of ats-tls instances - T196558 T242778

Milimetric moved this task from Backlog (Later) to Radar on the Analytics board.Mon, Mar 9, 4:48 PM
Krinkle closed this task as Resolved.Mon, Mar 9, 6:16 PM