Add profiling for Varnish and VCL
Open, Medium, Public

Description

Per the parent task (T147101), figure out what it would take to capture stacktrace profiles from Varnish code (preserving symbols, especially those that come from our own VCL code).

Producing flame graphs from these periodically (e.g. hourly or daily) would help operations and other VCL contributors understand the performance impact of their changes, identify where performance has regressed, and find where most time is spent and optimisation would pay off most.
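As a rough illustration of the flame graph step, the sketch below collapses captured stack samples into the "folded" text format (frames joined by semicolons, followed by a sample count) that common flame graph tooling consumes. The frame names and the `fold_stacks` helper are hypothetical and chosen only for illustration; they are not taken from Varnish or from any existing tooling of ours.

```python
from collections import Counter


def fold_stacks(samples):
    """Collapse stack samples into folded lines: 'outer;...;inner count'.

    Each sample is a list of frame names, outermost frame first.
    """
    counts = Counter(";".join(frames) for frames in samples)
    # Sort so the output is stable between runs.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]


# Hypothetical samples; real ones would come from whatever profiler is chosen.
samples = [
    ["main", "handle_request", "vcl_recv"],
    ["main", "handle_request", "vcl_recv"],
    ["main", "handle_request", "vcl_hash"],
]

for line in fold_stacks(samples):
    print(line)
```

Because identical stacks are merged into a single counted line, an hourly or daily aggregate stays small regardless of how many samples were taken.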

Assuming that the capturing of such profiles impacts performance itself, we'll need to limit it somehow. Some ideas to consider:

  • Limit instrumentation to only one (or other subset) of the Varnish frontends and backends.
  • Sample within an instance (if possible), for example by instrumenting only one in every N requests, or only for limited periods (such as a run-time toggle that enables it for a couple of seconds every minute).
  • Perform instrumentation in a way that doesn't hinder execution: a separate observer thread or process takes periodic snapshots, passively sampling only the stack traces present at those moments.

This last approach is what we do with Xenon/HHVM in production.
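The two in-instance limiting ideas (one-in-many requests, and a short window every minute) could be expressed as simple gates, sketched below. This is a minimal sketch under stated assumptions: the function names, the 60-second period, the 2-second window and the one-in-1000 ratio are all hypothetical placeholders, and nothing here reflects an existing Varnish mechanism.

```python
def in_sampling_window(now_s, period_s=60.0, window_s=2.0):
    """Return True during the first `window_s` seconds of every `period_s`.

    `now_s` is a timestamp in seconds (e.g. from time.time()).
    The defaults are assumed values, not measured recommendations.
    """
    return (now_s % period_s) < window_s


def sample_request(request_id, one_in=1000):
    """Return True for one in every `one_in` requests.

    Assumes requests carry a monotonically increasing integer id.
    """
    return request_id % one_in == 0


# A profiler hook would only capture a stack when a gate allows it.
print(in_sampling_window(61.0))   # inside the window of the second minute
print(in_sampling_window(30.0))   # outside the window
print(sample_request(2000))       # selected
print(sample_request(2001))       # skipped
```

Either gate bounds the overhead to a known fraction of time or traffic, which makes the cost of profiling predictable before it is enabled more widely.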

Event Timeline

Maybe something to discuss with Traffic and possibly collaborate on in a future quarter.

BBlack added a subscriber: BBlack.

The swap of Traffic for Traffic-Icebox in this ticket's set of tags was based on a bulk action for all such tickets that haven't been updated in 6 months or more. This does not imply any human judgement about the validity or importance of the task, and is simply the first step in a larger task cleanup effort. Further manual triage and/or requests for updates will happen this month for all such tickets. For more detail, have a look at the extended explanation on the main page of Traffic-Icebox. Thank you!