Page MenuHomePhabricator

Log the X-Cache header in the webrequest logs {hawk}
Closed, ResolvedPublic

Description

For a random non-operational log analysis that @Ironholds and I needed to do, we wanted to query for requests that were cache misses in (all of) esams and needed to be requested internally back from the eqiad origins. The "hit/miss" field by itself doesn't say anything for this, as these are Varnish *frontend* misses, which in turn can be either hits or misses in the esams backends.

Furthermore, as we scale up to another primary DC (codfw) and more caching PoPs, increasing the number of paths a request can take to be ultimately be served (in both Varnishes & appservers), it might be useful to record the whole path that a response took within our infrastructure, as recorded in the X-Cache header. This would be especially useful for erroneous requests (5xx etc.) but generally useful even for the generic webrequests and non-operational queries, such as the one we initially tried to run for this analysis.

Event Timeline

faidon raised the priority of this task from to Needs Triage.
faidon updated the task description. (Show Details)
faidon added subscribers: faidon, Ironholds, BBlack.
Ottomata triaged this task as Medium priority.
Ottomata set Security to None.
Ottomata added a subscriber: Ironholds.

Sounds fun! I wonder if there are any other fields we want to add. If I do this, it would be easier to do them all at the same time.

Oh god we're using the CLF? That thing haunts my nightmares.

More seriously: I think I'm fine with what we've got, absent X-Cache :)

And these:

HTTP/1.1 200 OK
Server: Apache
X-Content-Type-Options: nosniff
X-Analytics: page_id=1093068;ns=0
Content-language: en
X-UA-Compatible: IE=Edge
Vary: Accept-Encoding,Cookie
X-Powered-By: HHVM/3.3.1
Last-Modified: Wed, 11 Mar 2015 15:36:04 GMT
Content-Type: text/html; charset=UTF-8
X-Varnish: 1364737053 1358446976, 3209694142 2274307354
Via: 1.1 varnish, 1.1 varnish
Date: Fri, 20 Mar 2015 18:18:26 GMT
Age: 787338
Connection: keep-alive
X-Cache: cp1054 hit (9), cp1065 frontend hit (574)
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
Set-Cookie: GeoIP=US:Brooklyn:40.6944:-73.9906:v4; Path=/; Domain=.wikipedia.org

I just what happens when adding columns to Hive tables backed by JSON Serde. It is weird:

Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hcatalog.data.HCatRecord

However, I was able to create a new table with a new column, and add partitions to it that pointed to JSON data that did not have that field, and Hive dealt with this fine. NULL was returned in place of the absent field. Even select * worked.

So, we can easily do this by:

  1. use wmf_raw; drop table webrequest;
  2. create table webrequest witih x_cache (and other?) columns
  3. Manually re-add all existing partitions.
  4. Test that everything works fine.
  5. Modify varnishkafka format configs and restart varnishkafka (via puppet).

We can then follow the same steps we did for adding a column to the refined webrequest table, and alter the refinement oozie job accordingly.

Should be pretty easy.

Cool, just did steps 1-3! Will do the varnishkafka change next.

Change 198809 had a related patch set uploaded (by Ottomata):
Log X-Cache header as new field in webrequest logs via varnishkafka

https://gerrit.wikimedia.org/r/198809

Change 198809 merged by Ottomata:
Log X-Cache header as new field in webrequest logs via varnishkafka

https://gerrit.wikimedia.org/r/198809

varnishkafka format change deployed. I'll wait a day (or a few hours) to verify that x_cache data is present in new raw webrequest records.

Joseph, once that looks good, we can add this field to the refined table and then change the oozie jobs to use all 3 new fields we are adding.

kevinator renamed this task from Log the X-Cache header in the webrequest logs to Log the X-Cache header in the webrequest logs {hawk}.Mar 31 2015, 4:34 AM