
[Spike] Is Varnish caching ORES responses?
Open, Normal, Public

Description

We don't want Varnish front-end caching, because Varnish currently has no way to purge scores when the ML models are regenerated. However, I see signs that our responses might be getting cached in Varnish.

We should be including the following headers with our responses:
https://github.com/wiki-ai/ores/blob/master/ores/wsgi/util.py#L93

However, when I hit localhost with cURL from ores-web-04, the headers don't appear:

curl -D headers.txt http://localhost:8080/v2/scores/enwiki/damaging/745065890/

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 328
Access-Control-Allow-Origin: *

{
  "scores": {
    "enwiki": {
      "damaging": {
        "scores": {
          "745065890": {
            "prediction": false,
            "probability": {
              "false": 0.9712963561026546,
              "true": 0.028703643897345314
            }
          }
        },
        "version": "0.1.1"
      }
    }
  }
}

The old model version is also extremely suspicious. Maybe I'm on an older web box?

Here's the VCL configuration which should match on our Cache-Control header and prevent caching:
https://github.com/wikimedia/operations-puppet/blob/production/modules/varnish/templates/vcl/wikimedia-common.inc.vcl.erb#L331
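Since the VCL rule keys on Cache-Control, a quick sanity check is whether a response carries that header at all. Here is a minimal sketch that parses a raw header block, such as the one curl dumped above, and reports whether Cache-Control is present. The helper names `parse_headers` and `has_cache_control` are made up for illustration and are not part of ORES.

```python
def parse_headers(raw):
    """Parse a raw HTTP header block into a dict keyed by lower-cased name."""
    headers = {}
    for line in raw.strip().splitlines():
        if ":" not in line:
            continue  # skip the status line, e.g. "HTTP/1.1 200 OK"
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def has_cache_control(raw):
    """Return True if the header block includes a Cache-Control header."""
    return "cache-control" in parse_headers(raw)


# Headers from the localhost response on ores-web-04, copied from above.
LOCAL_RESPONSE = """\
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 328
Access-Control-Allow-Origin: *
"""

print(has_cache_control(LOCAL_RESPONSE))  # False
```

This only tests for the header's presence. It makes no assumption about which value the VCL rule expects.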

Production responses are scaring me. They include headers like these:

https://ores.wikimedia.org/v2/scores/enwiki/damaging/745065890/

Accept-Ranges: bytes
Age: 2582
Content-Encoding: gzip
Content-Length: 176
Content-Type: application/json
Date: Wed, 26 Oct 2016 17:35:42 GMT
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Vary: Accept-Encoding
Via: 1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4
X-Cache: cp1061 miss, cp2012 miss, cp4002 miss, cp4001 hit/2
X-Firefox-Spdy: h2
access-control-allow-origin: *
x-analytics: WMF-Last-Access=26-Oct-2016;https=1
x-cache-status: hit
x-client-ip: <redacted>
x-varnish: 26581384, 24460985, 9979155, 819430 469613

I don't like the cp4001 hit, especially as Ganglia reports that the machine is down. A hard refresh changes nothing.
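To see at a glance which cache layer answered, the X-Cache value can be split into per-host entries. This is a rough sketch: the "host status" format is inferred only from the header pasted above, and `parse_x_cache` and `cache_hits` are hypothetical helper names.

```python
def parse_x_cache(value):
    """Split an X-Cache header value into (host, status) pairs.

    Assumes comma-separated entries of the form "<host> <status>",
    as seen in the header above.
    """
    layers = []
    for entry in value.split(","):
        parts = entry.split()
        if len(parts) >= 2:
            layers.append((parts[0], parts[1]))
    return layers


def cache_hits(value):
    """Return the hosts whose status starts with "hit" (e.g. "hit/2")."""
    return [host for host, status in parse_x_cache(value)
            if status.startswith("hit")]


X_CACHE = "cp1061 miss, cp2012 miss, cp4002 miss, cp4001 hit/2"
print(cache_hits(X_CACHE))  # ['cp4001']
```

A non-empty result, together with a non-zero Age header, suggests that the response came from a cache and not from the ORES backend.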

Also, the cached score doesn't exist on ores-redis-02 port 6380:

keys get ores:enwiki:damaging:745065890:*

to be continued...

Event Timeline

awight created this task. Oct 26 2016, 6:02 PM
Restricted Application added a subscriber: Aklapper. Oct 26 2016, 6:02 PM

I think I must be looking at staging boxes. Can someone help me access the production cluster?

Halfak moved this task from Untriaged to Research & analysis on the Scoring-platform-team board.
Halfak triaged this task as Normal priority.
Ladsgroup renamed this task from Spike: Is Varnish caching ORES responses? to [Spike] Is Varnish caching ORES responses?. Nov 10 2016, 3:28 PM
Ladsgroup added a project: Spike.
Tgr added a subscriber: Tgr. Feb 14 2017, 3:16 AM

$ curl -I https://ores.wikimedia.org/v2/scores/enwiki/damaging/745065890/ | grep X-Cache-Status
X-Cache-Status: pass

so it's not being cached anymore.

In any case, requests sent to localhost cannot be cached in Varnish (unless the local host is one of the Varnish boxes).

awight removed a subscriber: awight. Mar 21 2019, 4:00 PM