Generate ATS cache.config from software-agnostic data structures
Closed, Resolved · Public

Description

Whether an origin server response is considered cacheable is normally determined by the Cache-Control response header. However, both ATS and Varnish allow overriding it, so that certain responses are never cached regardless of what the origin says. ATS does this through a configuration file called cache.config, while Varnish uses a VCL action called pass.

In our environment, the ATS config file cache.config is generated using the profile::trafficserver::backend::caching_rules hiera setting, which looks like this:

profile::trafficserver::backend::caching_rules:
    - primary_destination: dest_host
      value: cxserver.discovery.wmnet
      action: never-cache
    - primary_destination: dest_host
      value: debmonitor.discovery.wmnet
      action: never-cache

The data structure gets rendered pretty much one-to-one as:

# /etc/trafficserver/cache.config
dest_host=cxserver.discovery.wmnet action=never-cache
dest_host=debmonitor.discovery.wmnet action=never-cache
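
The one-to-one rendering can be sketched as follows. This is a minimal Python illustration, not the Puppet template actually used; the function name render_cache_config is hypothetical, and the input is assumed to be the hiera list above already parsed into dictionaries.

```python
def render_cache_config(caching_rules):
    """Render one cache.config line per rule.

    Each rule becomes "<primary_destination>=<value> action=<action>".
    The function name and the dict-based input are assumptions made
    for illustration only.
    """
    lines = []
    for rule in caching_rules:
        lines.append(
            f"{rule['primary_destination']}={rule['value']} action={rule['action']}"
        )
    return "\n".join(lines) + "\n"


# The two entries from the hiera setting shown above.
caching_rules = [
    {"primary_destination": "dest_host",
     "value": "cxserver.discovery.wmnet",
     "action": "never-cache"},
    {"primary_destination": "dest_host",
     "value": "debmonitor.discovery.wmnet",
     "action": "never-cache"},
]

print(render_cache_config(caching_rules), end="")
```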

In the case of Varnish, the hiera settings are cache::req_handling and cache::alternate_domains, which among other things allow skipping the cache by specifying the caching: 'pass' attribute:

cache::req_handling:
  cxserver.wikimedia.org:
    caching: 'pass'
cache::alternate_domains:
  15.wikipedia.org:
    caching: 'normal'
  analytics.wikimedia.org:
    caching: 'normal'
  annual.wikimedia.org:
    caching: 'normal'
  blubberoid.wikimedia.org:
    caching: 'pass'
[...]
  config-master.wikimedia.org:
    caching: 'pass'

As Varnish does not provide an equivalent to Traffic Server's cache.config, we do the plumbing and dynamically generate a list of if/else VCL statements in the wm_recv_pass VCL subroutine, matching the incoming Host request headers to return pass statements as appropriate:

sub wm_recv_pass {

    if (req.http.host == "blubberoid.wikimedia.org") {
        return (pass);
    } elsif (req.http.host == "config-master.wikimedia.org") {
        return (pass);
    [...]
    }
}
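
The plumbing described above could look roughly like the following Python sketch. It is not the template used in production: the function name render_wm_recv_pass, the alphabetical ordering of hosts, and the merging of both hiera settings into a single dictionary are all assumptions made for illustration.

```python
def render_wm_recv_pass(host_settings):
    """Build an if/elsif chain returning pass for every host whose
    caching attribute is 'pass'.

    host_settings maps a Host header value to its attributes, e.g. the
    merged contents of cache::req_handling and cache::alternate_domains
    (merging them this way is an assumption).
    """
    pass_hosts = sorted(
        host for host, attrs in host_settings.items()
        if attrs.get("caching") == "pass"
    )
    out = ["sub wm_recv_pass {"]
    for index, host in enumerate(pass_hosts):
        # First branch opens with "if", later ones close the previous
        # branch and continue with "elsif".
        keyword = "if" if index == 0 else "} elsif"
        out.append(f'    {keyword} (req.http.host == "{host}") {{')
        out.append("        return (pass);")
    if pass_hosts:
        out.append("    }")
    out.append("}")
    return "\n".join(out)


host_settings = {
    "cxserver.wikimedia.org": {"caching": "pass"},
    "15.wikipedia.org": {"caching": "normal"},
    "blubberoid.wikimedia.org": {"caching": "pass"},
    "config-master.wikimedia.org": {"caching": "pass"},
}

print(render_wm_recv_pass(host_settings))
```

Hosts with caching: 'normal' produce no branch at all, so an input without any pass entries yields an empty subroutine.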

Having two different mechanisms to specify one policy is not optimal: engineers currently have to add an entry in cache::req_handling or cache::alternate_domains, keyed on the Host header, for the cache frontend layer, and another in profile::trafficserver::backend::caching_rules, keyed on the origin server hostname, for the cache backend. Given that at the ATS layer we know the Host -> Origin mapping:

profile::trafficserver::backend::mapping_rules:
    - type: map
      target: http://cxserver.wikimedia.org
      replacement: https://cxserver.discovery.wmnet:4002

it should be possible to get rid of profile::trafficserver::backend::caching_rules altogether and generate cache.config programmatically by iterating over cache::req_handling and cache::alternate_domains, looking up the matching origin hostnames in profile::trafficserver::backend::mapping_rules.
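
A minimal sketch of that idea in Python, assuming the hiera data has been parsed into dictionaries and lists. The actual change is a Puppet function (profile::trafficserver_caching_rules) whose implementation is not shown in this task; the function name below, the handling of hosts without a mapping rule, and the de-duplication of origins are assumptions.

```python
from urllib.parse import urlsplit


def caching_rules_from_hosts(host_settings, mapping_rules):
    """Derive never-cache rules from per-Host settings.

    host_settings: Host header value -> attributes (e.g. the entries of
                   cache::req_handling and cache::alternate_domains).
    mapping_rules: list of dicts with 'type', 'target', 'replacement'.
    """
    # Build the Host -> origin hostname lookup from the "map" rules.
    host_to_origin = {}
    for rule in mapping_rules:
        if rule.get("type") != "map":
            continue
        host = urlsplit(rule["target"]).hostname
        origin = urlsplit(rule["replacement"]).hostname
        host_to_origin[host] = origin

    rules = []
    seen = set()
    for host, attrs in host_settings.items():
        if attrs.get("caching") != "pass":
            continue
        origin = host_to_origin.get(host)
        # Assumption: hosts without a mapping rule are skipped, and an
        # origin shared by several hosts is emitted only once.
        if origin is None or origin in seen:
            continue
        seen.add(origin)
        rules.append({
            "primary_destination": "dest_host",
            "value": origin,
            "action": "never-cache",
        })
    return rules


host_settings = {
    "cxserver.wikimedia.org": {"caching": "pass"},
    "15.wikipedia.org": {"caching": "normal"},
}
mapping_rules = [
    {"type": "map",
     "target": "http://cxserver.wikimedia.org",
     "replacement": "https://cxserver.discovery.wmnet:4002"},
]

for rule in caching_rules_from_hosts(host_settings, mapping_rules):
    print(f"{rule['primary_destination']}={rule['value']} action={rule['action']}")
```

The port in the replacement URL is dropped because cache.config matches on the origin hostname only.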

Event Timeline

Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. Aug 5 2020, 9:42 AM
ema moved this task from Backlog to Caching on the Traffic board.

Change 618537 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: add function profile::trafficserver_caching_rules

https://gerrit.wikimedia.org/r/618537

On a text node, applying the change results in the following diff:

--- /etc/trafficserver/cache.config.orig
+++ /etc/trafficserver/cache.config
@@ -2,13 +2,16 @@
 # This file is managed by Puppet.
 
 dest_host=cxserver.discovery.wmnet action=never-cache
+dest_host=blubberoid.discovery.wmnet action=never-cache
+dest_host=puppetmaster1001.eqiad.wmnet action=never-cache
+dest_host=thorium.eqiad.wmnet action=never-cache
 dest_host=debmonitor.discovery.wmnet action=never-cache
+dest_host=grafana-labs.discovery.wmnet action=never-cache
+dest_host=grafana2001.codfw.wmnet action=never-cache
 dest_host=grafana1002.eqiad.wmnet action=never-cache
-dest_host=grafana-labs.discovery.wmnet action=never-cache
-dest_host=graphite-labs.discovery.wmnet action=never-cache
-dest_host=mendelevium.eqiad.wmnet action=never-cache
 dest_host=mwmaint.discovery.wmnet action=never-cache
-dest_host=puppetmaster1001.eqiad.wmnet action=never-cache
 dest_host=peopleweb.discovery.wmnet action=never-cache
-dest_host=thorium.eqiad.wmnet action=never-cache
-dest_host=webserver-misc-apps.discovery.wmnet action=never-cache
+dest_host=puppetboard.discovery.wmnet action=never-cache
+dest_host=thanos-sso.discovery.wmnet action=never-cache
+dest_host=ticket.discovery.wmnet action=never-cache
+dest_host=otrs1001.eqiad.wmnet action=never-cache

With the exception of origins such as grafana-labs.discovery.wmnet that just got reordered within the file, the following are added: blubberoid, grafana2001, puppetboard, thanos-sso, ticket, otrs1001. Indeed they all look like they should be there. Also, the following leftovers got removed: graphite-labs, mendelevium, webserver-misc-apps, confirming the need for this change!
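
The reading of the diff above can be double-checked mechanically. The following Python sketch (the helper name net_changes is hypothetical) treats any origin that appears on both a + and a - line as merely reordered, and reports the remaining ones as net additions and removals.

```python
DIFF_BODY = """\
 dest_host=cxserver.discovery.wmnet action=never-cache
+dest_host=blubberoid.discovery.wmnet action=never-cache
+dest_host=puppetmaster1001.eqiad.wmnet action=never-cache
+dest_host=thorium.eqiad.wmnet action=never-cache
 dest_host=debmonitor.discovery.wmnet action=never-cache
+dest_host=grafana-labs.discovery.wmnet action=never-cache
+dest_host=grafana2001.codfw.wmnet action=never-cache
 dest_host=grafana1002.eqiad.wmnet action=never-cache
-dest_host=grafana-labs.discovery.wmnet action=never-cache
-dest_host=graphite-labs.discovery.wmnet action=never-cache
-dest_host=mendelevium.eqiad.wmnet action=never-cache
 dest_host=mwmaint.discovery.wmnet action=never-cache
-dest_host=puppetmaster1001.eqiad.wmnet action=never-cache
 dest_host=peopleweb.discovery.wmnet action=never-cache
-dest_host=thorium.eqiad.wmnet action=never-cache
-dest_host=webserver-misc-apps.discovery.wmnet action=never-cache
+dest_host=puppetboard.discovery.wmnet action=never-cache
+dest_host=thanos-sso.discovery.wmnet action=never-cache
+dest_host=ticket.discovery.wmnet action=never-cache
+dest_host=otrs1001.eqiad.wmnet action=never-cache
"""


def net_changes(diff_text):
    """Return (added, removed, reordered) sets of origin hostnames."""
    plus, minus = set(), set()
    for line in diff_text.splitlines():
        # Skip file headers if a full unified diff is passed in.
        if line.startswith(("+++", "---")):
            continue
        if line[:1] not in ("+", "-"):
            continue
        # "+dest_host=<origin> action=..." -> "<origin>"
        origin = line[1:].split()[0].partition("=")[2]
        (plus if line[0] == "+" else minus).add(origin)
    reordered = plus & minus
    return plus - reordered, minus - reordered, reordered


added, removed, reordered = net_changes(DIFF_BODY)
print("added:    ", sorted(added))
print("removed:  ", sorted(removed))
print("reordered:", sorted(reordered))
```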

On an upload node, by contrast, the file is now empty, as expected, given that no site is configured to unconditionally pass on cache_upload.

Change 618537 merged by Ema:
[operations/puppet@production] ATS: add function profile::trafficserver_caching_rules

https://gerrit.wikimedia.org/r/618537

Change 618960 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: align req_handling/mapping_rules on traffic-cache-atstext

https://gerrit.wikimedia.org/r/618960

Change 618960 merged by Ema:
[operations/puppet@production] ATS: align req_handling/mapping_rules on traffic-cache-atstext

https://gerrit.wikimedia.org/r/618960

Done, profile::trafficserver::backend::caching_rules is now gone. cache.config is generated by parsing req_handling and alternate_domains. Closing.