
Implement TTL cap for ats-be
Closed, Resolved · Public

Description

We are currently not imposing any limit on the TTL for objects cached at the ats-be layer, accepting whatever the origin servers send. Our VCL does have such a measure in place, and we should add it to ats too.

Event Timeline

Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. Apr 7 2020, 3:45 PM
ema moved this task from Triage to Caching on the Traffic board.

I was under the impression that ATS had no config setting to impose a TTL cap, because the documentation for the "freshness" settings explicitly says that they are applied only when TTL heuristics are used, and we don't use them:

Establishes a guaranteed maximum lifetime boundary for freshness heuristics. When heuristics are used, and the proxy.config.http.cache.heuristic_lm_factor aging factor is applied, the final maximum age calculated will never be higher than the value in this variable.

However, looking at the code, the functions what_is_document_freshness and calculate_document_freshness_limit do not appear to depend on heuristics.

I therefore tried setting the value on cp3050:

$ sudo traffic_ctl config set proxy.config.http.cache.guaranteed_max_lifetime 86400

Indeed, tracing with systemtap shows that the limit is taken into account:

$ sudo stap -ve 'probe process("/usr/bin/traffic_server").statement("what_is_document_freshness@./proxy/http/HttpTransact.cc:7178") { printf("fresh_limit=%d current_age=%d\n", $fresh_limit, $current_age) } '
fresh_limit=300 current_age=95
fresh_limit=10800 current_age=5430
fresh_limit=86400 current_age=1605
fresh_limit=86400 current_age=56007
fresh_limit=86400 current_age=10832
[...]
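
Output like the above can be scanned mechanically for any fresh_limit above the configured cap. The following is a minimal sketch, assuming the probe output keeps the "fresh_limit=N current_age=M" format shown; the function name check_fresh_limit_cap is hypothetical and not part of ATS or systemtap.

```shell
#!/bin/sh
# Hypothetical helper: read probe output on stdin and report any line whose
# fresh_limit exceeds the cap passed as the first argument (in seconds).
# Assumes lines of the form "fresh_limit=N current_age=M"; anything else
# (such as "[...]") is ignored.
check_fresh_limit_cap() {
    cap="$1"
    awk -v cap="$cap" '
        /^fresh_limit=[0-9]+ / {
            # $1 is "fresh_limit=N"; split it to get the numeric value.
            split($1, kv, "=")
            if (kv[2] + 0 > cap + 0) {
                printf "over cap: %s\n", $0
                bad = 1
            }
        }
        # Exit non-zero if at least one line exceeded the cap.
        END { exit bad + 0 }
    '
}
```

The stap output could be piped into it, for instance `sudo stap ... | check_fresh_limit_cap 86400`; it prints offending lines and returns a non-zero status if any are found.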

After temporarily setting the limit very low (100s), no object with an Age greater than that was returned by cp3050:

$ varnishncsa -g request -n frontend -q 'BerespHeader:Age > 100 and BerespHeader:X-Cache-Int ~ "cp3050"'
# silence

After resetting the limit to the default value, the silence stops and matching requests show up again.

Bottom line: we can set proxy.config.http.cache.guaranteed_max_lifetime to 86400 and be done with this, after verifying with upstream that this is a feature and not a bug.
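
To illustrate what the cap means for individual objects, here is a minimal sketch of the behaviour the traces above suggest: the effective freshness limit is the smaller of the origin-provided TTL and the cap. The function name effective_ttl and its default of 86400 are assumptions for illustration; this is not the ATS implementation.

```shell
#!/bin/sh
# Hypothetical helper: print the effective TTL (seconds) for an
# origin-provided TTL under a maximum-lifetime cap.
# $1 = TTL sent by the origin, $2 = cap (defaults to 86400, i.e. 24h).
# Illustrates min(origin_ttl, cap) only; not taken from the ATS source.
effective_ttl() {
    origin_ttl="$1"
    cap="${2:-86400}"
    if [ "$origin_ttl" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$origin_ttl"
    fi
}
```

With the 24h cap, `effective_ttl 300` prints 300, while `effective_ttl 2592000` (30 days) prints 86400.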

Change 589317 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] ATS: cap ats-be TTL at 24h

https://gerrit.wikimedia.org/r/589317

Change 589317 merged by Ema:
[operations/puppet@production] ATS: cap ats-be TTL at 24h

https://gerrit.wikimedia.org/r/589317