Page MenuHomePhabricator

Remove must-revalidate cache-control header from mobile-sections-lead and mobile-sections-remaining responses
Open, NormalPublic1 Story Points

Description

The Android app, as a client of the Content Service, wants to cache content aggressively and correctly. This means specifying a great max-stale cache-control header in requests but responses with the no-cache or must-revalidate cache-control option are forbidden from being used unless a successful response from the service is received[0]. This ticket encompasses the work necessary to remove the must-revalidate header option from the mobile-sections-* endpoints.

Currently:

$ curl -Is https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Dog | grep cache-control:
cache-control: s-maxage=1209600, max-age=0, must-revalidate

Expected:

$ curl -Is https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Dog | grep cache-control:
cache-control: s-maxage=1209600, max-age=0

[0] https://tools.ietf.org/html/rfc7234#section-5.2.1

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 31 2017, 5:26 PM

The must-revalidate directive is there mainly to prevent vandalism, but also to ensure that the client always has the freshest response in cache. I believe this ticket is in the context of offline usage. While the RFC clearly states that the client must check with the server, it can also detect that it is offline and hence it has real-world obstacles from fulfilling the contract. Removing the directive altogether in order to accommodate such an edge case doesn't seem prudent to me, TBH.

While the RFC clearly states that the client must check with the server, it can also detect that it is offline and hence it has real-world obstacles from fulfilling the contract.

No, the RFC states that 504 MUST be returned in this case: https://tools.ietf.org/html/rfc7234#section-5.2.2.1

While the RFC clearly states that the client must check with the server, it can also detect that it is offline and hence it has real-world obstacles from fulfilling the contract.

No, the RFC states that 504 MUST be returned in this case: https://tools.ietf.org/html/rfc7234#section-5.2.2.1

That presumes the request has reached the cache in the first place, which is not true for offline mode, While cache here is rather open to interpretation and can include the client's cache, I still think that offline mode is such a special case that it doesn't warrant removing the directive.

Pchelolo added a comment.EditedFeb 1 2017, 1:19 AM

That presumes the request has reached the cache in the first place, which is not true for offline mode, While cache here is rather open to interpretation and can include the client's cache, I still think that offline mode is such a special case that it doesn't warrant removing the directive.

I'm not really sure how the client cache is different from any other cache

The RFC says: https://tools.ietf.org/html/rfc7234#section-4.2.4

A cache MUST NOT send stale responses unless it is disconnected (i.e., it cannot contact the origin server or otherwise find a forward path)

So basically for normal case (connected) must-revalidate and absence of it should have no difference at all.

Mholloway added a comment.EditedFeb 1 2017, 1:26 AM

A cache MUST NOT send stale responses unless it is disconnected (i.e., it cannot contact the origin server or otherwise find a forward path)

So basically for normal case (connected) must-revalidate and absence of it should have no difference at all.

IMO this is right, and we shouldn't need to mess with the headers coming from the server side.

(We will have to mess with our OkHttp client cache, which is implemented to comply with RFC 7234, to allow it to conditionally give us must-revalidate content when offline, but that's a separate matter.)

(Edit: for technical correctness)

Making this change doesn't seem to be necessary for the Android client or desirable in the general case. Anyone object to declining?

I'm afraid I can't champion this effort but I stand by the principal that clients should be able to respect the header directives issued by the server. As far as I know, the current implementation makes this impossible for offline-friendly applications.

From a practical perspective, I think the biggest question is how common clients behave these days when must-revalidate is omitted, and the client cache timeout expires. My memory on this is rather foggy, but I *think* in the dark ages behavior in that area was inconsistent, with early IE versions not re-validating even when they were online. If we can verify that all browsers we care about do the right thing (check as if must-revalidate was set when connected), then dropping must-revalidate in the headers would be harmless.