
Tune WDQS caching headers
Closed, Resolved, Public

Description

At the moment, WDQS exposes quite a few caching headers, some of which do not make sense. For example, ETags are generated by nginx and differ between nginx servers (nginx generates ETags based on the last modified time, which is not guaranteed to be consistent across servers).
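To illustrate the ETag problem, here is a minimal sketch in Python. It assumes the nginx default of building the ETag from the file's last modified time and its size, both in hexadecimal; the function name and the sample timestamps are made up for illustration.

```python
def nginx_style_etag(mtime_seconds: int, size_bytes: int) -> str:
    """Build an ETag the way nginx is assumed to by default:
    hex(last modified time) + "-" + hex(content length), in quotes."""
    return f'"{mtime_seconds:x}-{size_bytes:x}"'


# Same file content and size, deployed at slightly different times
# on two servers (hypothetical timestamps).
server_a = nginx_style_etag(1465000000, 48213)
server_b = nginx_style_etag(1465000042, 48213)

# The ETags differ although the content is identical, so a client
# revalidating against the other server gets a needless cache miss.
print(server_a, server_b, server_a == server_b)
```

Because the value depends on deployment time rather than content, a conditional request that lands on a different server will not match, which is why the ETag is of little use here.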

As we are now adding a hash of the file content to page resources (CSS, JS, ...), we can aggressively increase the caching time of those resources.
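As a rough sketch of why content-hashed resources are safe to cache for a long time: the file name changes whenever the content changes, so a cached copy can never be stale. The naming scheme below (an 8-character MD5 prefix inserted before the extension) and the function name are assumptions for illustration, not necessarily what the WDQS build produces.

```python
import hashlib
from pathlib import PurePosixPath


def revved_name(path: str, content: bytes, length: int = 8) -> str:
    """Return a file name that embeds a short hash of the content.

    Hypothetical scheme: name.<hash>.ext
    """
    digest = hashlib.md5(content).hexdigest()[:length]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = revved_name("css/style.css", b"body { color: black; }")
new = revved_name("css/style.css", b"body { color: navy; }")

# Different content yields a different URL, so the old URL can be
# cached indefinitely without ever serving outdated content.
print(old, new)
```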

Queries themselves are probably only cacheable for a fairly short time.

Some discussion on this has already happened on T133026. @BBlack, your input on this would be more than welcome!

Event Timeline

Yes, after the switch to the built GUI we can cache at least the hashed CSS/JS pretty much forever, since they never change. Non-hashed ones are probably fine with a lifetime of about 1 day; we change them about once a week. If we want different rules for different resources, it may require some nginx configuration, but it should be possible.

Not sure about ETags; I haven't studied the subject in depth.

Queries are currently cached for 5 minutes. I think we can keep it this way unless we find a reason to change it.

It seems that at the moment only the .js and .css files which are in the js / css directories are processed by filerev. Is there any reason not to process other files as well? It seems that we have .json files that could benefit from the same processing.

Change 293492 had a related patch set uploaded (by Gehel):
Don't publish etags for WDQS

https://gerrit.wikimedia.org/r/293492

We can only use filerev for files that are referenced from the HTML file, because the application itself is not aware of filerev and would otherwise not find its resources.

The json files are for i18n, so there will be no problem when they are outdated.

We should also be able to process at least:

  • vendor/jquery.uls/css/jquery.uls.css
  • logo.svg

They are unlikely to change frequently but still it would be nice...

At this point I'm not entirely sure which resources can be cached and which cannot. My guess:

  • main page (index.html) -> no cache (or very short)
  • everything under /css and /js -> long cache (7 days to start ?)
  • everything else -> short cache (5 minutes?)

Does this sound reasonable to you?

We should also be able to process at least:

  • vendor/jquery.uls/css/jquery.uls.css

This is fixed.

  • main page (index.html) -> no cache (or very short)
  • everything under /css and /js -> long cache (7 days to start ?)
  • everything else -> short cache (5 minutes?)

Does this sound reasonable to you?

Yes, @Smalyshev ?

I am not sure what the general strategy should be, but I guess we don't need aggressive caching at the moment, because we don't have a lot of load...

  • main page (index.html) -> no cache (or very short)

I would cache it for the same duration as the items below. It doesn't change that much, about once a week now. While re-fetching it is no big deal, I think we can still cache it for about 5 minutes.

  • everything under /css and /js -> long cache (7 days to start ?)
  • everything else -> short cache (5 minutes?)

Makes sense.
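The policy discussed above can be sketched as a small path-to-header mapping. This is only an illustration of the proposal in this thread (7 days for /css and /js, 5 minutes for everything else including the main page), not the configuration that was actually deployed; the function and constant names are hypothetical.

```python
FIVE_MINUTES = 5 * 60            # short cache: main page and everything else
SEVEN_DAYS = 7 * 24 * 60 * 60    # long cache: resources under /css and /js


def cache_control_for(path: str) -> str:
    """Return a Cache-Control value for a request path,
    following the policy proposed in this discussion."""
    if path.startswith("/css/") or path.startswith("/js/"):
        max_age = SEVEN_DAYS
    else:
        max_age = FIVE_MINUTES
    return f"max-age={max_age}"


for p in ("/index.html", "/css/style.css", "/js/wdqs.js", "/logo.svg"):
    print(p, "->", cache_control_for(p))
```

In a real deployment the same rules would more likely live in the web server configuration than in application code; the sketch only makes the intended lifetimes explicit.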

Change 293492 merged by Gehel:
Don't publish etags for WDQS

https://gerrit.wikimedia.org/r/293492

@Gehel is anything left to do in this ticket?

Change 306163 had a related patch set uploaded (by Gehel):
WDQS caching headers

https://gerrit.wikimedia.org/r/306163

@Smalyshev yes there is: adding some cache-control headers. Change submitted. Thanks for reminding me!

Gehel claimed this task.

New caching headers deployed. Checked with Chrome: the Cache-Control headers are showing up.