
Use Service Worker cache when available for ResourceLoader caching instead of LocalStorage
Open, LowestPublic

Event Timeline

Gilles raised the priority of this task from to Needs Triage.
Gilles updated the task description.
Gilles added subscribers: Gilles, Aklapper.
Krinkle triaged this task as Medium priority.Jun 10 2015, 10:36 PM
Krinkle set Security to None.
Krinkle moved this task from Inbox to Backlog on the MediaWiki-ResourceLoader board.

Change 219960 had a related patch set uploaded (by Gilles):
[WIP] Setup a ServiceWorker

https://gerrit.wikimedia.org/r/219960

I think I'm running into a Chromium bug/limitation with regard to self-signed certificates (I'm working on this on a vagrant VM). The ServiceWorker produces net::ERR_INSECURE_RESPONSE errors, but the initial load of the worker doesn't, and neither does accessing the URL directly (granted that the cert has been trusted OS-wide).

Wasn't a bug, just a confusing error message.

@Gilles Per your update last week about the overhead of ServiceWorker, perhaps try using the CacheStorage in the main window context (e.g. where we access localStorage currently).

While the main context has no fetch event to intercept, we don't need one since we already tunnel requests through mw.loader. Similar to what we'd have to do in the ServiceWorker thread, we'd build a load.php URL for each module and use that as the cache key. This should simplify things a bit, as we won't have to duplicate the encoding/decoding of the modules query parameter.
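
A minimal sketch of that idea, assuming one cache entry per module keyed by a single-module load.php URL. The helper names (`buildModuleUrl`, `getModuleSource`) and the cache name `rl-modules` are hypothetical, and the query string only includes `modules` and `version` for illustration; it is not what mw.loader actually sends. The cache storage and fetch function are passed in so the sketch is not tied to a browser global.

```javascript
// Hypothetical helper: build a single-module load.php URL to use as cache key.
// Only `modules` and `version` are included here for illustration.
function buildModuleUrl(loadScript, moduleName, version) {
    const params = new URLSearchParams({ modules: moduleName, version: version });
    return loadScript + '?' + params.toString();
}

// Return the module source from the cache, falling back to the network.
// `cacheStorage` is expected to behave like window.caches and `fetchFn` like
// window.fetch; both are injected (an assumption of this sketch).
async function getModuleSource(cacheStorage, fetchFn, url, cacheName = 'rl-modules') {
    const cache = await cacheStorage.open(cacheName);
    const cached = await cache.match(url);
    if (cached) {
        return cached.text();
    }
    const response = await fetchFn(url);
    if (response.ok) {
        // A Response body can only be read once, so store a clone.
        await cache.put(url, response.clone());
    }
    return response.text();
}
```

In a browser this might be called as `getModuleSource(window.caches, window.fetch.bind(window), buildModuleUrl('/w/load.php', 'mediawiki.util', version))`, with the result handed to whatever mw.loader currently does with a localStorage hit.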

I thought I recalled Jake Archibald stating that CacheStorage may not remain exposed to the window. It's explicitly defined as exposed in the current working draft, though.

From the horse's mouth:

gilles: I vaguely recall that you mentioned that CacheStorage might not stay exposed to the window. Am I misremembering? Is it a good long-term bet to rely on its availability?
JakeA: that debate is still ongoing, but even if it becomes read only, it's a small patch to postmessage to the SW and tell it to do the caching

We should assume that if we start relying on it, we might have to start paying the tax of the ServiceWorker's startup overhead at some point.

Use cases for ServiceWorker within ResourceLoader:

  • Having atomic storage.
    • Already with localStorage.
  • SW Cache API has more capacity (citation needed).
  • SW Cache API integrates better with Dev Tools (simulates network request, no use of eval).
  • Intercept request.
    • Already with mw.loader.
    • Though SW would allow us to intercept the first non-js stylesheet request as well. That's not a priority and we already cache that in the regular browser cache.
  • Fan out batch requests into separate requests for HTTP/2, and for improved browser cache and Varnish cache usage (less fragmentation).
    • This is valid (see T117682), but we intend to do that client-side already. And we need to do it there anyway for the first view, at which point SW would never see a batch request because we simply never make them. Win-win.

Anticipated issues:

  • Unable to warm up cache on first view (caches are accessible on Window but might not stay).
  • Overhead of maintaining service worker code.

Basically the only tangible cost-effective improvement at this point is T66721 (freeing up localStorage) which we can simply do by using the ServiceWorker Cache interface from the main window context.

I really love the idea of ServiceWorker and there are many more use cases (outside ResourceLoader) that do apply to Wikimedia. Such as T116126 and T111588. Especially around having an app shell, and composing page responses from cheap content API requests and skin templates.

Gilles lowered the priority of this task from Medium to Low.Dec 7 2016, 8:35 PM

Change 219960 abandoned by Krinkle:
ServiceWorker caching ResourceLoader requests

Reason:
Closing for now. A future iteration on this would likely have to involve using the Cache API within mediawiki.js (e.g. not actually with a Service Worker). Or, alternatively, if we do go with service worker, we'd have to communicate the startup manifest to the SW thread and have it re-create what mw.loader.work() does, e.g. parse the url, check cache for each module, re-create url for remainder, wait for response, concat and return to browser. While that would have the benefit of showing network requests even when there is a local js-cache hit, it has a significant downside: no JS executes until the cache-miss batch comes back, which doesn't seem useful. It'd be more useful to keep the logic within mediawiki.js so that we can actually start responding with modules right away (given there is no streamed execution of JS responses).

https://gerrit.wikimedia.org/r/219960
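
For illustration, the Service Worker steps listed in the abandon reason (parse the url, check cache for each module, fetch the remainder, concat and return) could be sketched roughly as follows. This is not the code from the abandoned patch. It assumes a plain pipe-separated `modules` parameter rather than load.php's real packed format, and `lookup` and `fetchBatch` are hypothetical stand-ins for Cache API reads and the network request for the remainder.

```javascript
// Assumption: the batch URL carries a plain pipe-separated `modules` list.
// A real worker would also have to decode load.php's packed module format.
function parseBatchUrl(url) {
    const parsed = new URL(url, 'https://wiki.example');
    const modules = parsed.searchParams.get('modules') || '';
    return modules.split('|').filter(Boolean);
}

// `lookup(name)` resolves to the cached source or undefined.
// `fetchBatch(names)` resolves to an object mapping each name to its source.
async function respondToBatch(url, lookup, fetchBatch) {
    const names = parseBatchUrl(url);
    const sources = new Map();
    const misses = [];
    for (const name of names) {
        const hit = await lookup(name);
        if (hit === undefined) {
            misses.push(name);
        } else {
            sources.set(name, hit);
        }
    }
    if (misses.length) {
        // Nothing is returned until this resolves: cache hits end up
        // waiting for the cache-miss batch.
        const fetched = await fetchBatch(misses);
        for (const name of misses) {
            sources.set(name, fetched[name]);
        }
    }
    // Concatenate in the originally requested order.
    return names.map((name) => sources.get(name)).join('\n');
}
```

The single `await fetchBatch(misses)` before the return is where the downside shows up: even modules that were cache hits are held back until the remainder arrives.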

Krinkle lowered the priority of this task from Low to Lowest.Oct 20 2020, 8:05 PM

Upstream Firefox recommends using the ServiceWorker Cache API instead of localStorage; the Cache API is currently not subject to this bug.

https://bugzilla.mozilla.org/show_bug.cgi?id=1064466#c37

In other news, upstream V8/Chromium also recommended the same thing when I asked about bytecode caching (tweet).

The main reason I was hesitant on this, if I recall correctly, was that:

  1. Reading and writing to the Service Worker Cache API from the main thread (instead of an actual Service Worker) seemed sketchy in terms of standard and specification. But I think it has settled on staying exposed, so this could be viable.
  2. I assume that reading from the Cache API and then feeding to eval() will have the same lack of bytecode caching as today, so we'd need the response to be initiated by the browser instead of directly programmatic, e.g. a script tag with a URL. But for the browser to serve that from the Cache API (instead of the regular HTTP cache and network), we'd need to have an actual Service Worker, which then brings in all the same complexities and state management that we were trying to avoid (e.g. telling the main thread to disable batching, and then somehow debouncing the requests if they're not a cache hit, and re-creating the batch logic in the Service Worker).

Another option is IndexedDB, which is available in all our Grade A browsers now that we have dropped IE11. It has a more complex API than localStorage and is async, but has much larger quotas.
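
To give a feel for the extra API surface, here is a small sketch of an async read, assuming an already-open database. The object store name `modules` and both helper names are hypothetical. The `transaction()`, `objectStore()` and `get()` calls and the `onsuccess`/`onerror` callbacks are the standard IndexedDB interface.

```javascript
// Wrap an IDBRequest-like object (onsuccess/onerror callbacks plus
// result/error properties) in a Promise.
function requestToPromise(request) {
    return new Promise((resolve, reject) => {
        request.onsuccess = () => resolve(request.result);
        request.onerror = () => reject(request.error);
    });
}

// Read one module's source from an already-open database.
// The object store name 'modules' is an assumption of this sketch.
function readModule(db, name) {
    const store = db.transaction('modules', 'readonly').objectStore('modules');
    return requestToPromise(store.get(name));
}
```

Compared with a synchronous `localStorage.getItem()`, every read becomes a Promise, so a caller in mw.loader would have to tolerate the store answering after other work has already started.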