
RFC: Page composition using service workers and server-side JS fall-back
Open, Normal, Public

Description

Preamble

We have been looking for ways to expand our cacheable content beyond anonymous requests for a long time. Once a user is logged in, a number of personalizations, primarily in the chrome around the content (user name, tabs, links), make it hard to reuse a cached copy of an entire page. Initial trials of performing those personalizations with ESI were done as early as 2004, but even in testing with Varnish we have seen performance and stability issues. Server-side composition technologies like ESI or SSI also introduce a second code path, which makes it harder to test and develop front-end code without intimate knowledge of a complex part of our stack.

An alternative is to use JavaScript for the composition. This opens up the possibility of running the same JS code

  • on the client, in service workers (essentially caching HTTP proxies running in the browser), or
  • on the server, behind edge caches, in a JS runtime like Node.js with an implementation of the ServiceWorkers API, processing both cache misses and authenticated views.

By using JavaScript, we get to use familiar and mature HTML templating systems that have support for pre-compilation. This simplifies the development and testing process. While Varnish performance drops significantly with each ESI include (we measured a 50% drop with five includes), pre-compiled JS templates can potentially perform fairly fine-grained customizations with moderate overhead.
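To illustrate the pre-compilation point, here is a minimal sketch of the idea (a toy, not the actual Mustache API): the template is parsed once into literal and variable parts, and each render is then plain string concatenation.

```javascript
// Sketch only: a toy precompiler for {{variable}} substitution.
// Real code would use a full Mustache implementation; the point is that
// parsing happens once, while per-request rendering is cheap concatenation.
function precompile(template) {
  const parts = template.split(/(\{\{[^}]+\}\})/).map((part) => {
    const m = part.match(/^\{\{(.+)\}\}$/);
    return m ? { name: m[1].trim() } : { lit: part };
  });
  // The returned render function closes over the parsed parts.
  return (data) =>
    parts.map((p) => ('lit' in p ? p.lit : String(data[p.name] ?? ''))).join('');
}

// Compile once (e.g. at worker install time), render per request:
const renderChrome = precompile('<span class="user">{{userName}}</span>');
```

The per-request work is then a single pass over precompiled parts, which is what makes fine-grained per-user customization cheap compared to an ESI include.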

In browsers that support it (like current Chrome, about 40% of the market), we can preload templates and styles for specific endpoints and improve performance by fetching only the raw content. By working as a proxy and producing an HTML string, we also avoid changes to the regular page JavaScript. In contrast to single-page applications, we don't incur routing complexity and heavy first-load penalties.

An interesting possibility is to prototype this in a Service Worker targeting regular page views (/wiki/{title}) only, while letting all other requests fall through to the regular request flow.
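That fall-through shape can be sketched as follows, assuming a hypothetical composePage() helper that fetches the raw content and wraps it in the cached chrome (the guard keeps the snippet runnable outside a worker context):

```javascript
// Sketch only. composePage() is a hypothetical helper, not existing code.
const WIKI_PAGE = /^\/wiki\/[^/]+$/;

function shouldCompose(url) {
  return WIKI_PAGE.test(new URL(url).pathname);
}

if (typeof self !== 'undefined') {
  self.addEventListener('fetch', (event) => {
    if (shouldCompose(event.request.url)) {
      // Compose the response from cached chrome plus raw content.
      event.respondWith(composePage(event.request));
    }
    // No respondWith() call: the request falls through to the regular flow.
  });
}
```

Only /wiki/{title} views are intercepted; API calls, static assets, and special pages proceed untouched through the regular request flow.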

Proposal

This task is about implementing a minimal service that essentially composes an HTTP response based on multiple other resources that are themselves more cacheable and less variable. Initial design requirements:

  • High-throughput. Suitable for handling traffic at the edge, essentially doing only HTTP and string manipulation.
  • Request routing.
  • HTML Templating. Fetch, precompile and cache templates - presumably Mustache.
  • Streamable. Must implement template handling so flushing starts early and continues progressively.
  • Export as ServiceWorker. Add an endpoint that exports a JavaScript program compatible with a ServiceWorker that contains all the utilities (elematch, mustache etc.), Router, RequestHandler, and the current install's router configuration and custom RequestHandlers.
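As a sketch of the request-routing requirement (hypothetical API, not an existing library): a router could compile {named} path segments into regular expressions once at configuration time, then dispatch per request with only string matching.

```javascript
// Sketch only: routes map a path pattern with {named} segments to a
// handler; dispatch finds the first match and extracts the parameters.
class Router {
  constructor() {
    this.routes = [];
  }
  add(pattern, handler) {
    // '/wiki/{title}' -> /^\/wiki\/([^/]+)$/ with captured segment names.
    const names = [];
    const re = pattern.replace(/\{(\w+)\}/g, (_, name) => {
      names.push(name);
      return '([^/]+)';
    });
    this.routes.push({ re: new RegExp(`^${re}$`), names, handler });
  }
  dispatch(path) {
    for (const route of this.routes) {
      const m = route.re.exec(path);
      if (m) {
        const params = {};
        route.names.forEach((n, i) => {
          params[n] = decodeURIComponent(m[i + 1]);
        });
        return route.handler(params);
      }
    }
    return null; // no match: fall through to the regular request flow
  }
}
```

The same router configuration could then be serialized into the exported sw.js so that client and server dispatch requests identically.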

Related:

The Node.js service itself should probably use wikimedia/service-runner and not be specific to MediaWiki in any way. The service would only know which domain names it serves, and from where it can fetch the sw.js executable.

Dependencies

Making Service Workers work for wiki page views is impossible without first resolving a substantial amount of technical debt.

I suggest the initial implementation be used for a less complicated use case: for example, the Wikipedia.org portal, which could use the Page Composition Service to improve its localisation workflow. That workflow currently performs poorly because it relies on client-side XHR, causing FOUCs.

Related work by @GWicke and @Krinkle:

  • node-serviceworker-server: Node library that runs an HTTP service. It can be configured to map a domain and request scope to a service worker URL. The service uses the node-serviceworker library to turn the service worker script into something that we can instantiate and send requests to on the server side.
  • node-serviceworker: Node library that provides a Node sandbox with several browser APIs available in its scope (such as Fetch, ServiceWorker Cache, and more).
  • sw-wikimedia-helpers: Collection of utilities we expect most of our ServiceWorker clients to need. Such as request routing, view abstraction, and a command-line script to generate a compact sw.js file. This utility library will likely make use of:
    • browserify
    • mixmaster: Produce a readable stream from an array of string literals, functions, promises, and other streams. With the option to pass through one or more transforms. This allows progressively streaming to the client with the ability to dynamically substitute portions, and to precompile any templates.
    • elematch: Efficient matching of elements in a stream of HTML. To be used with Mixmaster.
    • musti: Streamable Mustache renderer. Uses Mixmaster.
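The mixmaster idea above can be sketched as an async generator (hypothetical code, not mixmaster's actual API): each part is a string, a promise, or a function, and the output preserves document order while already-available parts can flush before later ones resolve.

```javascript
// Sketch only: flatten an ordered array of strings, promises, and
// functions into one output stream of string chunks.
async function* mixStream(parts) {
  for (const part of parts) {
    const value = typeof part === 'function' ? part() : part;
    // Awaiting in order preserves layout; plain strings are yielded
    // (and can be flushed to the client) immediately.
    yield String(await value);
  }
}

// Helper for collecting the whole stream, e.g. for server-side rendering.
async function renderToString(parts) {
  let out = '';
  for await (const chunk of mixStream(parts)) out += chunk;
  return out;
}
```

In a service, the chunks would be written to the response as they are produced instead of being concatenated, which is what enables early flushing of the page head while the content body is still being fetched.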

See also

Related Objects

Event Timeline

He7d3r added a subscriber: He7d3r. Dec 2 2015, 12:30 AM

Wikimedia Developer Summit 2016 ended two weeks ago. This task is still open. If the session in this task took place, please make sure 1) that the session Etherpad notes are linked from this task, 2) that followup tasks for any actions identified have been created and linked from this task, 3) to change the status of this task to "resolved". If this session did not take place, change the task status to "declined". If this task itself has become a well-defined action which is not finished yet, drag and drop this task into the "Work continues after Summit" column on the project workboard. Thank you for your help!

GWicke updated the task description. Jan 30 2016, 7:00 AM
GWicke triaged this task as Normal priority. Feb 1 2016, 7:20 PM
GWicke updated the task description. Mar 21 2016, 7:29 PM
Qgil removed a subscriber: Qgil. Mar 30 2016, 9:35 AM
RobLa-WMF mentioned this in Unknown Object (Event). May 4 2016, 7:33 PM
GWicke updated the task description. Jun 7 2016, 9:47 PM
Krinkle updated the task description. Aug 8 2016, 9:28 PM
GWicke added a comment. Aug 17 2016, 8:18 PM (edited)

There is now a basic node-serviceworker-proxy service running simple serviceworkers like this one via the node-serviceworker package. Registration is dynamic, and can be driven by an API response describing the ServiceWorker scope / URL mappings per domain. The ServiceWorker registrations are periodically refreshed with a background task.
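The per-domain mapping might look roughly like this (illustrative values only, not the real configuration format or API):

```javascript
// Illustrative only: map domains to ServiceWorker scope/URL pairs, as an
// API response might describe them, and pick the longest matching scope.
const registrations = {
  'en.wikipedia.org': [
    { scope: '/wiki/', swUrl: '/sw.js' }, // hypothetical example entry
  ],
};

function findServiceWorker(domain, path) {
  const entries = registrations[domain] || [];
  const matches = entries.filter((e) => path.startsWith(e.scope));
  // Longest scope wins, mirroring browser-side registration semantics.
  matches.sort((a, b) => b.scope.length - a.scope.length);
  return matches[0] || null;
}
```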

Initial throughput numbers are quite encouraging. On a laptop, I am getting 2.6k req/s for a small wiki page. Larger pages are still around 2k req/s. This is with the content page & template cached, to gauge the page composition itself.

The next step will be to hook up more advanced page composition structures @Krinkle is working on, and to expand API coverage as needed.

GWicke added a comment. Sep 6 2016, 9:44 PM (edited)

A prototype proxy running this serviceworker is now set up in labs. The proxy also automatically registers the same serviceworker code on clients retrieving server-composed HTML content.

Limitations:

  • All resources are proxied through this service, to avoid CSP issues. In production, Varnish would handle non-HTML resources, and only forward HTML requests from clients without a ServiceWorker to the ServiceWorker proxy.
  • In the demo, only enwiki is supported via a static host header override. In a real deploy, host headers would work as-is.
  • While the proxy fully supports response streaming, the demo ServiceWorker used is not yet streaming-enabled. This means that performance on slow connections is not yet as good as it could be. We have created https://github.com/gwicke/mixmaster and https://github.com/wikimedia/elematch in preparation for streaming composition, and plan to integrate them in a ServiceWorker soon.
GWicke added a comment. Sep 14 2016, 3:18 PM (edited)

Updates:

A fully streaming demo ServiceWorker is now implemented at https://github.com/gwicke/streaming-serviceworker-playground/blob/master/lib/sw.js. This uses https://github.com/wikimedia/web-stream-util (formerly mixmaster) and https://github.com/wikimedia/web-html-stream (formerly elematch), which have both seen major refactors to support streaming. ResourceLoader requests are cached aggressively & can be refreshed in the background. On slow connections, this significantly improves first render time by unblocking the render as soon as the first chunk of body HTML comes in.
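The substitution step can be sketched as follows (a simplification, not the actual web-stream-util/web-html-stream APIs; it assumes the placeholder never spans a chunk boundary):

```javascript
// Sketch only: forward HTML chunks as they arrive, substituting a cached
// fragment where a placeholder marker appears. Non-matching chunks are
// yielded immediately, so the first render is not blocked on later fetches.
async function* substitute(chunks, marker, getReplacement) {
  for await (const chunk of chunks) {
    if (chunk.includes(marker)) {
      yield chunk.replace(marker, await getReplacement());
    } else {
      yield chunk;
    }
  }
}
```

The real libraries match actual HTML elements across chunk boundaries rather than a fixed marker, but the principle is the same: pass bytes through untouched except at substitution points.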

With a Chrome canary build (55+), you can try this at https://swproxy.wmflabs.org/wiki/Foobar. The code works on Chrome 53 (current stable) and 54 as well, but the header-based registration mechanism we currently use is only supported in 55. This version is scheduled to graduate to stable in December, so it might not be worth offering the (messier) JS registration.

GWicke moved this task from Backlog to designing on the Services board. Jul 11 2017, 8:28 PM
GWicke edited projects, added Services (designing); removed Services.
TheDJ added a subscriber: TheDJ. Aug 3 2017, 11:03 AM

FYI: there is movement on the Safari/Webkit side: https://bugs.webkit.org/show_bug.cgi?id=175115

Joe added a subscriber: Joe. Sep 4 2017, 10:36 AM

There is now a commercial product offering ServiceWorker execution at the CDN level: https://blog.cloudflare.com/introducing-cloudflare-workers/

Functionally, this is very similar to https://github.com/wikimedia/node-serviceworker and https://github.com/wikimedia/node-serviceworker-proxy.

GWicke removed GWicke as the assignee of this task. Oct 11 2017, 10:32 PM
daniel added a comment. Jan 9 2018, 5:56 PM

This RFC has probably been made obsolete by https://www.mediawiki.org/wiki/Reading/Web/Projects/NewMobileWebsite. ServiceWorkers are an interesting technology, but there does not seem to be any concrete plan to pursue the approach proposed in this RFC.

Krinkle updated the task description. Jan 31 2018, 12:19 AM
Krinkle updated the task description. Jan 31 2018, 12:24 AM

@daniel since this was made obsolete, should it be declined as an RFC then?

@kchapman I think that may have been in response to an outdated note in the Google Doc. I revised the task significantly this past January, and positioned it to be separate from T111588.

This RFC as it stands is imho now actionable and not obsolete. The MVP could be a lightweight service running in edge DCs that is capable of composing responses for any of our application domains, including, perhaps as a starting point, the www portals. That would allow us to keep this stack modular and free of any knowledge of MediaWiki. (Our www portals, such as www.wikipedia.org, do not actually run on MediaWiki; they are served plainly as static files from Apache.)

The MediaWiki-specific overhaul related to this has been moved to T111588 and T140664. The outcome of that would be a more modular MediaWiki PHP codebase in which skins are capable of rendering pages quickly for logged-in users.

After that, it's up to the outcome of this task (T106099) to consider whether it's worth re-implementing that then-lightweight Skin system from PHP in something else (e.g. Node.js), and, as part of that, whether or not to use the ServiceWorker model server-side.

Nirmos added a subscriber: Nirmos. May 18 2018, 1:41 AM
Pchelolo added a subscriber: Pchelolo.

Tons of data here, but I don't see this being worked on in the near future. Moving to the icebox for CPT.

Pchelolo moved this task from Inbox to Icebox on the Core Platform Team board. Jul 11 2019, 1:16 AM