[GOAL] Make Wikipedia more accessible to 2G connections
Closed, ResolvedPublic

Description

See https://m.mediawiki.org/wiki/Reading/Web/Projects/Barack_Obama_in_under_15_seconds_on_2G for further supporting information.

See https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q3_Goals#Reading for Reading goals for Q3 FY 2015-2016 (January - March 2016).

Acceptance criteria

  • Reduce data usage in mobile web beta by 20%
  • Reduce first paint / time to interact to 5 seconds or less at the median article size and to 10 seconds or less at the 90th percentile article size on simulated / controlled 2G connections (lag excluded) on mobile web beta.
  • Continues to support Wikipedia Zero
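
A rough sketch of how these criteria could be checked against a batch of test runs follows. The `Measurement` shape, the nearest-rank percentile rule and the function names are illustrative assumptions, not part of the goal definition.

```python
import math
from typing import NamedTuple


class Measurement(NamedTuple):
    """One simulated-2G test run for a single article (hypothetical shape)."""
    title: str
    html_bytes: int
    first_paint_s: float


def at_size_percentile(runs: list[Measurement], pct: float) -> Measurement:
    """Return the run whose article size sits at the given percentile (nearest rank)."""
    ordered = sorted(runs, key=lambda r: r.html_bytes)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_first_paint_goal(runs: list[Measurement]) -> bool:
    """5 s or less at the median article size, 10 s or less at the 90th percentile."""
    median_run = at_size_percentile(runs, 50)
    p90_run = at_size_percentile(runs, 90)
    return median_run.first_paint_s <= 5.0 and p90_run.first_paint_s <= 10.0


def data_reduction(before_bytes: int, after_bytes: int) -> float:
    """Fractional reduction in bytes transferred (0.2 means 20% fewer bytes)."""
    return (before_bytes - after_bytes) / before_bytes
```

The percentiles are taken over article size rather than over load times, which is how the criterion is worded; a different reading would need a different helper.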

Description

During Q2 FY2015-2016 (October - December 2015), the WMF Reading Web team conducted R&D on speed improvements with a Node.js driven composition layer and ServiceWorker, along the lines of T111588: RFC: API-driven web front-end and T106099: RFC: Page composition using service workers and server-side JS fall-back. The aim was to get a handle on problems exemplified by the Barack Obama article, which takes 50 seconds to render on a 2G connection.

By shrinking the HTML to show just the first section of the article and delaying the rendering of images, we can get that down to 11s.
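
The transform described above can be sketched roughly as follows. This assumes the lead section ends at the first `<h2>` heading and uses a made-up `lazy-image-placeholder` class; the regex approach is for brevity only, and a real implementation would more likely use a proper HTML parser.

```python
import re

IMG_TAG = re.compile(r'<img\b[^>]*?\bsrc="([^"]*)"[^>]*>', re.IGNORECASE)


def lead_section(html: str) -> str:
    """Keep only the content before the first <h2> heading (assumed section boundary)."""
    match = re.search(r"<h2\b", html, re.IGNORECASE)
    return html if match is None else html[: match.start()]


def defer_images(html: str) -> str:
    """Swap each <img> for a lightweight placeholder carrying the original URL."""
    return IMG_TAG.sub(
        r'<span class="lazy-image-placeholder" data-src="\1"></span>', html
    )


def slim_page(html: str) -> str:
    """Lead section only, with images deferred."""
    return defer_images(lead_section(html))
```

Client-side JavaScript (or a tap-through link for `<noscript>` browsers) would then be responsible for loading the real image from `data-src`.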

It thus follows that with some radical changes to the HTML we serve our users we should be able to get an article loading on a 2G connection much faster. This would be amplified through use of ServiceWorker, but we are not planning to implement ServiceWorker in Q3 just yet (although maybe later), based on feedback at the Wikimedia Developer Summit in early January 2016 (see T114542: Next Generation Content Loading and Routing, in Practice for history).

This task is a Q3 goal to apply the findings from Q2 and the discussion at the summit to the mobile web beta channel in Q3 (January - March 2016). Q4 is the target for progressively rolling this out to the mobile web stable channel.

Google already optimises our pages, and they shouldn't have to do that. Their data shows that the speed enhancement resulted in a 50% increase in traffic (as well as an 80% reduction in bytes). We would expect similar numbers from our changes.

Objective

Make Wikipedia more accessible to 2G connections. We plan to implement a web experience in mobile web beta that leverages service oriented architecture (SOA) and accommodates slow (e.g., 2G or congested networks) and unreliable (e.g., mobile cellular) connections without introducing unnecessary drag on Wikimedia origin server and edge cache infrastructure.

Rationale

The website is unacceptably slow on slow connections. The Global South, where slow connections are more prevalent, is the core strategic focus of the Reading Web team. The necessary new architecture happens to be a great opportunity to employ server resources more judiciously and position us for faster innovation in the future, too.

Key result

The new experience will

  • Reduce data usage in mobile web beta by 20%
    • Drive down bandwidth consumption of in-article raster images. For <noscript> browsers, requiring a tap through to view images may be preferable to displaying images in-article; for JavaScript-supporting but jQuery-ResourceLoader-incompatible browsers, <script> tags may be advisable.
    • Drive down content size and network round trips for the common cases.
  • Reduce first paint / time to interact to 5 seconds or less at the median article size and to 10 seconds or less at the 90th percentile article size on simulated / controlled 2G connections (lag excluded) on mobile web beta.
  • Attempt to drive down full article load (non-image media) with pages of median size on 2G connections to under 15 seconds plus network lag time.
  • If time permits, on supported browsers, when a connection is unavailable upon back/forward navigation to a previously retrieved page, not result in an error message.
  • If possible, stop fragmenting the cache on a per-user, per-article basis.

Dependencies

  • Potentially the Security team for security review of associated components
  • Architecture team for consultation
  • Performance team for patch consultation and review
  • Services and Parsing teams for service layer concerns on an as needed basis
  • Potentially Analytics team for review of updated pageview and other impression-level data as required

Additionally:

  • Technical Operations patch review support will likely be needed for VCL or Apache configuration file changeset review.
  • Release Engineering may need to be consulted on an ad hoc basis about the unit/TDD test approach and deployment process.
  • Partnerships may need to facilitate discussions with search and OS providers indexing content, and should continue to spot check Wikipedia Zero compliance occasionally.

Assumptions/risks

  • Wikipedia Zero support will remain intact. In practice, this likely means a continued cache split across W0 on/off status in order to support <noscript> UAs (n.b., inlined <script> is also used in W0 to support ResourceLoader-impaired devices).
  • High performance APIs
    • References Ajax endpoint
    • References HTML page endpoint for RL-incompatible UAs.
    • Reference subscript lookup
    • Link preview (generic service already exists, but will need a few tweaks) if deemed appropriate.
  • Potentially the ability to split origin side parser cache objects; although RESTbase objects with edge caching may achieve similar means if origin side parser cache object splits in PHP are not realistic.
  • Some api.php endpoints will likely out of short term convenience need to be consumed directly by Ajax supporting browsers even though in the future optimized RESTbase services may make more sense. Others should likely be fronted by RESTbase to get the performance and simplicity benefits.
  • SEO must be accounted for. Formally speaking, mobile web beta pages should be (and if deemed necessary, explicitly must be) considered non-indexable, but the architecture must be built in a manner that will support proper search engine indexing. This likely entails a review of existing known popular search engine documented behaviors in available regions plus consultation with a key set of search engine and OS providers in the Global North and a small set of search engine providers in the Global South (particularly in Eurasia). More specifically, semantics or implicit behaviors must be in force whereby incremental loading doesn't result in content becoming effectively invisible in search engine graphs.
  • For Q3, the locus of view controller logic will continue to be MobileFrontend via conventional MediaWiki PHP and ResourceLoader (RL), most likely with some enhancements to RL/ESI/both involving inlined presentational items like CSS and JS.
  • <noscript> and jQuery-incompatible browsers do not need every JavaScript feature emulated at the server, but they do need sensible fallbacks, particularly those that support the core reading experience. As a general rule simpler is better both for the user and the software engineering team for such devices, but there's usually a bit of hard work in this space.

Future considerations

  • Progressive web apps and other packaging may make sense in a future state depending on user desire or perceived user benefit for home screen placement and servicing the needs of different users (e.g., desktop packaged apps).
  • Throughout Q3 and Q4 learnings should be gathered for consideration for the desktop form factor, where there is a much more complex ecosystem of gadgets.
  • This is really out of scope for Reading Web, but cannot be ignored and should instead be considered and planned: VisualEditor is already a sophisticated JavaScript system reliant upon Parsoid/RESTbase. At last check, the plan was to replace the existing MobileFrontend-integrated VisualEditor with something better tied directly into the VisualEditor codebase. To date, the fuller VisualEditor software architecture is said to presuppose a fuller Parsoid markup than what might be desirable in order to be bootstrapped quickly when a user chooses to initiate editing on a longer document, especially on mobile devices. If the current reading DOM already had all the markup, part of this bootstrapping cost could likely be avoided. But downloading the full Parsoid output on mobile devices is questionable; alternative approaches may include: (1) when there is a strong signal that the user may be predisposed to edit (e.g., logged in with edits), for reading actually get the full Parsoid markup when the connection profile and device/UA characteristics strongly signal the reading experience will still be performant (but do apply transforms to simplify the layout for reading anyway), (2) inform the user when transitioning into editing mode that it will take a while to load (and offer a graceful cancel button in case loading is actually taking too long), or (3) adapt VE to lighter weight markup, probably on some sort of section-by-section basis (there are a number of technical challenges and UX tradeoffs that follow). It would be interesting to consider the notion of a data savings toggle setting, which may play nicely into #1 and #2; active editors who activate it are good to go. Additionally, there are alternative/complementary contributory models, many of which can now even be done on newer devices with ease, perhaps even as JavaScript web applications, although packaging to free up screen real estate may be necessary for the greatest impact:
    • Smaller targeted tasks may be better for many form factors, particularly really small ones.
    • Touch interface-oriented tasks may also be better suited to various mobile form factors (e.g., object relational modeling or gesture based moderation, subselection, and filtering tasks).
    • And there's always <textarea> to cut out the overhead and for certain types of users who don't mind pinching and zooming as long as their typing and selection cursors will work predictably.

Future future considerations

The Q3 work is not as dramatic as the original R&D and summit proposals. But we should start considering potential further enhancements.

For example, for non-JavaScript users we could simply serve a "Read full article" link at the bottom of the page that, when clicked, would load the page via a query string parameter such as fullarticle=1. To do this we would most likely need to partner with the Operations and Parsing teams, since this is likely to fragment our cache and require us to cache two versions of the article on mobile along with the PHP desktop cache. In this case we would likely need to:

  1. Switch to Parsoid as the parser for the mobile skin from the PHP Parser.
  2. Aggressively reduce HTML sent down the wire and use the Parsoid API to defer load it.
  3. Further grow performance testing infrastructure to be able to demonstrate the difference
  4. Optimise further page visits by making use of client side ServiceWorkers to minimise the data we need to send down the wire.
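
As a minimal sketch of the "Read full article" link mentioned above: the `fullarticle=1` parameter comes from the proposal, while the helper names and the generated markup are assumptions made for illustration.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def full_article_url(page_url: str) -> str:
    """Return page_url with fullarticle=1 set, preserving other query parameters."""
    parts = urlsplit(page_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "fullarticle"
    ]
    query.append(("fullarticle", "1"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def read_full_article_link(page_url: str) -> str:
    """Plain HTML link suitable for browsers without JavaScript."""
    href = escape(full_article_url(page_url), quote=True)
    return f'<a href="{href}">Read full article</a>'
```

Because the parameter changes the URL, each article would have two cacheable variants, which is the cache fragmentation concern raised above.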

In this future state:

  • Minimally, there would need to be a content service layer (RESTbase/Node.js) with key endpoints that exhibit the properties of performant edge caching while observing origin side Parser initiated purges. For example:
    • The first part of an article. The object should be retrievable in both a full Parsoid output representation and a more bandwidth conserving representation.
    • The other part of an article. The object should be retrievable in both a full Parsoid output representation and a more bandwidth conserving representation.
  • There may need to be a setting available to a user labeled "Data savings" that is default on, but when turned off would use fuller HTML with less late bound content stitching, for example, it would:
    • Return to the legacy image downloading techniques (e.g., not downsampled, srcset bearing, downloaded immediately instead of in a lazy fashion for <noscript> browsers and immediately after first article portion for JavaScript capable browsers)
    • Obtain full Parsoid HTML (probably with styles that are sufficient for larger mobile form factors and higher, but that don't remove as much of the "unnecessary stuff")
  • In the future the locus of control for view controller logic would ideally be handled more or less exclusively by a framework like React or via a homebrewed ServiceWorker composition. Even in this case there would likely be at least some level of hybrid usage in the short term. As but one example, the PHP-based MediaWiki i18n subsystem is probably best left as it is: a PHP-based MediaWiki i18n subsystem that can be consumed on demand (either by middleware or a browser). And yet other composition tasks may be better facilitated via server-side ServiceWorker (as referenced earlier, see discussion at T106099).
  • The ability to fully deploy this more dramatic sort of change to the stable channel in a subsequent quarter may hinge in part on data center replication (i.e., partial rollout to the stable channel may be required instead until there is proper data center replication).

Related Objects

Status     Assigned     Task
Duplicate  Jhernandez
Resolved   dr0ptp4kt
Open       None
Duplicate  dr0ptp4kt
Resolved   Jdlrobson
Declined   None
Resolved   Jhernandez
Declined   Jdlrobson
Resolved   Jdlrobson
Resolved   Jdlrobson
Resolved   Jdlrobson
Resolved   Peter
Duplicate  None
Resolved   BBlack
Resolved   Jhernandez
Open       None
Resolved   Jdlrobson
Resolved   Jdlrobson
Declined   None
Resolved   Tbayer
Resolved   ori
Resolved   Jdlrobson

So, the proposal is to reduce the amount of time to fully load the Barack Obama article by changing the definition of "fully"? Sorry if I'm not completely understanding why this is an important thing to do.

I'm trying to document a narrative here - https://www.mediawiki.org/wiki/Reading/Web/Projects/Barack_Obama_in_under_15_seconds_on_2G
I'd appreciate your insights and thoughts on that and

This is fair, and on reflection I think the goal possibly needs to be reworded. Really, time to fully load is a proxy here for "time to get JavaScript loaded". This would allow us to explore offline capabilities, cache the skin chrome to speed up future visits and also provide a better experience to users by allowing them easier access to secondary page functions such as other languages/search - these work without JS but can be used much more efficiently with JavaScript.

Our data (see link) suggests that most readers are satisfied with the lead section, so deferring the loading of the rest of the page seems like the thing to do.

I mean, if the aim is to display the first section in under 15s, then we've already achieved that: if you look at the filmstrip view on http://www.webpagetest.org/result/151130_E9_8D8/ you find that it is done in 13s. The proposal is to degrade the user experience compared to this baseline by stopping the download at this point instead of continuing to load the rest of the article. That might help to reduce the data volume, but that is not a stated goal. The goals are stated in terms of user-perceived latency, but like I say, the proposal apparently only achieves an improvement in latency by changing the way it is measured.

15s still feels too long for a first paint and I think we need to get that down as far as possible. If the majority of readers are simply reading the lead section (as our data shows) then we should optimise for that, no? I agree though that images are low-hanging fruit and something that can be explored now in parallel.

GWicke added a comment.EditedDec 7 2015, 10:59 PM

I agree with @tstarling that it's not clear that first-section loading is such a big win. As mentioned before, first paint time on the desktop site is actually fairly decent when disabling (or deferring) image loading. I would not be surprised if the numbers were very similar compared to first-section only loading.

In either variant, first paint could be further improved by reducing the time we block on CSS, by inlining CSS or doing something fancy like HTTP2 push. On large pages like Obama, bandwidth consumption for HTML could also be reduced by omitting optional / default-hidden elements like navboxes and references on first load on slow connections.

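Streaming HTML on 2G test, steps to reproduce:

  • In the Chrome network console, select "Regular 2G (250k)" as the network profile.
  • Disable image loading in your preferences.
  • Record a timeline of visiting https://en.wikipedia.org/wiki/Barack_Obama

Results: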

  • First paint at about 5s.
  • Full navigation tabs visible at 11s.
  • Finished loading at about 12s.

I agree with @tstarling that it's not clear that first-section loading is such a big win. As mentioned before, first paint time on the desktop site is actually fairly decent when disabling (or deferring) image loading. I would not be surprised if the numbers were very similar compared to first-section only loading.

I ran a test on Barack Obama, and if first paint is the only goal then yes, the gain is marginal (around 1s).
Full site: http://www.webpagetest.org/result/151207_QH_18XT/
Lead site: http://www.webpagetest.org/result/151207_SG_182S/

That said, if you want to optimise for JS having loaded or for data downloaded, loading just the lead section does have benefits (half the kB).

In either variant, first paint could be further improved by reducing the time we block on CSS, by inlining CSS or doing something fancy like HTTP2 push. On large pages like Obama, bandwidth consumption for HTML could also be reduced by omitting optional / default-hidden elements like navboxes and references on first load on slow connections.

Streaming HTML on 2G test, steps to reproduce:

  • In the Chrome network console, select "Regular 2G (250k)" as the network profile.
  • Disable image loading in your preferences.
  • Record a timeline of visiting https://en.wikipedia.org/wiki/Barack_Obama

Results:
  • First paint at about 5s.
  • Full navigation tabs visible at 11s.

Are they interactive at this point? Can you click things?

  • Finished loading at about 12s.

Seems much lower than I'd expect. Anything getting cached? Also how are you defining "finished loading" here?

tstarling added a comment.EditedDec 8 2015, 11:35 PM

The 840ms RTT has a very large effect on the time to first paint in the "Mobile Edge" speed profile that @Jdlrobson and I are using. In this waterfall it takes almost 4s to receive the first byte of the HTML, and then two new connections for CSS are opened shortly afterwards, and it takes those two connections 4s to set up as well. So it is only at 8s that bandwidth and object size become important. The CSS requests are small, but probably face packet loss due to the HTML connection, which is saturating the link at this point. Any packet loss would delay the CSS requests severely, due to the high RTT.

That's why the repeat view gives a first paint in only 3.3 seconds, despite still downloading the full HTML. The CSS is cached, and so connection establishment for the CSS does not delay rendering.
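
A back-of-the-envelope model of this effect is sketched below. The default of four round trips per new connection (one for TCP, two for a full TLS handshake, one for the HTTP request and response) is an assumption, as are the function names; DNS lookups and packet loss are ignored.

```python
def time_to_first_byte(rtt_s: float, round_trips: int = 4) -> float:
    """Seconds before the first response byte if setup costs `round_trips` RTTs."""
    return rtt_s * round_trips


def transfer_time(size_bytes: int, bandwidth_bps: float) -> float:
    """Seconds to move `size_bytes` over a link of `bandwidth_bps` bits per second."""
    return size_bytes * 8 / bandwidth_bps


def cold_first_paint_estimate(
    rtt_s: float, html_head_bytes: int, css_bytes: int, bandwidth_bps: float
) -> float:
    """HTML connection setup and head download, then a fresh connection for CSS."""
    html_ready = time_to_first_byte(rtt_s) + transfer_time(html_head_bytes, bandwidth_bps)
    css_ready = time_to_first_byte(rtt_s) + transfer_time(css_bytes, bandwidth_bps)
    # Rendering is blocked until the linked CSS has arrived.
    return html_ready + css_ready
```

Under the four-round-trip assumption, an 840ms RTT costs about 3.4s before the first HTML byte and roughly the same again for a new CSS connection, while a 300ms RTT costs about 1.2s each. With the CSS cached, the second term drops out, which is consistent with the faster repeat view.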

I would be interested to know what the time to first paint is if HTTP/2 is used.

tstarling added a comment.EditedDec 9 2015, 12:50 AM

I wonder how representative the 840ms RTT really is. In Chrome Developer Tools, the "Regular 2G" preset has an RTT of only 300ms. Loading the article in question with that throttling preset, on my phone using Chrome's remote debugging feature, I got first paint in 2.9s.

Here's the timeline

Just a note, I've seen the browser (Chrome 48) on my Nexus 5 freeze for seconds while rendering the huge HTML that some of our pages have (on WiFi). More concretely, I was loading the Barack Obama article from https://en.wikipedia.org/api/rest_v1/page/html/Barack%20Obama

Take into account that mobile devices are very resource-constrained computers, and a huge DOM takes a toll on performance, CPU usage and UX.

I would guess that if the network is slow enough the phone would have time to parse, layout, etc. the html as it comes without being bottlenecked on CPU usage. If the network is fast, then the CPU/graphics card can become a problem.

Those are the reasons mainly (besides payload size) for considering sending smaller chunks of HTML or loading/rendering pieces of the page on demand. I'm not sure how I would really measure and show what I've experienced, but I wanted to note that webpagetest is not the only factor to take into account when developing for mobile devices.

GWicke added a comment.EditedDec 20 2015, 8:03 PM

Some more data, most of it using relatively bloated Parsoid HTML including data-mw:

Discussion

The data illustrates the dominant influence of linked CSS resources on first paint performance on a cold load. Linked CSS suffers from a late start (only after the HTML head is loaded), and contention with the parallel HTML load. Inlined CSS avoids this, immediately unblocking the browser to progressively render HTML as it arrives. Longer term, HTTP/2 push can let us achieve the same effect without duplicate CSS downloads on subsequent navigations [1].

A surprising result is that Chrome already defers loading of below-fold images, at least if CSS is available to determine above / below fold status. Time to a rendered & interactive first screen is almost unaffected by image loading if CSS is inlined or generally loaded before above-fold images start loading.

On a Galaxy Note 3 using a wifi connection, Chrome renders the first screen of Obama with images and inline styles after about a second. The full page load takes about six seconds. CPU does not seem to be a bottleneck for first paint on this ~2 year old device. Scrolling is smooth all the way through the rendering phase.

[1]: Early cache-aware HTTP/2 push implementation: http://blog.kazuhooku.com/2015/12/optimizing-performance-of-multi-tiered.html

Peter added a comment.Dec 21 2015, 9:25 AM

I think inlining the CSS and lazy-loading the rest of the CSS for the next view is the way to go, since it's unclear how push will be implemented (and how we would actually make it push what we want). Inlining will help for HTTP/1 too, so it's a win/win.

@GWicke I was wondering, did you inline just above-the-fold CSS and lazy-load the rest or completely inline all styles (urgh) – what exactly was your inlined CSS from the example page, and which technique did you use for coming up with it (like Critical or CriticalCSS on CSS files output after RL)? @Peter

GWicke added a comment.EditedDec 21 2015, 6:06 PM

@Volker_E: I manually replaced the style <link> with the actual CSS returned by ResourceLoader, wrapped in <style>. See the source of https://people.wikimedia.org/~gwicke/Barack_Obama_images_inline_styles.html.
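
The manual replacement described above could be automated along these lines. This is only a sketch: `css_by_href` is a hypothetical mapping from each stylesheet URL to CSS text fetched beforehand, and the regexes assume double-quoted attributes.

```python
import re

LINK_TAG = re.compile(r'<link\b[^>]*\brel="stylesheet"[^>]*>', re.IGNORECASE)
HREF_ATTR = re.compile(r'\bhref="([^"]*)"', re.IGNORECASE)


def inline_stylesheets(html: str, css_by_href: dict[str, str]) -> str:
    """Replace each stylesheet <link> with a <style> block holding its CSS.

    Links whose href is not in `css_by_href` are left untouched.
    """

    def replace(match: re.Match) -> str:
        href = HREF_ATTR.search(match.group(0))
        if href is None or href.group(1) not in css_by_href:
            return match.group(0)
        return f"<style>{css_by_href[href.group(1)]}</style>"

    return LINK_TAG.sub(replace, html)
```

This inlines every matched stylesheet in full; trimming to above-the-fold rules would be a separate step.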

@GWicke In an optimal world, we'd want to get the DOM & the CSSOM for above-the-fold content fitting within 14KB for the browser to paint after one roundtrip and lazy-load the rest.

@Volker_E: Agreed; there is definitely room for improvement. However, I think it's encouraging that even the naive approach gets us most of the way there. With gzip compression, inlined desktop styles & above-the-fold content weigh in at about 23k, so not that far from the 14k target.
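
A quick way to compare a candidate page head against the 14k target is sketched below. The constant and helper names are made up for illustration, and gzip at Python's default level is only an approximation of what the edge caches would actually send.

```python
import gzip

# The 14 KB target discussed above, commonly explained as an initial TCP
# congestion window of about ten segments (treated here as an assumption).
FIRST_ROUND_TRIP_BUDGET = 14 * 1024


def compressed_size(markup: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 markup."""
    return len(gzip.compress(markup.encode("utf-8")))


def fits_first_round_trip(markup: str, budget: int = FIRST_ROUND_TRIP_BUDGET) -> bool:
    """True if inlined styles plus above-the-fold content fit within the budget."""
    return compressed_size(markup) <= budget
```

Running `compressed_size` on the inlined styles plus above-the-fold HTML shows how far a given page is from the target.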

jmadler added a subscriber: jmadler.Jan 6 2016, 5:13 AM
Jhernandez renamed this task from [GOAL] Barack Obama article on English Wikipedia on 2G connection should fully load in under 15s (down from 50s) to [GOAL] Web: Make Wikipedia more accessible to 2G connections.Jan 12 2016, 12:40 AM
Jdlrobson updated the task description. (Show Details)Jan 12 2016, 12:57 AM
Jdlrobson renamed this task from [GOAL] Web: Make Wikipedia more accessible to 2G connections to Make Wikipedia more accessible to 2G connections.Feb 5 2016, 1:25 AM
dr0ptp4kt updated the task description. (Show Details)Feb 13 2016, 2:22 PM
dr0ptp4kt renamed this task from Make Wikipedia more accessible to 2G connections to [GOAL] Make Wikipedia more accessible to 2G connections.
dr0ptp4kt added subscribers: BBlack, mark, bd808 and 9 others.
Danny_B moved this task from Tag to Epic on the Tracking board.May 27 2016, 4:55 PM
wychen added a subscriber: wychen.Mar 2 2017, 7:27 PM
Jdlrobson closed this task as Resolved.Apr 13 2017, 11:07 PM
Jdlrobson claimed this task.

This task has served its purpose. Note a decision on T123328 is still needed. By resolving this I'm not saying the 2G experience is great, but the project has run its course. More specific projects/goals will follow.