
[GOAL] Make Wikipedia more accessible to all connections with new fast API-driven web experience in mobile web beta
Closed, Duplicate (Public)

Description

DRAFT

Acceptance criteria

  • Supports Wikipedia Zero

Description

During Q2 FY2015-2016 (October - December 2015), the WMF Reading Web team conducted R&D on speed improvements with a Node.js-driven composition layer (T113066) and ServiceWorker, along the lines of T111588: RFC: API-driven web front-end and T106099: RFC: Page composition using service workers and server-side JS fall-back.

The R&D results will be discussed at the Wikimedia Developer Summit in early January 2016 in T114542: Next Generation Content Loading and Routing, in Practice.

This task is a Q3 goal (January - March 2016): apply the findings from Q2 and the discussion at the summit to the mobile web beta channel. Q4 is the target for then progressively rolling this out to the mobile web stable channel.

Objective

Make Wikipedia more accessible to 2G connections. We plan to implement a web experience in mobile web beta that leverages service-oriented architecture (SOA) and accommodates slow (e.g., 2G or congested networks) and unreliable (e.g., mobile cellular) connections while reducing drag on Wikimedia origin server and edge cache infrastructure.

Rationale

The website is unacceptably slow on slow connections. The Global South, where slow connections are more prevalent, is the core strategic focus of the Reading Web team. The necessary new architecture happens to be a great opportunity to employ server resources more judiciously and position us for faster innovation in the future, too.

Key result

The new experience will

  • Reduce data usage in mobile web beta by 20%
    • Drive down bandwidth consumption of in-article raster images. For <noscript> browsers, requiring a tap-through to view images may be preferable to displaying images in-article; for browsers that support JavaScript but are incompatible with jQuery-ResourceLoader, <script> tags may be advisable.
    • Drive down content size and network round trips for the common cases.
  • Reduce first paint / time to interact to 5 seconds or less at the median article size and to 10 seconds or less at the 90th percentile article size on simulated / controlled 2G connections (lag excluded).
  • Attempt to drive down full article load (non-image media) with pages of median size on 2G connections to under 15 seconds plus network lag time.
  • On supported browsers, avoid showing an error message when a connection is unavailable upon back/forward navigation to a previously retrieved page.
  • Stop fragmenting the cache on a per-user, per-article basis.
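
As a rough illustration of how the numeric targets above might be checked against measurements, here is a minimal TypeScript sketch. The helper names (dataReductionPercent, byteBudget, meetsKeyResults) and the Measurement shape are hypothetical, and the task does not specify a 2G throughput or a measurement method, so any throughput value passed in is an assumption.

```typescript
// Hypothetical helpers for illustration; none of these names come from the task.

/** Percentage reduction of `measuredBytes` relative to `baselineBytes`. */
function dataReductionPercent(baselineBytes: number, measuredBytes: number): number {
  if (baselineBytes <= 0) {
    throw new RangeError("baselineBytes must be positive");
  }
  return ((baselineBytes - measuredBytes) * 100) / baselineBytes;
}

/** Bytes transferable within `seconds` at `kbitPerSecond`, network lag excluded. */
function byteBudget(seconds: number, kbitPerSecond: number): number {
  return Math.floor((seconds * kbitPerSecond * 1000) / 8);
}

interface Measurement {
  baselineBytes: number;           // data usage before the change
  measuredBytes: number;           // data usage with the new experience
  medianFirstPaintSeconds: number; // at the median article size
  p90FirstPaintSeconds: number;    // at the 90th percentile article size
}

/** Checks the 20% data reduction and the 5 s / 10 s first paint targets. */
function meetsKeyResults(m: Measurement): boolean {
  return (
    dataReductionPercent(m.baselineBytes, m.measuredBytes) >= 20 &&
    m.medianFirstPaintSeconds <= 5 &&
    m.p90FirstPaintSeconds <= 10
  );
}
```

For instance, at an assumed throughput of 40 kbit/s, byteBudget(5, 40) yields 25000 bytes for the 5-second first paint target. The real budget depends on whatever simulated 2G profile the team settles on.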

Dependencies

  • Security team for security review of associated components
  • Architecture team for consultation
  • Performance team for patch consultation and review
  • Services and Parsing teams for service layer concerns
  • Analytics team for review of updated pageview and other impression-level data as required

Additionally:

  • Technical Operations patch review support will likely be needed for VCL or Apache configuration file changeset review.
  • Fundraising Tech will need to be consulted to identify software architecture considerations for JavaScript based Central Notice invocation.
  • Release Engineering may need to be consulted in an ad hoc fashion on unit/TDD test approach and deployment process.
  • Partnerships may need to facilitate discussions with search and OS providers indexing content, and should occasionally spot-check Wikipedia Zero compliance.

Assumptions/Risks

  • Wikipedia Zero support will remain intact. In practice, this likely means a continued cache split across W0 on/off status in order to support <noscript> UAs (n.b., inlined <script> is also used in W0 to support ResourceLoader-impaired devices).
  • Minimally, there will be a content service layer (RESTbase/Node.js) with key endpoints that can be cached performantly at the edge while observing origin-side, Parser-initiated purges. For example:
    • The first part of an article. The object should be retrievable in both a full Parsoid output representation and a more bandwidth conserving representation.
    • The other part of an article. The object should be retrievable in both a full Parsoid output representation and a more bandwidth conserving representation.
    • Reference subscript lookup
    • Link preview (generic service already exists, but will need a few tweaks) if deemed appropriate.
  • There will be a setting available to a user labeled "Turn OFF bandwidth savings (recommended only for bigger screens and fast connections)" that once activated will
    • Return to the legacy image downloading techniques (e.g., not downsampled, srcset bearing, downloaded immediately instead of in a lazy fashion for <noscript> browsers and immediately after first article portion for JavaScript capable browsers)
    • Obtain full Parsoid HTML (probably with styles that are sufficient for larger mobile form factors and higher, but that don't remove as much of the "unnecessary stuff")
  • Some api.php endpoints will likely need to be consumed directly by Ajax-supporting browsers as a matter of short-term convenience, even though optimized RESTbase services may make more sense in the future. Others should likely be fronted by RESTbase to get the performance and simplicity benefits.
  • SEO must be accounted for. Formally speaking, mobile web beta pages should be (and if deemed necessary, explicitly must be) considered non-indexable, but the architecture must be built in a manner that will support proper search engine indexing. This likely entails a review of existing known popular search engine documented behaviors in available regions plus consultation with a key set of search engine and OS providers in the Global North and a small set of search engine providers in the Global South. More specifically, semantics or implicit behaviors must be in force whereby incremental loading doesn't result in content becoming effectively invisible in search engine graphs.
  • Depending upon what we learn at the summit, the locus of view controller logic may be best served more or less exclusively by a framework like React or out of an extension such as MobileFrontend (or, perhaps, a replacement extension, let's call it MultiDeviceFrontend). In practice, there will likely be at least some level of hybrid usage in the short term. As but one example, the PHP-based MediaWiki i18n subsystem is probably best left as it is: a PHP-based MediaWiki i18n subsystem that can be consumed on demand (either by middleware or a browser). And yet other composition tasks may be better facilitated via server-side ServiceWorker (as referenced earlier, see discussion at T106099).
  • <noscript> and jQuery-incompatible browsers do not need everything. In some cases ServiceWorker composition may be useful to support such devices, but most of the time simpler is better both for the user and the software engineering team.
  • VisualEditor (VE) for small form factor devices with sufficient JavaScript support would ideally be constructed in a manner compatible with this architectural style. More is discussed below.
  • The ability to fully deploy this change to the stable channel in a subsequent quarter may hinge in part on data center replication (i.e., partial rollout to the stable channel may be required instead until there is proper data center replication).
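
To make the interplay between the bandwidth savings setting, the two article representations, and <noscript> support more concrete, here is a minimal TypeScript sketch. Every type and function name in it is hypothetical. The image strategies chosen when savings are on (lazy loading for JavaScript-capable browsers, tap-through for <noscript> browsers) are one reading of the bullets above, not a settled design.

```typescript
// All names below are hypothetical and chosen for illustration only.

type Representation = "full-parsoid" | "bandwidth-conserving";
type ImageStrategy = "immediate" | "after-lead" | "lazy" | "tap-through";

interface ClientProfile {
  supportsJavaScript: boolean;  // false for <noscript> UAs
  bandwidthSavingsOff: boolean; // the "Turn OFF bandwidth savings" setting
}

interface LoadPlan {
  representation: Representation;
  images: ImageStrategy;
  parts: string[]; // order in which article parts are requested
}

function planArticleLoad(client: ClientProfile): LoadPlan {
  // The first part of the article is always requested before the rest.
  const parts = ["first-part", "other-part"];

  if (client.bandwidthSavingsOff) {
    // Legacy behaviour: full Parsoid HTML; images load immediately for
    // <noscript> browsers and right after the first article portion for
    // JavaScript-capable browsers.
    return {
      representation: "full-parsoid",
      images: client.supportsJavaScript ? "after-lead" : "immediate",
      parts,
    };
  }

  // Savings on (assumed default): lighter markup and deferred images.
  return {
    representation: "bandwidth-conserving",
    images: client.supportsJavaScript ? "lazy" : "tap-through",
    parts,
  };
}
```

Keeping the decision in one pure function like this would let the same logic run in middleware or in the browser, whichever the summit discussion favors.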

Future considerations

  • Progressive web apps and other packaging may make sense in a future state depending on user desire or perceived user benefit for home screen placement and servicing the needs of different users (e.g., desktop packaged apps).
  • In Q4 the mobile web beta architecture should be migrated to the stable mobile web channel. As noted earlier, the ability to fully deploy this change to the stable channel in Q4 may hinge in part on data center replication (i.e., partial rollout to the stable channel may be required instead until there is proper data center replication). Throughout Q3 and Q4 learnings should be gathered for consideration for the desktop form factor, where there is a much more complex ecosystem of gadgets.
  • This is really out of scope for Reading Web, but cannot be ignored and should instead be considered and planned: VisualEditor is already a sophisticated JavaScript system reliant upon Parsoid/RESTbase. At last check, the plan was to replace the existing MobileFrontend-integrated VisualEditor with something better tied directly into the VisualEditor codebase. To date, the fuller VisualEditor software architecture is said to presuppose a fuller Parsoid markup than what might be desirable in order to be bootstrapped quickly when a user chooses to initiate editing on a longer document, especially on mobile devices. If the current reading DOM already had all the markup, part of this bootstrapping cost could likely be avoided. But downloading the full Parsoid output on mobile devices is questionable; alternative approaches may include: (1) when there is a strong signal that the user may be predisposed to edit (e.g., logged in with edits), fetch the full Parsoid markup for reading when the connection profile and device/UA characteristics strongly signal the reading experience will still be performant (but apply transforms to simplify the layout for reading anyway), (2) inform the user when transitioning into editing mode that it will take a while to load (and offer a graceful cancel button in case loading is actually taking too long), or (3) adapt VE to lighter-weight markup, probably on some sort of section-by-section basis (there are a number of technical challenges and UX tradeoffs that follow). As noted in this task, there's a notion of a "Turn OFF bandwidth savings (recommended only for bigger screens and fast connections)" setting, which may play nicely into (1) and (2); active editors who activate it would be good to go.
Additionally, there are alternative/complementary contributory models, many of which can now be done with ease on newer devices, perhaps even as JavaScript web applications, although packaging to free up screen real estate may be necessary for the greatest impact:
    • Smaller targeted tasks may be better for many form factors, particularly really small ones.
    • Touch interface-oriented tasks may also be better suited to various mobile form factors (e.g., object relational modeling or gesture-based moderation, subselection, and filtering tasks).
    • And there's always <textarea> to cut out the overhead and for certain types of users who don't mind pinching and zooming as long as their typing and selection cursors will work predictably.

Event Timeline

dr0ptp4kt raised the priority of this task from to Needs Triage.
dr0ptp4kt updated the task description.
dr0ptp4kt added a project: Reading-Admin.
dr0ptp4kt subscribed.
dr0ptp4kt triaged this task as Medium priority. (Dec 4 2015, 5:27 AM)
dr0ptp4kt updated the task description.
dr0ptp4kt set Security to None.
dr0ptp4kt updated the task description.
dr0ptp4kt added subscribers: JKatzWMF, Jdlrobson, phuedx and 11 others.
dr0ptp4kt renamed this task from [GOAL] Fast API-driven web experience to mobile web beta to [GOAL] Make Wikipedia more accessible to 2G connections with fast API-driven web experience to mobile web beta. (Dec 5 2015, 12:56 AM)
dr0ptp4kt renamed this task from [GOAL] Make Wikipedia more accessible to 2G connections with fast API-driven web experience to mobile web beta to [GOAL] Make Wikipedia more accessible to 2G connections with fast API-driven web experience in mobile web beta.
dr0ptp4kt updated the task description.
dr0ptp4kt updated the task description.
dr0ptp4kt updated the task description.

@dr0ptp4kt: This is the best candidate for the first of our 2015-16 Q3 goals.

… I think we might have to update the task, or create a new one, so that it's appropriately scoped.

What's the overlap between this and T113066?

Jdlrobson renamed this task from [GOAL] Make Wikipedia more accessible to 2G connections with fast API-driven web experience in mobile web beta to [GOAL] Make Wikipedia more accessible to all 2G connections with new fast API-driven web experience in mobile web beta. (Feb 5 2016, 12:01 AM)
Jdlrobson renamed this task from [GOAL] Make Wikipedia more accessible to all 2G connections with new fast API-driven web experience in mobile web beta to [GOAL] Make Wikipedia more accessible to all connections with new fast API-driven web experience in mobile web beta.

Is this the same as T111588 ?

No, it was the Q3 goal task (you had requested it be created 😄), before we actually met at the summit and before the pre-existing Q2 task T113066: [GOAL] Make Wikipedia more accessible to 2G connections was renamed to what it is called today and re-purposed for Q3.

I had a TODO to copy over still applicable pieces from this task into T113066: [GOAL] Make Wikipedia more accessible to 2G connections, but haven't done so yet.

Okay, I see. I've renamed the other task to not say "Goal" for the time being until you've copied the relevant parts across from here (at which point decline this and add goal into the other task). It was getting confusing! :)

Okay, about to mark this as a duplicate.