
Performance review of Extension:NearbyPages
Closed, ResolvedPublic

Description

The Special:Nearby page has historically been part of Extension:MobileFrontend. This is not ideal for 3rd parties wanting to use Nearby without a mobile site. The feature has therefore been rebuilt using Vue.js and WVUI as Extension:NearbyPages (https://www.mediawiki.org/wiki/Extension:NearbyPages)

In the process, we've also improved support for Wikidata.org which currently has various bugs in the existing MobileFrontend version.

This also allows us to avoid running Extension:Nearby on Wikimedia projects where it doesn't make sense (e.g. no location information).

The extension provides the same functionality: the Special:Nearby page

The code is scoped to a single special page.

Preview environment

(Insert one or more links to where the feature can be tested, e.g. on Beta Cluster.)

This will be available on the beta cluster on the 1st September.
In the meantime, the experience can be previewed at https://wikidata-nearby.netlify.app/ and https://wikipedia-nearby.netlify.app/. Please note the "random" feature will not be enabled in production.

Which code to review

https://github.com/wikimedia/mediawiki-extensions-NearbyPages/tree/master/resources
https://github.com/wikimedia/mediawiki-extensions-NearbyPages/blob/master/extension.json
https://github.com/wikimedia/mediawiki-extensions-NearbyPages/blob/master/includes/SpecialNearby.php

Performance assessment

Please initiate the performance assessment by answering the below:

  • What work has been done to ensure the best possible performance of the feature?

The code is limited to the Special:Nearby page.
The code follows the current guidelines for using Vue.js in ResourceLoader
The code mirrors the existing code in MobileFrontend, using the majority of the same code that has been in production for over 7 years.
The API inside Extension:Geo has limits relating to search radius.
The MobileFrontend version remains intact while we roll out the feature to production wikis in case we need to roll back at any time.

  • What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?

Usage of Vue.js and WVUI is still new.
Given the rest of the code is a carbon copy of MobileFrontend it shouldn't be a worry.

  • Are there potential optimisations that haven't been performed yet?

In future when SSR for Vue.js is available, it might be possible to explore server-side rendering for commonly used pages.

  • Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the Performance Team for advice: performance-team@wikimedia.org.

I don't think there is anything specific here that will be useful to measure but please let me know if that's not the case.

Event Timeline

Jdlrobson triaged this task as Medium priority.Sep 2 2021, 7:26 PM

We triaged this task as a future goal (meaning: next quarter, not an interrupt task for the current quarter) per our timeline.

This will likely be assigned to Peter and/or myself to take place in October.

From a quick read through, it looks like this might be providing useful information to people on our platform in a way that is solely accessible in a Grade A browser, when online, and only if and after the JS payload has successfully arrived and executed client-side. If I understand this correctly, I believe this would be unprecedented for a planned and resourced software aimed at production. Our frontend is architected as a layered approach, with Modern (Grade A) building atop of Basic (Grade C). There is no such thing as Modern-only access to information on our platform. (Without the "planned and resourced" caveat, I'd of course have to mention the one exception to date, which is the Graph extension temporarily falling apart in a way that violates this, explained at T285890#7289104).

If correctly understood, I would in that case recommend per foundation principles and arch principles, that the basic submit/response cycle also be accessible in some way, regardless of visual appeal.

And while we have not yet measured the performance, it is likely that "time to visual completion" would in that case be significantly improved if this fallback utilized the same stylesheet. This would be consistent with all existing special pages that provide access to information, both all the OOUI-powered pages, as well as the only Vue-powered special page of that kind (Special:MediaSearch).

If I understand this correctly, I believe this would be unprecedented for a planned and resourced software aimed at production.

This is a swap in replacement for the existing page [1] which has not provided a grade C experience since its inception in May 2013 [2] so I don't think this is unprecedented. It is inaccessible from grade C mobile devices and useless without geolocation API support as the whole point of the feature is to show you articles near your current location. This is not a new feature, but a rewrite of an existing feature. Let me know if you have any questions (October timeline sounds great).

[1] http://en.m.wikipedia.org/wiki/Special:Nearby
[2] https://diff.wikimedia.org/2013/05/29/wikipedia-nearby-beta/

This will likely be assigned to Peter and/or myself to take place in October.

Is this timeline still realistic or has this been delayed?

Thanks, yes, this is assigned to me for this quarter. We had a late start due to some cross-team initiatives from Q1 taking a few weeks longer than originally planned. I expect this will still get done within the quarter however, and will try to squeeze it in soon.

Hi, @Krinkle any updates around the timeframe for this one?

It is […] useless without geolocation API support as the whole point of the feature is to show you articles near your current location. […] since its inception in May 2013.

The community has made many tools for location-based information over the past twenty years, both before and after Nearby. Those tend to have good ecosystem integration and more capabilities, and while some do offer the Geolocation API as an input option, none require it. An input field can take human-readable descriptions and translate them to coords based on Wikipedia/Wikidata titles or the Nominatim API. This is how OSM, Apple/Google Maps, etc. work. These wouldn't be that great if they produced static results with no input field or draggable map.

Some user stories our audience might expect by now:

  • Back home from somewhere earlier today and have something to read or contribute, what good articles were nearby?
  • Reading Wikipedia about an interesting place, any featured articles nearby?
  • Going to a new place tomorrow, what's near there? For example, WikiMap uses Nominatim to accept user input like streets and other human-readable description (transparently become coords for geosearch).
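To make the input-field idea concrete, here is a minimal sketch of the two requests involved: a Nominatim search to turn a human-readable description into coords, followed by a geosearch query for those coords. The function names and the `limit`/radius values are hypothetical; the sketch only builds URLs and leaves the actual fetching and error handling to the caller. It assumes Nominatim's public `search` endpoint and the GeoData `list=geosearch` module with its `gscoord` and `gsradius` parameters.

```typescript
interface Coords {
  lat: number;
  lon: number;
}

// Build a Nominatim search URL for a free-text place description.
// The response is a JSON array whose items carry "lat" and "lon" as strings.
function nominatimSearchUrl(query: string): string {
  return "https://nominatim.openstreetmap.org/search" +
    "?q=" + encodeURIComponent(query) +
    "&format=jsonv2&limit=1";
}

// Build a geosearch URL for a wiki's api.php (apiBase is supplied by the caller).
// radiusMeters is an illustrative parameter; the API enforces its own limits.
function geosearchUrl(apiBase: string, coords: Coords, radiusMeters: number): string {
  return apiBase +
    "?action=query&list=geosearch" +
    "&gscoord=" + encodeURIComponent(coords.lat + "|" + coords.lon) +
    "&gsradius=" + radiusMeters +
    "&format=json";
}
```

With this split, the Geolocation API becomes just one of several ways to obtain a `Coords` value.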

I understand the choice you made, and it's a reasonable one given a focus on mobile and new users. All I'll say is that it is a choice, not an inevitability. I was merely surprised to see the same choice made a second time after seven years, with no other input methods or fallbacks.

Performance numbers

I evaluated the Beta Cluster copy on mobile and desktop, comparing it to BlankPage as baseline, and to the current production version. For ease of testing I used a hash fragment to set a fixed location.

I like how it supports permalinks that can be shared, i.e. the receiver will not initialise their own location but view the sender's results. That opens the path for future on-wiki integration. Nice!
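For readers unfamiliar with this pattern, a minimal sketch of a permalink round trip follows. The fragment format `#/coord/<lat>,<lon>` is an assumption made for illustration, and the function names are hypothetical; the extension's actual format and code may differ.

```typescript
interface Coords {
  lat: number;
  lon: number;
}

// Parse a fragment such as "#/coord/52.37,4.89" (assumed format).
// Returns null for anything that is not a valid coordinate pair.
function parseCoordHash(hash: string): Coords | null {
  const match = /^#\/coord\/(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)$/.exec(hash);
  if (match === null) {
    return null;
  }
  const lat = Number(match[1]);
  const lon = Number(match[2]);
  // Reject values outside the valid latitude/longitude ranges.
  if (Math.abs(lat) > 90 || Math.abs(lon) > 180) {
    return null;
  }
  return { lat: lat, lon: lon };
}

// Serialise coords back into the same fragment format for sharing.
function toCoordHash(coords: Coords): string {
  return "#/coord/" + coords.lat + "," + coords.lon;
}
```

When the fragment parses successfully, the page can skip the device location prompt and query for the shared coords directly.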

| Page | HTML+CSS+JS (± baseline) | Total page weight |
| --- | --- | --- |
| Production, BlankPage | 169 kB | 239 kB |
| Production, BlankPage, mobile | 309 kB | 333 kB |
| Production, Nearby, mobile | 311 kB (+2 kB) | |
| Beta, BlankPage, mobile | 270 kB | |
| Beta, Nearby, mobile | 368 kB (+98 kB) | |
| Beta, BlankPage, desktop | 253 kB | |
| Beta, Nearby, desktop | 351 kB (+98 kB) | |

Reviewing Page load performance metrics: Visual completion, PLT, and responsiveness.

Given the client-side nature, I'll use responsiveness to tell the story because we measure that through long tasks, and those naturally reflect time to visual completion here. Long tasks observed in Chrome devtools on high-end MacBook Pro, 3G Fast, 4x CPU throttle, empty cache and storage.

Nearby (Beta, mobile): 7 slow tasks, 1.3s of combined unresponsiveness.

  • Task 1 (185ms): Initial exec of page modules: 24ms to define vue.js, of which 11ms JS compile.
  • Task 2 (60ms): Initial exec of wvui (after style insert): 23ms to define wvui.js, of which 15ms compile.
  • Task 3 (633ms): Initial exec of mobile.startup+ext.nearby.scripts (after dependencies): 201ms JS compile. 330ms ext.nearby sync execution for Vue.mount until async yield.
  • Task 4 (71ms): data-bridge.init (deferred)
  • Task 5 (65ms): skins.minerva.scripts (deferred)
  • Task 6 (100ms): Second round of ext.nearby. 80ms in toCard and getDistanceMessage to prepare Vue job.
  • Task 7 (137ms): Third round of ext.nearby: 120ms in Vue rendering (deferred).
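As a check on the summary line, the sketch below adds up the seven task durations listed above. It assumes "combined unresponsiveness" means the plain sum of all tasks longer than 50ms (the threshold the Long Tasks API uses); the function and variable names are hypothetical.

```typescript
// A task counts as "long" when it exceeds 50ms.
const LONG_TASK_THRESHOLD_MS = 50;

interface LongTaskSummary {
  count: number;
  totalMs: number;
}

// Count the long tasks and sum their durations.
function summarizeLongTasks(durationsMs: number[]): LongTaskSummary {
  const longTasks = durationsMs.filter(function (d) {
    return d > LONG_TASK_THRESHOLD_MS;
  });
  const totalMs = longTasks.reduce(function (sum, d) {
    return sum + d;
  }, 0);
  return { count: longTasks.length, totalMs: totalMs };
}

// Durations of tasks 1 to 7 from the Nearby (Beta, mobile) profile above.
const nearbyBetaMobile = [185, 60, 633, 71, 65, 100, 137];
const nearbySummary = summarizeLongTasks(nearbyBetaMobile);
// nearbySummary.count is 7 and nearbySummary.totalMs is 1251, i.e. about 1.3s.
```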

capture.png (1×2 px, 196 KB)

Baseline

The mobile baseline is notably heavier than the desktop baseline at +94kB (about the same size diff as in 2013). I'm not counting that as part of Nearby. I don't normally analyse the BlankPage baseline in detail, but given it's been a few quarters since I did, I decided to inspect its long tasks as well (and will create/ping other tasks about these while at it). Feel free to ignore this.

Baseline (Beta, BlankPage, desktop): 2 slow tasks, 0.25s of combined unresponsive time.

  • Task 1 (180ms): Initial execution of page modules. 90ms is compiling (native). 12ms vector-legacy-js, 10ms WikimediaEvents, 14ms mediawiki.base, 8ms ext.centralNotice.geoIP (parseCookieValue), 7ms mediawiki.util.
  • Task 2 (69ms): Initial exec for data-bridge.init.js

Baseline (Beta, BlankPage, mobile): 4 slow tasks, 0.39s of combined unresponsive time.

  • Task 1 (181ms): Initial exec of page modules. 103ms JS compiling. Most time spent during non-deferred execution: 15ms WikimediaEvents, 14ms mediawiki.base, 8ms ext.centralNotice.geoIP (parseCookieValue), 7ms mediawiki.util.
  • Task 2 (80ms): Initial exec for mobile.startup JS (deferred due to style dependency). 43ms compiler (native). 26ms in mobile.startup/SearchResultsView eagerly rendering an unused spinner.
  • Task 3 (73ms): Initial exec for data-bridge.init.js.
  • Task 4 (61ms): Initial exec of mobile.init+skins.minerva.scripts (deferred). 30ms compiler. 25ms eagerly initialising mobile.startup/currentPage. 5ms eager DOM work for initMediaViewer. 5ms for page-issues.

Nearby performance budget

The new Nearby app seems medium in transfer size at +99kB. It's not small, but not huge either. The old code was about 2kB, which I assume is the size of the essential logic for device location and backend queries. For comparison, the new 99kB app produces 0.7kB of Vue DOM (including 1 result card, a subset can repeat as needed), and one of the Nearby thumbnails is ~5 kB. The new app can be thought of as costing about 100x the optimum cost of Nearby results or 20 thumbnail images.

There is no formal latency budget for this extension that I'm aware of. I'll evaluate it using RAIL guidelines to get a ballpark estimate.

For the cold load, the guideline sets 1 second. ~200ms is spent in JS compiling (Note: this is native compiling of JS code, not Vue compiling), plus ~330ms in executing JS code during the first phase after which one button is visible and functional. With some added time for network and CSS, this adds to ~1.0s. Good for loading time.

For the result request (button click), the guideline sets 0.1s to acknowledge the interaction, and 1 second for the result to appear. About 700ms of that is spent on the network request to the API (measured in production since Beta is slow). Another 230ms is spent within local JS to make the API result visible. Plus some native style recalc and idle time in between, adds to ~1s. Great!
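The arithmetic behind both estimates can be sketched as a simple budget check. The 1000ms budgets and the phase timings are the figures quoted above; the helper name and phase labels are hypothetical.

```typescript
interface Phase {
  label: string;
  ms: number;
}

// Return how much of the budget is left after the measured phases.
function remainingBudgetMs(budgetMs: number, phases: Phase[]): number {
  const spent = phases.reduce(function (sum, phase) {
    return sum + phase.ms;
  }, 0);
  return budgetMs - spent;
}

// Cold load: 1s guideline, leaving room for network and CSS.
const coldLoadLeft = remainingBudgetMs(1000, [
  { label: "native JS compile", ms: 200 },
  { label: "JS execution until the button is usable", ms: 330 }
]);
// coldLoadLeft is 470

// Result request: 1s guideline, leaving room for style recalc and idle time.
const resultLeft = remainingBudgetMs(1000, [
  { label: "geosearch API request (production)", ms: 700 },
  { label: "local JS to make the result visible", ms: 230 }
]);
// resultLeft is 70
```

The result request has far less headroom than the cold load, which is why the API latency and the 230ms of local JS are the items worth looking at first.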

Urgent concerns

None.

Future potential

The geosearch API request is surprisingly slow at 700ms (production). I wonder if something broke or otherwise regressed there. Perhaps its DB query is lacking an index, or maybe a cache stopped working and it's running expensive code in a tight loop?

The JS code taking 230ms to take local data and make it visible in the UI is suspect. I know Vue is competitive in benchmarks and this is a small template with but a few items to iterate. Perhaps we're using a suboptimal Vue option in Nearby and/or WVUI that makes it that slow?

Per RAIL, it is expected for a button click to be acknowledged in 100ms. As it currently takes ~1s for results to appear, it may be worth rendering an intermediary loading state while you wait for the API. I think this has been developed already, but it seems to have a bug (see below).

One way to hide the latency in the future could be using quick events (T183720), which would make the button interactive for JS without blocking on JS.

See also:

  • Legacy search suggestions, also from api.php, takes 10ms from CDN or 120ms without CDN (vs 700ms).
  • Page Previews (Popups) renders its card in ~20ms with the same CPU profile (vs 230ms).
  • Mustache.js + Nearby template weighs 4kB (vs 99kB).
  • Mustache renders this template in 5ms (vs 230ms).

Bugs

  • During the loading state, the cards are clickable and lead to /wiki/Undefined. That seems unintended.
  • During the loading state, the cards look visually clickable. Perhaps they could be dimmed or pulsing. This shows the limitation of the Typeahead component I suspect, since those are modal and don't expect to render without content. There may be a different loading component available (e.g. spinner or bar), although I like what you have now as it looks visually almost complete, and feels fast. If not already, a loading state might make a good TypeaheadSuggestion feature request.
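One possible shape for a fix to both loading-state issues, as a minimal sketch: placeholder cards carry no URL at all, and the link attributes are derived from that. The `NearbyCard` shape and both function names are hypothetical; the real component props in Nearby and WVUI may look quite different.

```typescript
// Hypothetical card shape, for illustration only.
interface NearbyCard {
  title: string;
  url: string | null;
  isPlaceholder: boolean;
}

// Build skeleton cards that have no link target.
function makePlaceholderCards(count: number): NearbyCard[] {
  const cards: NearbyCard[] = [];
  for (let i = 0; i < count; i++) {
    cards.push({ title: "", url: null, isPlaceholder: true });
  }
  return cards;
}

// Attributes for the card's anchor element.
// While loading there is no href, so a click cannot reach /wiki/Undefined.
function linkAttributes(card: NearbyCard): { href?: string; "aria-disabled"?: "true" } {
  if (card.isPlaceholder || card.url === null) {
    return { "aria-disabled": "true" };
  }
  return { href: card.url };
}
```

The `isPlaceholder` flag could also drive a dimmed or pulsing style, so the skeleton no longer looks clickable.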

The article URLs are generated incorrectly, e.g. derived from page titles with spaces appearing as %20 instead of the official URLs. This causes:

  • Delay from appserver due to likely CDN cache miss.
  • Possibly a 404 Not Found on non-English wikis due to incorrect encoding.
  • Possibly stale content due to being a non-canonical URL and thus missing purges.
  • Full extra network roundtrip due to redirect.

I see that the geosearch API query already includes other metadata. There should be a pageinfo/url property there that you can leverage.
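A minimal sketch of that approach: prefer the URL the API returns, and only fall back to building one locally. It assumes the query also requests `prop=info&inprop=url`, which adds a `canonicalurl` field per page. The `/wiki/` prefix and the simplified encoding in the fallback are assumptions; on-wiki code would normally use `mw.util.getUrl( title )` rather than a hand-rolled helper.

```typescript
// Subset of a page object from the API response (other fields omitted).
interface PageInfo {
  title: string;
  canonicalurl?: string;
}

// Simplified fallback: spaces become underscores, "/" and ":" stay readable.
// Assumes the default "/wiki/$1" article path.
function wikiPath(title: string): string {
  return "/wiki/" + encodeURIComponent(title.replace(/ /g, "_"))
    .replace(/%2F/g, "/")
    .replace(/%3A/g, ":");
}

// Prefer the official URL from the API over anything derived from the title.
function cardUrl(page: PageInfo): string {
  return page.canonicalurl !== undefined ? page.canonicalurl : wikiPath(page.title);
}
```

For example, `wikiPath("San Francisco")` yields `/wiki/San_Francisco` rather than `/wiki/San%20Francisco`, which avoids the redirect.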

Browser history:

  • Start with permalink, or history entry, or press Show nearby in a different location first (example).
  • Press "Show nearby pages" to use your now-current location.
  • Observe that the back button doesn't work. It seems to have overwritten the current history entry instead of pushing a new one.

I saw in the code that the hashchange/popstate handlers are all correctly in place for this already, but they're currently unreachable. There is one replaceState call somewhere instead of pushState (for all but the first click from an empty state).
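The intended behaviour could be sketched as follows: push a new entry whenever the location actually changes, so the back button returns to the previous results. `HistoryLike` is a hypothetical minimal interface that `window.history` satisfies; the function name and the hash format in the usage note are assumptions, not the extension's code.

```typescript
// Minimal subset of the History API used here; window.history matches it.
interface HistoryLike {
  pushState(data: unknown, unused: string, url: string): void;
  replaceState(data: unknown, unused: string, url: string): void;
}

// Record a navigation from currentHash to nextHash.
// A changed location gets its own history entry; an unchanged one is
// rewritten in place so repeated clicks do not pile up duplicates.
function recordLocation(history: HistoryLike, currentHash: string, nextHash: string): void {
  if (currentHash === nextHash) {
    history.replaceState(null, "", nextHash);
  } else {
    history.pushState(null, "", nextHash);
  }
}
```

In the browser this would be called as `recordLocation( window.history, location.hash, '#/coord/52.37,4.89' )`, after which the existing popstate handler can restore the previous location on back.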

Thanks for the detailed report and bug reports. I will open tickets for the issues identified here.

Thanks for all this. I've opened up some tasks. A few questions for you at the end.

The geosearch API request is surprisingly slow at 700ms (production).

Opened T300590. Not a blocker.

During the loading state, the cards are clickable and lead to /wiki/Undefined. That seems unintended.

Created T300588. Blocker for deployment.

During the loading state, the cards look visually clickable. Perhaps they could be dimmed or pulsing. This shows the limitation of the Typeahead component I suspect, since those are modal and don't expect to render without content. There may be a different loading component available (e.g. spinner or bar), although I like what you have now as it looks visually almost complete, and feels fast. If not already, a loading state might make a good TypeaheadSuggestion feature request.

Opened T300589. Not a blocker.

The article URLs are generated incorrectly, e.g. derived from page titles with spaces appearing as %20 instead of the official URLs.

Opened T300593. Blocker.

Browser history:

Created T300594. Not a blocker.

Questions

The JS code taking 230ms to take local data and make it visible in the UI is suspect.

What local data are you talking about here? The rendering of the skeleton state?

I know Vue is competitive in benchmarks and this is a small template with but a few items to iterate. Perhaps we're using a suboptimal Vue option in Nearby and/or WVUI that makes it that slow?

Not that I'm aware of. Did this performance review happen before or after the Vue 2 to Vue 3 switch? If before it might be worth re-running now we have Vue 3 to see if things improved.

Per RAIL, it is expected for a button click to be acknowledged in 100ms. As it currently takes ~1s for results to appear, it may be worth rendering an intermediary loading state while you wait for the API. I think this has been developed already, but it seems to have a bug (see below).

Could you clarify this comment? The intermediate loading state in this situation is the skeleton design pattern described in T300589. Is the bug that:

  1. it's not obvious to you that this is an intermediate state
  2. the URL issue you have pointed out
  3. or something else?

@Krinkle could you please answer my questions above? Just want to make sure I've captured all the required follow-up work here. Thanks in advance.

Closing out. @Krinkle feel free to reply later if there's any actionables in the above. I'll be monitoring this ticket.

[…]

The JS code taking 230ms to take local data and make it visible in the UI is suspect.

What local data are you talking about here? The rendering of the skeleton state?

When the API response is done, and we have the data locally "in our hands" (in memory, parsed, available to JavaScript), it took another 230ms for this information to be on-screen and visible to the user, purely spent in JavaScript/DOM doing things. Compare this to Mustache, which took ~5ms for everything (parse the HTML template, render it with parameters to its final string, parse that into DOM, and attach it for on-screen display).
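To reproduce a number like this outside of devtools, a small timing helper is enough. The sketch below is hypothetical and takes the clock as a parameter so it does not depend on any particular environment; in a browser one would pass `() => performance.now()`. Note that it only captures synchronous work. Vue flushes DOM updates asynchronously, so a real measurement of the Vue path would need to stop the clock after the next tick rather than right after the state change.

```typescript
interface Timed<T> {
  result: T;
  elapsedMs: number;
}

// Run `work` once and report how long it took according to `now`.
function timeMs<T>(work: () => T, now: () => number): Timed<T> {
  const start = now();
  const result = work();
  return { result: result, elapsedMs: now() - start };
}
```

The span of interest starts once the parsed API response is in memory and ends when the cards are attached to the document, which is the 230ms versus ~5ms comparison above.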

I know Vue is competitive in benchmarks and this is a small template with but a few items to iterate. Perhaps we're using a suboptimal Vue option in Nearby and/or WVUI that makes it that slow?

Not that I'm aware of. Did this performance review happen before or after the Vue 2 to Vue 3 switch? If before it might be worth re-running now we have Vue 3 to see if things improved.

As I understand it, change 666434 upgraded Vue 2 to a special build of Vue 3 that is compatible with Vue 2 apps. This was merged and deployed back in December. I believe we are still on that same build today and have not yet transitioned to pure Vue 3.

Per RAIL, it is expected for a button click to be acknowledged in 100ms. As it currently takes ~1s for results to appear, it may be worth rendering an intermediary loading state while you wait for the API. I think this has been developed already, but it seems to have a bug (see below).

Could you clarify this comment? The intermediate loading state in this situation is the skeleton design pattern described in T300589. Is the bug that:

  1. it's not obvious to you that this is an intermediate state
  2. the URL issue you have pointed out
  3. or something else?

The first. Due to the skeleton appearing visually the same as the end result, and it having active but malfunctioning URLs, I thought maybe it was an unintentional side-effect of two asynchronous callstacks competing and e.g. a react-style propagation happening too early with no data available and that perhaps the underlying WVUI widget doesn't actually feature a skeleton state. This is all speculation based on how it appeared visually and how it behaved in the call-stack.

I believe based on your response that it was intended as a skeleton, but that it is not yet working correctly, and could perhaps use a design tweak to look and be less interactable.