
[GOAL] Lazy load references in mobile skin
Open, Normal, Public

Description

In the mobile site, the references widget parses the HTML of the page to generate a structured data object representing all references in the page.

When a reference is clicked it shows a "ReferencesDrawer" widget showing the user the reference in place.
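As a rough illustration of the behaviour described above, here is a minimal Python sketch (not the MobileFrontend code, which is JavaScript) that parses page HTML into a dictionary of references and looks one up when a citation link is followed. It assumes each reference is an `<li>` whose `id` starts with `cite_note-`; the names `ReferenceCollector`, `parse_references` and `reference_for_link` are hypothetical.

```python
from html.parser import HTMLParser


class ReferenceCollector(HTMLParser):
    """Collects the text of every <li id="cite_note-..."> into a dict."""

    def __init__(self):
        super().__init__()
        self.references = {}      # reference id -> plain text
        self._current_id = None   # id of the reference being read, if any
        self._depth = 0           # nesting depth of <li> inside a reference

    def handle_starttag(self, tag, attrs):
        if self._current_id is not None:
            # Track nested list items so we close on the right </li>.
            if tag == "li":
                self._depth += 1
            return
        if tag == "li":
            ref_id = dict(attrs).get("id") or ""
            # Assumption: reference items carry ids prefixed "cite_note-".
            if ref_id.startswith("cite_note-"):
                self._current_id = ref_id
                self._depth = 1
                self.references[ref_id] = ""

    def handle_endtag(self, tag):
        if self._current_id is not None and tag == "li":
            self._depth -= 1
            if self._depth == 0:
                text = self.references[self._current_id]
                self.references[self._current_id] = text.strip()
                self._current_id = None

    def handle_data(self, data):
        if self._current_id is not None:
            self.references[self._current_id] += data


def parse_references(page_html):
    """Return {reference id: text} for all references found in the HTML."""
    collector = ReferenceCollector()
    collector.feed(page_html)
    collector.close()
    return collector.references


def reference_for_link(references, href):
    """Resolve a citation link such as '#cite_note-1' to its text, or None."""
    return references.get(href.lstrip("#"))
```

The point of the sketch is that this approach needs the full references HTML to be present in the page before anything can be looked up, which is the cost the task proposes to avoid.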

The references HTML accounts for 50% of the total page HTML on many of our pages, and very few people view references on mobile. Not serving it would save our users many terabytes of data a year. We should therefore not serve references by default. Instead, we should route requests for references to a special page, and use the API when JavaScript is enabled to retrieve references when needed.
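The proposal can be sketched as follows, in Python for brevity. This is an illustration under stated assumptions, not the actual implementation: the `fetch` callable stands in for whatever API call returns a page's references, the class and function names are hypothetical, and the URL layout of the special page is assumed (only the page name Special:Citations appears in the discussion on this task).

```python
from typing import Callable, Dict, Optional
from urllib.parse import quote


def references_fallback_url(title: str) -> str:
    """Link target for clients without JavaScript.

    Assumption: the special page takes the article title as a subpage.
    """
    return "/wiki/Special:Citations/" + quote(title.replace(" ", "_"))


class LazyReferences:
    """Fetches a page's references on first use and caches them."""

    def __init__(self, title: str, fetch: Callable[[str], Dict[str, str]]):
        self.title = title
        self._fetch = fetch   # hypothetical API client, injected by the caller
        self._cache: Optional[Dict[str, str]] = None

    def get(self, ref_id: str) -> Optional[str]:
        # Nothing is requested until a reference is actually asked for.
        if self._cache is None:
            self._cache = self._fetch(self.title)
        return self._cache.get(ref_id)
```

With this shape, a page view that never opens a reference costs no extra request, and repeated clicks on the same page cost one request in total.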

Further reading:

Problems with existing implementation

Event Timeline

Jdlrobson renamed this task from Lazy load references in mobile skin to [GOAL] Lazy load references in mobile skin. May 17 2016, 4:54 PM

Have we discussed how this can impact searching for content (Google)? If I understand correctly, the references are moved away to another page. Will that page be excluded from indexing ("noindex,nofollow"), or will we let it be indexed?

The content is loaded on demand when the user clicks the link, and Google and other search engines will not do that. In the worst case we lose the search hits we get today when people search for text from one of the references, and that is bad. I've tested it on mobile and we definitely get hits on the current references.

@Jdlrobson Should we change Special:Citations to be indexed/follow as @Peter.Hedenskog says?

I've checked that we don't have a robots directive for those, at least on beta, so it will work fine.

But if you hit the Citations page there's no way to come back to the article.

I also think we should highlight that we change the behavior (one extra click) if you do a search and end up on the Citations page (I'm not saying it's not worth it, only that we should make that visible if we haven't discussed it before).

Tbayer added a comment. Edited May 19 2016, 7:02 PM

Agree that removing the citations from search engine indices could cause huge problems. But perhaps this is mitigated by the fact that rel=canonical still points to the desktop version?

But if you hit the Citations page there's no way to come back to the article.
I also think we should highlight that we change the behavior (one extra click) if you do a search and end up on the Citations page (I'm not saying it's not worth it, only that we should make that visible if we haven't discussed it before).

This is true, but I suspect most users will hit the back button. A simple "return to page" link under the title should suffice. Can you create a card? I don't think we should be indexing this.

Agree that removing the citations from search engine indices could cause huge problems. But perhaps this is mitigated by the fact that rel=canonical still points to the desktop version?

SEO is a bit of a mystery, but as I understand it the desktop URL is the canonical URI and is what gets indexed, while the mobile site is only used for optimising. JavaScript is run on indexed pages in some form. If someone is seriously concerned and has a Google contact, it would be great to clarify that rather than guess.

dr0ptp4kt added a subscriber: dr0ptp4kt.

Removing sprint 73 tag as not all subtasks are in the current sprint. This is just protocol. I created a separate placeholder goal task.

Jdlrobson lowered the priority of this task from High to Normal. Apr 20 2017, 11:12 PM

Dropping priority to reflect current situation.

Jdlrobson updated the task description. Jul 13 2017, 4:36 PM
dr0ptp4kt moved this task from Backlog to Feature on the Reading-Admin board. Jul 20 2017, 9:46 PM

What's the status of this goal? Stalled?

I found this task when searching for goal-related Phab tasks 🙂

See Problems with existing implementation and T146396
The existing implementation is a little slow on large articles for the first view, but fine on the majority. When the new RESTBase references endpoint is available we will be able to speed this up significantly.

We rolled it out to the whole of Russian Wikipedia for a month and were aware of no issues with that, so really it's a question of whether to promote this feature from beta to stable or remove the code.

The advantage of doing so is byte savings on all articles (up to 50% of the HTML), but given that HTML is smaller than images, the savings are less significant than those from the lazy loaded images initiative (which got all the attention).

Product decision from @ovasileva is needed. We should do some work either way - removing the code or promoting it.

The advantage of doing so is byte savings on all articles (up to 50% of the HTML), but given that HTML is smaller than images, the savings are less significant than those from the lazy loaded images initiative (which got all the attention).

Removing HTML is not only about size. A huge DOM tree takes time to parse, layout and paint and that definitely has a toll on less powerful devices.

When researching this we didn't measure the performance impact of having a reduced DOM tree. That would be very interesting to know, as it could give us additional reasons to push for this feature.

Removing HTML is not only about size. A huge DOM tree takes time to parse, layout and paint and that definitely has a toll on less powerful devices.

Right, and I think this is valuable work. I just wanted to flag that it was pushed down the priority list because lazy loaded images was seen as a more significant change. TBH even with the slow API calls I think this is an improvement and should be shipped.

bearND added a comment. Jan 5 2018, 6:32 PM

We're trying to come up with a JSON API for references. You may want to check out https://www.mediawiki.org/wiki/Page_Content_Service/References and leave feedback at T170690.

Side note: I'd be very interested in the testing setup for the DOM tree performance comparison, as I could use it elsewhere myself.

It is a hard topic to check, I'm guessing real device testing would be needed, as the devtools can only go so far with the network and CPU throttling (no memory constraints, and other real device troubles).

A quick (inaccurate, anecdotal) comparison shows nothing big in terms of performance in the Chrome Devtools (slow 3g, 6x slowdown, after 1 refresh), but if you look closer you can see some differences (see nodes and listeners for example):

  • Barack Obama en.wiki
  • Barack Obama en.wiki (beta, with lazy references)

mode    nodes     listeners  memory
stable  22K-45K   3K-6K      229MB
beta    7K-15K    20-1.5K    217MB

So there are big differences when not loading the references DOM, but we haven't found a good way to measure the impact on real devices. I'd also be very interested to hear how to test the DOM performance, as I'm not sure how we would do it.
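To put the anecdotal DevTools figures quoted above into proportion, the relative reductions can be worked out directly. This is only arithmetic on the numbers from that single comparison; pairing the low end of one range with the low end of the other (and high with high) is an assumption made here for illustration.

```python
def relative_reduction(before, after):
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


# Figures from the anecdotal stable vs beta comparison (Barack Obama, en.wiki).
nodes_low = relative_reduction(22_000, 7_000)    # low ends of the node ranges
nodes_high = relative_reduction(45_000, 15_000)  # high ends of the node ranges
memory = relative_reduction(229, 217)            # MB

for label, value in [("nodes (low)", nodes_low),
                     ("nodes (high)", nodes_high),
                     ("memory", memory)]:
    print(f"{label}: {value:.0%} smaller")
```

On these numbers the DOM has roughly two thirds fewer nodes with lazy references, while memory drops by only about 5%.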

My guess would be something like this: save static HTML files (with and without references), serve them locally, and load the pages on a real device, with Chrome DevTools on a computer connected to the device's Chrome to record. Do a bunch of runs, save the HARs, and compute the averages and deviation. Repeat for a few URLs.

Very manual, it would be great if we can find a better way to do this kind of stuff.
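The averaging step of that procedure is the easiest part to automate. Below is a minimal sketch, assuming each run was saved as a `.har` file in one directory and that the page load time is read from `log.pages[0].pageTimings.onLoad` (a field from the HAR format). The function names and the directory layout are hypothetical.

```python
import json
import statistics
from pathlib import Path


def onload_ms(har):
    """Page load time in milliseconds of the first page in a parsed HAR."""
    return har["log"]["pages"][0]["pageTimings"]["onLoad"]


def summarize(values):
    """Number of runs, mean and sample standard deviation of the timings."""
    return {
        "runs": len(values),
        "mean": statistics.mean(values),
        # stdev needs at least two data points.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def summarize_har_dir(directory):
    """Summarize every *.har file in `directory` (one file per run)."""
    values = [
        onload_ms(json.loads(path.read_text(encoding="utf-8")))
        for path in sorted(Path(directory).glob("*.har"))
    ]
    return summarize(values)
```

Running `summarize_har_dir` once on the runs with references and once on the runs without them gives two summaries that can be compared per URL.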

@Volker_E FYI @Peter has been doing on-device tests at T184527: Test performance win with lazy references on a real mobile phone, and has also improved docs, see T184681: Document how to run performance tests on real phones. He has documented his steps and methods up until now, in case you are interested 👍.

Given https://bugs.chromium.org/p/chromium/issues/detail?id=849106, I'm wondering if it would be useful to finish up this work now.

T193221 is our own task for the Chrome upstream bug.

Here is how I see it: this is a generic Chrome problem, so it affects all sites on H2 and slow connections that have render-blocking CSS, and the bug seems to be getting quite a lot of attention from the Chrome developers. So I think we should sit tight for a while and see whether they have a way of fixing it. If there's no way for them, then we can take on lazy loading and/or think about other solutions.

Sounds good to me!

Change 506594 had a related patch set uploaded (by Jdlrobson; owner: Jdlrobson):
[mediawiki/extensions/MobileFrontend@master] Remove the lazy load references beta feature

https://gerrit.wikimedia.org/r/506594