To support lazy loading of references, we need a heuristic for deciding whether a reference list should be removed from the content of the page.
To ensure that the intent of the author is not changed, we should only remove references that are not "part of the content". Here is a first pass at a heuristic:
- If the reference list is in the last section (or last n sections) of a page, it should be removed. The entire section including the header should be removed as well.
- Do not remove reference lists within infoboxes.
- Although implicit, don't remove reference lists embedded elsewhere in the page (i.e. not in the last n sections of the page).
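The heuristic above can be sketched roughly as follows. This is only an illustration, not the PCS implementation: the `Section` and `RefList` types, the `in_infobox` flag, and the `last_n` parameter are hypothetical names made up for this example, and it only decides which lists qualify (it does not remove the section or its header).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RefList:
    list_id: str              # hypothetical identifier for a reference list
    in_infobox: bool = False  # True when the list sits inside an infobox


@dataclass
class Section:
    heading: str
    ref_lists: List[RefList] = field(default_factory=list)


def removable_reference_lists(sections: List[Section], last_n: int = 1) -> List[str]:
    """Return the ids of reference lists the heuristic would strip."""
    if last_n <= 0:
        return []
    # Only the trailing sections are candidates; lists embedded elsewhere stay.
    tail = sections[-last_n:]
    return [
        ref.list_id
        for section in tail
        for ref in section.ref_lists
        if not ref.in_infobox  # never strip lists inside infoboxes
    ]
```

With `last_n=1` only a trailing "References" section is affected; a larger `last_n` would also cover pages that end with several sections such as "Notes" followed by "References".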
This should ensure that references that are "part of the content" remain so. It also allows us to remove large lists from the end of a page in order to deliver it performantly and to be able to construct a custom UI on the client.
NOTE: Even if a reference list is NOT removed, it will still be returned by the JSON references API, so there is some minor duplication. However, this should not adversely affect the size of the data, as reference lists within the content (not in a reference section at the end of a page) are typically very small.
The primary goal of lazy loading references is to remove the rather large lists that occur at the end of a page.
From parent ticket:
This is for the next generation of MCS/PCS (Page Content Service) page content.
The goal of this is to reduce the payload for references by replacing the children of the <ol> tags that wrap references with a placeholder element (<div>?). To look for the reference <ol> tags we can use ol[typeof='mw:Extension/references'].
Clients can then request the reference list information of the same revision through another endpoint (e.g. /page/references/:title/:rev) and merge the content into each placeholder.
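As a rough sketch of the server-side step, assuming the page fragment is well-formed XML so it can be parsed with Python's standard `xml.etree.ElementTree`: the selector is the one from this ticket, and the class name follows the placeholder proposal in this ticket. The `data-references-id` attribute and the `references-N` id scheme are assumptions made up for this example, not a decided format.

```python
import xml.etree.ElementTree as ET

# ElementTree equivalent of the CSS selector ol[typeof='mw:Extension/references']
REF_LIST_XPATH = ".//ol[@typeof='mw:Extension/references']"


def strip_reference_lists(root):
    """Replace the children of each reference <ol> with a placeholder <div>.

    Returns a dict mapping a (hypothetical) list id to the removed children.
    """
    removed = {}
    for index, ol in enumerate(root.findall(REF_LIST_XPATH)):
        list_id = f"references-{index}"  # hypothetical id scheme
        removed[list_id] = list(ol)
        for child in removed[list_id]:
            ol.remove(child)
        ol.text = None  # drop any whitespace left between the removed items
        placeholder = ET.SubElement(ol, "div")
        placeholder.set("class", "mw-references-placeholder")
        placeholder.set("data-references-id", list_id)  # assumed attribute name
    return removed
```

Ordinary `<ol>` lists without the `typeof` attribute are left untouched, so only the reference payload is removed from the page.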
The placeholder element needs:
- a class to signify that it is a placeholder for references, so that clients could add onclick handlers if desired (class=mw-references-placeholder) and
- an identifier of the references list so that clients can find the right place in the DOM to replace the placeholder with the original <ol> tag (to get it in the original state). A client may also choose to add a new onclick handler on the <ol> tag for collapsing/opening the references list if desired.
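A minimal sketch of the client-side merge, written in Python only for illustration (a real client would do this in its own DOM). It assumes the placeholder carries a hypothetical `data-references-id` attribute and that the references endpoint yields, per list id, the original `<li>` items as markup strings. Neither the attribute name nor that response shape is defined in this ticket. It follows the "replace the children of the <ol>" reading, so the `<ol>` itself stays in place.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER_XPATH = ".//div[@class='mw-references-placeholder']"


def merge_references(root, references_by_id):
    """Swap each placeholder for the <li> items fetched for its list id.

    references_by_id maps a list id to a list of "<li>...</li>" strings
    (an assumed response shape). Returns the number of lists restored.
    """
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in root.iter() for child in parent}
    merged = 0
    for placeholder in root.findall(PLACEHOLDER_XPATH):
        items = references_by_id.get(placeholder.get("data-references-id"))
        if items is None:
            continue  # leave the placeholder until its data arrives
        ol = parents[placeholder]
        ol.remove(placeholder)
        for markup in items:
            ol.append(ET.fromstring(markup))
        merged += 1
    return merged
```

Placeholders whose id is missing from the response are left untouched, so a partial or failed fetch does not break the page.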
Background
- Web did experiments in this area - https://www.mediawiki.org/wiki/Reading/Web/Projects/Performance/Lazy_loading_references