On mobile devices, references and notes can account for up to 50% of an article's HTML (see https://www.mediawiki.org/wiki/Reading/Web/Projects/A_frontend_powered_by_Parsoid/HTML_content_research#HTML_size_report). Mobile intends to scrub references from the initial output and lazy load them (with suitable non-JS fallbacks).
Given that an article's references are not needed straight away, it should be possible to obtain them via an API, separately from the rest of the content, and render them via JavaScript.
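As a rough illustration of the scrubbing step, the sketch below removes reference lists from a page's HTML and leaves a plain link behind as the non-JS fallback. It assumes reference lists are rendered as `<ol class="references">` with no nested `<ol>` inside them; the function name `split_references`, the `references-placeholder` class and the `fallback_url` parameter are hypothetical and only serve the example. A real implementation would want a proper HTML parser rather than a regular expression.

```python
import re
from html import escape
from typing import List, Tuple

# Assumption: each reference list is an <ol class="references">...</ol> block
# that contains no nested <ol>. This is a simplification for the sketch.
REFERENCES_RE = re.compile(r'<ol class="references">.*?</ol>', re.DOTALL)


def split_references(page_html: str, fallback_url: str) -> Tuple[str, List[str]]:
    """Return (HTML with reference lists replaced by a link, removed lists)."""
    removed = REFERENCES_RE.findall(page_html)
    # Hypothetical placeholder: a plain link works without JavaScript, and a
    # script could later swap it for references fetched from an API.
    placeholder = (
        '<a class="references-placeholder" href="%s">References</a>'
        % escape(fallback_url, quote=True)
    )
    # A function replacement avoids backslash-escape handling in the placeholder.
    stripped = REFERENCES_RE.sub(lambda _match: placeholder, page_html)
    return stripped, removed
```

The removed lists are returned alongside the stripped HTML so that they could be served separately, which is the part an API would need to cover.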
The references extension does not store an intermediate structure for references anywhere; it just outputs HTML. Building an API to surface an intermediate structure (e.g. a JSON representation) of references would therefore require additional storage. In the worst-case scenario references account for around 50% of the HTML [1], but an intermediate structure is likely to be a lot smaller (@phuedx to investigate this).
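One way to start on the size question is to compare the UTF-8 size of a reference's HTML with a compact JSON encoding of the same information. The sketch below does this for a single made-up reference. The markup is only loosely modelled on typical reference list output, and the JSON field names (`id`, `backlinks`, `text`, `links`) are assumptions rather than an agreed format. One sample says nothing about real savings, so the comparison would need to be run over real articles.

```python
import json

# Made-up reference list item, loosely modelled on typical output.
REFERENCE_HTML = (
    '<li id="cite_note-example-1">'
    '<span class="mw-cite-backlink"><a href="#cite_ref-example-1">^</a></span> '
    '<span class="reference-text">Doe, Jane (2015). '
    '<a rel="nofollow" class="external text" href="https://example.org/report">'
    '"An example report"</a>. Example Press.</span>'
    '</li>'
)

# Hypothetical intermediate structure for the same reference. Links are kept
# as character offsets into the plain text instead of inline markup.
REFERENCE_JSON = {
    "id": "example-1",
    "backlinks": ["cite_ref-example-1"],
    "text": 'Doe, Jane (2015). "An example report". Example Press.',
    "links": [{"href": "https://example.org/report", "start": 18, "end": 37}],
}


def utf8_size(text: str) -> int:
    """Size of the text in bytes when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def compare_sizes(reference_html: str, structure: dict) -> dict:
    """Compare the HTML size with a compact JSON encoding of the structure."""
    html_bytes = utf8_size(reference_html)
    encoded = json.dumps(structure, separators=(",", ":"), ensure_ascii=False)
    json_bytes = utf8_size(encoded)
    return {
        "html_bytes": html_bytes,
        "json_bytes": json_bytes,
        "ratio": json_bytes / html_bytes,
    }


if __name__ == "__main__":
    print(compare_sizes(REFERENCE_HTML, REFERENCE_JSON))
```

References whose text contains rich formatting may need to carry HTML fragments inside the structure, which would narrow the gap between the two sizes.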
[] How much storage would an intermediate JSON structure require?
[] Where should this data be stored?
[1] http://chimeces.com/loot-content-analysis/