Page MenuHomePhabricator

Provide featureful Local Storage abstraction in MediaWiki core
Open, Stalled, NormalPublic

Description

Core has a minimal api for LocalStorage. CentralNotice has a more elaborate system that suits its needs.

Current guidelines suggest using LocalStorage instead of cookies, and the majority of the browsers for which we support JS have LocalStorage.

Do we want something more elaborate in core? Should we re-use some of the CentralNotice code? Use an external library? Or maybe write something new?


In the meantime, jquery.jStorage has TTL (time to live) eviction. (Remember the eviction will only run at times the library is loaded.) (It is marked as deprecated, but mediawiki.storage does not have eviction.)

Event Timeline

AndyRussG raised the priority of this task from to Needs Triage.
AndyRussG updated the task description. (Show Details)
AndyRussG moved this task to Backlog on the Front-end-Standards-Group board.
AndyRussG added a subscriber: AndyRussG.
Restricted Application added subscribers: StudiesWorld, Aklapper. · View Herald TranscriptDec 16 2015, 4:14 PM
Krinkle added a subscriber: Krinkle.Mar 9 2016, 7:19 PM

The performance guidelines says to use localStorage where possible instead of cookies for data not relevant to the server. However actual adoption of that has mostly been blocked organisation wide on T66721: mw.loader.store should not occupy all of localStorage.

The main issue with plain localStorage is the lack of an eviction policy. I would recommend the first iteration focus on that as the sole feature and implement it in a simple and stable fashion using as little overhead as possible.

Things to consider:

  • Race conditions between multiple open tabs. localStorage is shared. Fundraising ran into this with CentralNotice and has already documented and mitigated it to suite their needs. Good starting point.
  • Reads and writes are synchronous I/O.
    • Defer when possible. requestIdleCallback can be used. - https://w3c.github.io/requestidlecallback/
    • Consider using the deadline argument to ensure we don't do block the browser's main thread for too long (jank, scroll/select breakage etc.). Allows dynamically spread over multiple iterations as needed based on device performance.

Having looked at similar implementations in larger libraries, I think the current implementation CentralNotice is pretty straight forward. I think we can simplify it a bit more, but it's a good starting point.

In terms of storing in localStorage it would be best to use the mediawiki.storage module and wrap the mw.storage API to handle your localStorage work. This acknowledges that localStorage is shared and allows us to explore making that library better and adapting it to your use cases.

Change 254326 had a related patch set uploaded (by Krinkle):
Migrate doIdleWork() to standardised requestIdleCallback

https://gerrit.wikimedia.org/r/254326

I'm working with Fundraising to align their kvStore with the requirements for this task.

See also T108849: Move CentralNotice stuff out of cookies and work-in-progress at https://gerrit.wikimedia.org/r/254326.

Once that settles, it'll effectively be a canary customer of the interface that we can then merge into mediawiki.storage. For the public API the only change will be the addition of an expiry parameter. And, akin to mediawiki.cookie, we'll want to provide a sensible default as well as support for sessionStorage in addition to localStorage for values that are only relevant to the session.

See also T110353#2147151 and https://gerrit.wikimedia.org/r/256115.

Krinkle claimed this task.Mar 24 2016, 3:54 AM
Krinkle added a project: Performance-Team.
Krinkle set Security to None.
Krinkle moved this task from Inbox to This Quarter (FY1920Q1 Jul-Sep) on the Performance-Team board.
Krinkle triaged this task as Normal priority.Mar 24 2016, 5:03 AM

Change 254326 merged by jenkins-bot:
kvStoreMaintenance: Refactor to use requestIdleCallback

https://gerrit.wikimedia.org/r/254326

Restricted Application added a subscriber: TerraCodes. · View Herald TranscriptApr 19 2016, 7:46 PM

Next steps here is integrating this back into MediaWiki core as part of the mediawiki.storage module. I propose to let the new implementation of kvStoreMaintenance CentralNotice simmer for a few deployment cycles and in a month or so pick this up again.

Krinkle renamed this task from Study options and libraries for a more featureful LocalStorage API in core to Provide featureful Local Storage abstraction in MediaWiki core.Apr 21 2016, 7:06 PM
Krinkle removed a project: Patch-For-Review.
Krinkle moved this task from Doing to Backlog: Small & Maintenance on the Performance-Team board.

Change 323344 had a related patch set uploaded (by Krinkle):
mediawiki.storage: Provide a wrapper for sessionStorage too

https://gerrit.wikimedia.org/r/323344

Change 323344 merged by jenkins-bot:
mediawiki.storage: Provide a wrapper for sessionStorage too

https://gerrit.wikimedia.org/r/323344

Krinkle closed this task as Resolved.Dec 2 2016, 3:16 AM
Krinkle removed a project: Patch-For-Review.
Krinkle added a project: MediaWiki-General.
Krinkle reopened this task as Open.Jan 18 2017, 3:57 AM

Re-opening since this was merely a SessionStorage wrapper. An expiry/ttl mechanism (which is the minimal required feature) is still missing.

Change 340185 had a related patch set uploaded (by Krinkle; owner: Krinkle):
kvStore: Minor clean up and optimization of kvStoreMaintenance

https://gerrit.wikimedia.org/r/340185

Change 340185 merged by jenkins-bot:
kvStore: Minor clean up and optimization of kvStoreMaintenance

https://gerrit.wikimedia.org/r/340185

Nirmos added a subscriber: Nirmos.Feb 28 2017, 9:23 AM

Have you considered bringing the expiry issue up at standards (specification) level? What are the arguments (if any) that prevents this from being in the standard?

Krinkle added a comment.EditedMar 1 2017, 3:11 AM

@Nirmos Yep. I've brought this to various whatwg members a year or 2 ago.

https://wiki.whatwg.org/wiki/Storage details a summary of the various APIs. https://storage.spec.whatwg.org/ is a draft that would make these different storage APIs collaborate better together (e.g. have a common interface for quotas and persistence/non-persistence). However it unfortunately does not yet deal with time-based expiration or any other eviction mechanism for that matter.

https://github.com/whatwg/storage/issues/11 is an open issue asking for an expiry mechanism.

The most recent storage interface added to the web is Cache and CacheStorage interfaces part of theService Worker spec. https://github.com/w3c/ServiceWorker/issues/863 is an issue for adding a native eviction support, there.

In MediaWiki core, we currently don't have any expiry for the mw.storage interface. We set values without any expiry. And when keys get removed from the system, or if a feature is removed from the platform, or disabled by the user. We either forget about it, or if there's a reasonable place to put it, we'll add small chunk of code that, via requestIdleCallback, checks if the key exists, and removes it. - On every page view. This doesn't scale.

In CentralNotice, there's a custom kvStore interface that needed generic eviction logic because its keys contain identifiers for user-generated content (in this case, CentralNotice tracks which state about page banners that are in rotation, and these have unique IDs). The code is at ext.centralNotice.kvStoreMaintenance.js. Main change was https://gerrit.wikimedia.org/r/254326 (4a01c97129).

  • Starts via requestIdleCallback.
  • Loop through localStorage.length, using localStorage.key and collect relevant keys.
  • Yield, and take another requestIdleCallback to process the keys. This time a recursive callback that reschedules itself until all keys have been processed.

There's a TODO to trigger this logic only on a random sampling of page views, and/or add some locking mechanism via sessionStorage to avoid triggering it from multiple windows simultaneously.

While this seems to be working fine, I'm not happy with the performance numbers for this. Especially on mobile, where unlike desktop browsers, devices typically do not read the entire store into memory after the first key access. So it's repeated disk I/O. Not sure whether this is a browser thing or an OS thing, could be that desktop just has a more effective disk cache due to having more spare memory. Either way, the end result is the same.

Another problem is that, because we can only evict keys when a page is open, space is inefficiently used. I'm sure that when absolutely needed, browsers will evict all stores of an origin to save space, but it would be preferable for keys that are known to be expired to go first, thus leaving more room for higher priority data, and increasing the chances that on a repeat view performance-sensitive values are still cached. Or that that localStorage-saved draft comment the user forgot about didn't get randomly get deleted when some other site requires a lot of space.

This is also relevant to T66721, which is about our JS/CSS module cache in localStorage often not working well in Firefox. Firefox has a shared quota for all subdomains. This means that when our more active users that participate on multiple wikis (Wikimedia hosts over 800 wikis, including 284 active Wikipedia subdomains), you visit a wiki, even once, that module cache is populated, but until you visit that wiki again, we have no way to clear it. The modules may have changed many times since then, and would be pruned immediately when viewed, but still it's taking up precious space and preventing other wikis from claiming space instead (in Firefox). This is somewhat orthogonal, but useful to keep in mind.

Cookies, the oldest value store on the platform (as far as I know), has a key eviction mechanism. But localStorage has neither TTL nor LRU. IndexedDB, WebSQL and the new SW Cache API, also don't have either.

A few random thoughts: When a user logs out, we terminate their "session" from a server perspective. Authentication cookies are also cleared client-side. However, unlike session cookies, client-side sessionStorage relates to the larger browser session. Perhaps users are expected to be wise enough to close browsers on shared computers, but it might make sense to clear sessionStorage by default from our Log out page. Not just for users on public computers, but also to avoid our own "anonymous" event logging from spanning across a logged-in session and logged-out session.

In addition to sessionStorage, we may also want to think about localStorage. Is there a use case for storing personal information in localStorage, and if so, what is our eviction strategy? For example, other sites often store draft-like content in localStorage so that it may be persisted across browser crashes and other unexpected restarts (unlike sessionStorage). However when a user explicitly chooses to log out, they might not expect that to persist. (But then again, what if the log-out happened by accident?) On the other hand, localStorage is already cleared automatically by the same system that clears cookies (e.g. "Clear all cookies", or "Incognito" mode).

Krinkle removed Krinkle as the assignee of this task.Jun 7 2017, 7:47 PM
Krinkle moved this task from Backlog: Small & Maintenance to Radar on the Performance-Team board.

A few random thoughts: When a user logs out, we terminate their "session" from a server perspective. Authentication cookies are also cleared client-side. However, unlike session cookies, client-side sessionStorage relates to the larger browser session. Perhaps users are expected to be wise enough to close browsers on shared computers, but it might make sense to clear sessionStorage by default from our Log out page.

Completely agree with clearing sessionStorage on logout. A lot of users still don't know to close their browsers (or understand what happens when they do).

In the meantime, there is jquery.jStorage. This is marked as deprecated in favor of mediawiki.storage.

However, jStorage has TTL, which mediawiki.storage does not. In my opinion, jStorage should be removed until mediawiki.storage offers some kind of automatic eviction (e.g. TTL or LRU).

Of course, jStorage's eviction will only run if the library is loaded (since its TTL handler needs to run), but that's better than nothing.

Krinkle changed the task status from Open to Stalled.Jun 6 2019, 9:06 AM

Marking as stalled, blocked on browsers having a storage API that

  1. Has support for expiry.
  2. Doesn't have its values send over the network (like cookies).

Without this, our only options for expiring values are unacceptable from a performance perspective.

Ltrlg added a subscriber: Ltrlg.Jul 4 2019, 12:02 PM