
Audit use of cookies
Closed, Resolved (Public)

Assigned To
Authored By
ori
Aug 26 2015, 4:31 PM
Referenced Files
F22656863: capture.png
Jun 26 2018, 1:24 AM

Description

Cookies are deleterious to site performance because they hurt cache efficiency and because they bloat payload size. The bloat added by cookies is significant despite their relatively small byte size due to TCP slow-start.

We should go over all cookies and work to eliminate as many of them as possible. At the time of writing (26-Aug-2015), localStorage is available for 92.76% of traffic. MediaWiki extensions which currently rely on cookies could use localStorage instead, except when the adverse effect on the user would be extremely high (for instance, the site would be unusable for a user if fundraising CentralNotices were displayed unconditionally).

Documented at: https://www.mediawiki.org/wiki/Performance_guidelines#Cookies


Overview of cookies found on Wikimedia wikis as of 2017-02-27

All page loads (all wikis, all users, logged-in and logged-out)

| Name | Source | Purpose | Expiry | Comment |
| --- | --- | --- | --- | --- |
| CP (all-frontend) | Varnish | Connection properties (e.g. HTTP/1.1 vs HTTP/2) | Session | No longer used. Removed. |
| GeoIP | Varnish (text-frontend) | Geo-location for CentralNotice banners | Session | Keep. Only known server-side, needed client-side. |
| WMF-Last-Access (all-frontend) | Varnish | Analytics | 32 days | Keep. Used server-side (HttpOnly). |
| WMF-Last-Access-Global (all-frontend) | Varnish | Analytics | 32 days | Keep. Used server-side (HttpOnly). |

Most page loads (conditional, but possible on all wikis, all users)

| Name | Source | Purpose | Expiry | Comment |
| --- | --- | --- | --- | --- |
| <wiki-id>mwuser-sessionId | mediawiki.js | Generic client-side session id | Session | Moved to sessionStorage. https://gerrit.wikimedia.org/r/340236 (1.29.0-wmf.15) |
| mediawikiwikiGeoFeaturesUser2 | WikimediaEvents JS | User token | 10 minutes | Moved to sessionStorage. https://gerrit.wikimedia.org/r/340232 (1.29.0-wmf.14) |
| dismissSiteNotice | DismissableSiteNotice JS | Seen state | 30 days | Unsure. |
| centralnotice_hide_* | CentralNotice JS | Seen state | 7 days | Moving to localStorage. T108849 |
| centralnotice_hide_fundraising | CentralNotice JS | Seen state | 250 days | Moving to localStorage. T108849 |

Editing

| Name | Source | Purpose | Expiry | Comment |
| --- | --- | --- | --- | --- |
| centralauth_Session | CentralAuth PHP | SUL | Session | Keep. Needed server-side for central login (shared by multiple subdomains by CentralAuth; HttpOnly). |
| centralauth_Token | CentralAuth PHP | SUL | Configurable | Unsure. Used server-side by CentralAuth. (HttpOnly) |
| centralauth_User | CentralAuth PHP | SUL | Configurable | Unsure. Used server-side by CentralAuth. (HttpOnly) |
| forceHTTPS | MediaWiki PHP | | 30 days | Unsure. Used by CentralAuth? (HttpOnly). |
| <wiki-id>Session | MediaWiki PHP | Login/Session | Session | Keep. Needed server-side for user log-in and other session logic (MediaWiki core; HttpOnly). |
| <wiki-id>UserID | MediaWiki PHP | non-SUL Login | Configurable | Unsure. Used for "remember me"? (HttpOnly) |
| <wiki-id>Token | MediaWiki PHP | non-SUL Login | Configurable | Unsure. Used for "remember me"? (HttpOnly) |
| <wiki-id>UserName | MediaWiki PHP | | Configurable | Unsure. (HttpOnly) |
| <wiki-id>templates-used-list | MediaWiki JS | Collapse/expand state | 30 days | Moved to localStorage. https://gerrit.wikimedia.org/r/340243 (1.29.0-wmf.15) |
| VEE | VisualEditor JS + PHP | Preferred editor mode | 30 days | Move to localStorage? Currently used server-side for logged-in users as well. May be movable to the user preference system. See T181933. |
WARNING: Moving things to HTML5 sessionStorage (mw.storage.session) can be done freely. However, take care when moving things to localStorage, as a proper expiry strategy is still being worked on. Avoid localStorage for the time being when dealing with variable keys. For one or two fixed keys, we can deal with expiry and clean-up on a case-by-case basis.
WARNING: When moving keys to sessionStorage or localStorage (mw.storage), beware that there is no cookie prefix by default. If values must vary by wiki, then wgCookiePrefix must be manually made part of the key.
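
The two warnings above can be illustrated with a small sketch. It is not the expiry strategy that is still being worked on, and it does not call mw.storage directly. The `KeyValueStore` interface, `MemoryStore`, `wikiKey`, `setWithExpiry` and `getWithExpiry` are hypothetical names introduced here. In MediaWiki the prefix would presumably come from `mw.config.get( 'wgCookiePrefix' )`; the value `svwiki` is inferred from cookie names such as svwikiSession, and the key name is made up.

```typescript
// Minimal key/value interface. window.sessionStorage and window.localStorage
// both satisfy it; MemoryStore lets the sketch run outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
  removeItem(key: string): void {
    this.data.delete(key);
  }
}

// Hypothetical helper: make a key vary by wiki by prepending the cookie prefix.
function wikiKey(cookiePrefix: string, name: string): string {
  return cookiePrefix + name;
}

// Hypothetical helper: store the value together with an absolute expiry time.
function setWithExpiry(store: KeyValueStore, key: string, value: string, ttlMs: number, nowMs: number): void {
  store.setItem(key, JSON.stringify({ value, expires: nowMs + ttlMs }));
}

// Hypothetical helper: return the value, or remove the entry and return null
// once it has expired or if it cannot be parsed.
function getWithExpiry(store: KeyValueStore, key: string, nowMs: number): string | null {
  const raw = store.getItem(key);
  if (raw === null) {
    return null;
  }
  try {
    const entry = JSON.parse(raw) as { value: string; expires: number };
    if (nowMs < entry.expires) {
      return entry.value;
    }
  } catch {
    // Unparseable entry: fall through and discard it.
  }
  store.removeItem(key);
  return null;
}

// Usage with an assumed prefix and a made-up key name, 30-day lifetime.
const demoStore = new MemoryStore();
const demoKey = wikiKey('svwiki', 'example-collapsed-state');
setWithExpiry(demoStore, demoKey, 'collapsed', 30 * 24 * 60 * 60 * 1000, Date.now());
console.log(demoKey, getWithExpiry(demoStore, demoKey, Date.now()));
```

Expired entries are only cleaned up when they are read, which is one reason this approach suits a small number of fixed keys rather than variable keys.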



Event Timeline

> forceHTTPS | MediaWiki PHP | 30 days | Keep. Used server-side. httpOnly by default.

It seems fine for MediaWiki to have forceHTTPS as a feature, and for that feature to require a cookie. But per @Tgr's comment, does it make sense for HTTPS-only wikis, such as ours? Perhaps we can have this cookie somehow omitted on HTTPS-only wikis.

> forceHTTPS | MediaWiki PHP | 30 days | Keep. Used server-side. httpOnly by default.

Note I didn't write the "Keep" bit, you did in T110353#3059304.

> does it make sense for HTTPS-only wikis, such as ours? Perhaps we can have this cookie somehow omitted on HTTPS-only wikis.

I can't think of any particular need for it in that situation.

Change 340236 merged by jenkins-bot:
mediawiki.user: Move JS session token from cookie to sessionStorage

https://gerrit.wikimedia.org/r/340236

Should probably mention that I have three different StewardVoteEligible cookies. The name is the same (StewardVoteEligible). The content is the same (1). The domain is the same (sv.wikipedia.org). The only thing that differs is the path (/w, /wiki and /wiki/Anv%C3%A4ndare:Nirmos).

> Should probably mention that I have three different StewardVoteEligible cookies. The name is the same (StewardVoteEligible). The content is the same (1). The domain is the same (sv.wikipedia.org). The only thing that differs is the path (/w, /wiki and /wiki/Anv%C3%A4ndare:Nirmos).

Fixed, revision 16372260.
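
For context on why the three same-named cookies could coexist: browsers store cookies keyed by name, domain and path, so entries that differ only in path are separate cookies. Which of them is sent depends on the request path. The following is a minimal sketch of the path-match rule from RFC 6265 (section 5.1.4), assuming the browser follows that rule. `pathMatches` and `cookiePathsSentFor` are hypothetical helper names, and `/wiki/Example` is a made-up page.

```typescript
// RFC 6265 path-match: the cookie path matches when it equals the request
// path, or is a prefix of it that ends at a "/" boundary.
function pathMatches(requestPath: string, cookiePath: string): boolean {
  if (requestPath === cookiePath) {
    return true;
  }
  if (!requestPath.startsWith(cookiePath)) {
    return false;
  }
  return cookiePath.endsWith('/') || requestPath.charAt(cookiePath.length) === '/';
}

// Return the cookie paths whose cookies would accompany a request.
function cookiePathsSentFor(requestPath: string, cookiePaths: string[]): string[] {
  return cookiePaths.filter((cookiePath) => pathMatches(requestPath, cookiePath));
}

// The three paths reported in the comment above.
const reportedPaths = ['/w', '/wiki', '/wiki/Anv%C3%A4ndare:Nirmos'];

for (const requestPath of ['/w/index.php', '/wiki/Example', '/wiki/Anv%C3%A4ndare:Nirmos']) {
  console.log(requestPath, cookiePathsSentFor(requestPath, reportedPaths));
}
```

Note that `/w` does not match `/wiki/...` requests, because the character following the prefix is not a slash.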

Just so that it isn't forgotten: xxwikiwikibase-entity-usage-list is not included in https://gerrit.wikimedia.org/r/340243

Change 340243 merged by jenkins-bot:
[mediawiki/core] mediawiki.action.edit: Move collapsibleFooter cookies to localStorage

https://gerrit.wikimedia.org/r/340243

> Just so that it isn't forgotten: xxwikiwikibase-entity-usage-list is not included in https://gerrit.wikimedia.org/r/340243

Thanks. Comes from Wikibase:/client/resources/wikibase.client.action.edit.collapsibleFooter.js, which was copied from core's collapsibleFooter.js.

Change 345084 had a related patch set uploaded (by Krinkle):
[mediawiki/extensions/Wikibase@master] collapsibleFooter: Move client-side state cookies to localStorage

https://gerrit.wikimedia.org/r/345084

Change 345084 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] collapsibleFooter: Move client-side state cookies to localStorage

https://gerrit.wikimedia.org/r/345084

Does somebody know why cookies CP, forceHTTPS or even WMF-Last-Access and VEE need to be stored for each subdomain separately? With current cookie limits, such a multiplication of redundant cookies makes it practically impossible for useful site-specific cookies (for instance session or sitenotice dismissals) to survive.

forceHTTPS isn't stored for each subdomain separately, at least not when you're logged in with an SUL account which everyone should be by now. If we want to get rid of the cookie entirely on HTTPS-only wikis, the check for "is this an HTTPS-only wiki?" should probably be added to [[https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/session/CookieSessionProvider.php;ea89bd1cf6c9b95bf6e8728e5466277b042122a7$268|MediaWiki\Session\CookieSessionProvider::setForceHTTPSCookie()]] and [[https://phabricator.wikimedia.org/diffusion/ECAU/browse/master/includes/session/CentralAuthSessionProvider.php;0152d998c15bf3774560cd284329a44d7c62f9bb$419|CentralAuthSessionProvider::setForceHTTPSCookie()]], and either just return or set $set to false. For the CentralAuth case, ideally the test would be "are all wikis in this SUL-grouping HTTPS-only?", but that might be hard to determine unless we just add a flag specifically specifying that.

Apparently Analytics has some need for WMF-Last-Access to be per-subdomain, since they also have WMF-Last-Access-Global that's not per-subdomain.

I don't know what CP is used for, so I don't know whether it needs to be per-subdomain or not.

I also don't know whether people would complain if VEE started being per-subdomain and therefore started affecting the VE/not-VE behavior in some cross-wiki situations but not others. It looks like it's currently being used for some server-side logic, so changing it to localStorage may not work either.

> If we want to get rid of the cookie entirely on HTTPS-only wikis, the check for "is this an HTTPS-only wiki?" should probably be added to...

And fix T118413 as right now Wikimedia wikis are not HTTPS-only as far as MediaWiki is aware.

> Does somebody know why cookies CP, forceHTTPS or even WMF-Last-Access and VEE need to be stored for each subdomain separately?

WMF-Last-Access counts unique devices per wiki; setting it on a higher-level domain would defeat that purpose entirely.
VEE is probably expected to be configurable per-wiki, like all other user settings.
CP (connection properties) is for HTTP/2 support; I think it could be moved one domain higher.

> VEE is probably expected to be configurable per-wiki, like all other user settings.

And like all user settings there's a desire for it to be global (T16950). If I manually set a "VEE" cookie for the *.wikipedia.org domain (e.g. via http://www.editthiscookie.com/ ) and I go edit a page, I still get the VisualEditor splashscreen popup (T136076) and, after clicking, a new "VEE" cookie is set for the subdomain. Can VisualEditor be instructed to follow what a cookie for the parent domain says and avoid duplicating it?

Except a cookie can't actually implement a global preference on Wikimedia sites since the sites are spread across multiple second-level domains. Look how much trouble CentralAuth has to go through to make login work.

Also, the "global with local override" model we use for user preferences cannot be easily implemented with cookies, as there is no way to tell which domain a cookie is set on, and if there are cookies on both en.wikipedia.org and .wikipedia.org, it's unspecified which will override the other.
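
The point above can be seen in the request itself: the Cookie header carries only name=value pairs, and the Domain and Path attributes are not sent back to the server. The sketch below parses such a header and shows that two cookies with the same name arrive as two bare values. `parseCookieHeader` is a hypothetical helper, and the header string and its values are made up for illustration.

```typescript
// Parse a Cookie request header into name -> list of values, in header order.
function parseCookieHeader(header: string): Map<string, string[]> {
  const result = new Map<string, string[]>();
  for (const part of header.split(';')) {
    const pair = part.trim();
    const eq = pair.indexOf('=');
    if (eq <= 0) {
      continue; // skip empty segments and pairs without a name
    }
    const name = pair.slice(0, eq).trim();
    const value = pair.slice(eq + 1).trim();
    const values = result.get(name) || [];
    values.push(value);
    result.set(name, values);
  }
  return result;
}

// Made-up header: one VEE cookie set on en.wikipedia.org and one on
// .wikipedia.org would both be sent, with nothing to tell them apart.
const exampleHeader = 'VEE=wikitext; GeoIP=example; VEE=visualeditor';
console.log(parseCookieHeader(exampleHeader).get('VEE'));
```

A server or script reading this header has no reliable way to decide which of the two VEE values is the "local" one and which is the "global" one.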

Maybe the VEE cookie could be removed when the user is logged in (and so the setting comes from a user preference, not a cookie). Anything beyond that is not worth the effort IMO.

Now that I think of it, why is a cookie used in the first place for this preference? Can't it rely on a hidden preference like ULS does?

Imarlier renamed this task from Get rid of cookies to Audit use of cookies. (Jan 18 2018, 3:43 PM)

> Now that I think of it, why is a cookie used in the first place for this preference? Can't it rely on a hidden preference like ULS does?

You already know this by now, but just for the record: VE mainly uses cookies for this because logged-out editors don't have a place to store preferences. But for registered users, this should indeed be avoided. (Task T181933 was already updated to reflect this.)

> CP | Varnish | Connection properties (e.g. HTTP/1.1 vs HTTP/2) | Session | Keep. Only known server-side, needed client-side.

I've been noticing this quite a lot when debugging local storage and cookie use on Wikimedia sites. I think there's room to optimise this a bit.

Right now we're setting this on *all* Varnish responses, including outside text-frontend (unlike GeoIP and WMF-Last-Access), such as for design.wikimedia.org and for script inclusions from piwik.wikimedia.org.

I would propose to at least limit this to text varnishes, but looking more widely, I don't think we need this at all anymore. It was added for client-side logging in the Navigation Timing extension (in 2015, with aec1abe73c; T119014), but that use was removed in February of this year (b3eccc716; T186295).

Looking in Code Search, I don't see any other use of this cookie field in any JavaScript code indexed by it.

Change 437774 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] varnish: Remove setting of CP cookies

https://gerrit.wikimedia.org/r/437774

Change 437774 merged by Ema:
[operations/puppet@production] varnish: Remove setting of CP cookies

https://gerrit.wikimedia.org/r/437774

Imarlier subscribed.

@Krinkle to figure out what next step we can take.

Here are the cookies I have, if that helps:

centralauth_Session
centralauth_Token
centralauth_User
cpPosIndex
forceHTTPS
GeoIP
loginnotify_prevlogins
svwikircfilters-toplinks-collapsed-state
svwikiSession
svwikiUserID
svwikiUserName
UseCDNCache
UseDC
VEE
wikiEditor-0-booklet-characters-page
wikiEditor-0-booklet-help-page
WMF-Last-Access-Global
WMF-Last-Access

I've gone through the cookie table in the task description and explicitly marked those pending a decision as "Unsure". Those are the ones we have left to deal with and/or agree to keep.

@Nirmos Thanks:

  • centralauth_Session, centralauth_Token, centralauth_User: (See task description.)
  • forceHTTPS, GeoIP: (See task description.)
  • svwikiSession, svwikiUserID, svwikiUserName, VEE: (See task description.)
  • WMF-Last-Access-Global, WMF-Last-Access: (See task description.)
  • cpPosIndex, UseCDNCache, UseDC: Should only exist for a few seconds after a write action. Can you confirm that these no longer exist after browsing around for 2 minutes without performing write actions (e.g. no pref/watchlist changes, no edit/log actions)?
  • loginnotify_prevlogins: I don't know about this one. We should look at this.
  • svwikircfilters-toplinks-collapsed-state: We should look at this.
  • wikiEditor-0-booklet-*-page: We should look at this.

cpPosIndex, UseCDNCache and UseDC definitely do not go away with two minutes of browsing. Also, I started my computer this evening (the 25th) and as you can see these cookies have an "Expires" date for the 24th, i.e. a date that has already occurred.

Full information:

{
    "cpPosIndex": {
        "CreationTime": "Sun, 24 Jun 2018 19:11:49 GMT",
        "Domain": "sv.wikipedia.org",
        "Expires": "Sun, 24 Jun 2018 19:12:49 GMT",
        "HostOnly": true,
        "HttpOnly": true,
        "LastAccessed": "Sun, 24 Jun 2018 19:11:49 GMT",
        "Path": "/",
        "Secure": true,
        "sameSite": "Unset"
    },
    "UseCDNCache": {
        "CreationTime": "Sun, 24 Jun 2018 19:11:49 GMT",
        "Domain": "sv.wikipedia.org",
        "Expires": "Sun, 24 Jun 2018 19:11:59 GMT",
        "HostOnly": true,
        "HttpOnly": true,
        "LastAccessed": "Sun, 24 Jun 2018 19:11:49 GMT",
        "Path": "/",
        "Secure": true,
        "sameSite": "Unset"
    },
    "UseDC": {
        "CreationTime": "Sun, 24 Jun 2018 19:11:49 GMT",
        "Domain": "sv.wikipedia.org",
        "Expires": "Sun, 24 Jun 2018 19:11:59 GMT",
        "HostOnly": true,
        "HttpOnly": true,
        "LastAccessed": "Sun, 24 Jun 2018 19:11:49 GMT",
        "Path": "/",
        "Secure": true,
        "sameSite": "Unset"
    }
}
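
The timestamps in the dump above can be checked mechanically. The sketch below copies only the CreationTime and Expires fields from the dump, computes the lifetime each cookie was given, and tests whether it had expired by the time of the observation. `CookieRecord`, `lifetimeSeconds` and `isExpired` are hypothetical names, and the observation time is an assumption (the comment only says "this evening (the 25th)").

```typescript
interface CookieRecord {
  CreationTime: string;
  Expires: string;
}

// CreationTime and Expires as reported in the dump.
const reportedCookies: Record<string, CookieRecord> = {
  cpPosIndex: {
    CreationTime: 'Sun, 24 Jun 2018 19:11:49 GMT',
    Expires: 'Sun, 24 Jun 2018 19:12:49 GMT',
  },
  UseCDNCache: {
    CreationTime: 'Sun, 24 Jun 2018 19:11:49 GMT',
    Expires: 'Sun, 24 Jun 2018 19:11:59 GMT',
  },
  UseDC: {
    CreationTime: 'Sun, 24 Jun 2018 19:11:49 GMT',
    Expires: 'Sun, 24 Jun 2018 19:11:59 GMT',
  },
};

// Lifetime in seconds: Expires minus CreationTime.
function lifetimeSeconds(cookie: CookieRecord): number {
  return (Date.parse(cookie.Expires) - Date.parse(cookie.CreationTime)) / 1000;
}

// True when the Expires time is at or before the given time.
function isExpired(cookie: CookieRecord, nowMs: number): boolean {
  return Date.parse(cookie.Expires) <= nowMs;
}

// Assumed observation time: 25 June 2018, 18:00 UTC (the exact time is a guess).
const observedAtMs = Date.UTC(2018, 5, 25, 18, 0, 0);

for (const name of Object.keys(reportedCookies)) {
  const cookie = reportedCookies[name];
  console.log(name, lifetimeSeconds(cookie) + 's', isExpired(cookie, observedAtMs) ? 'expired' : 'live');
}
```

With these values, cpPosIndex was given 60 seconds and the other two were given 10 seconds, so all three were long past their Expires time when the dump was taken.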

@Nirmos Hm.. That's interesting. How are you viewing this data exactly? I'm surprised to see cookies with an expiry date in the past.

Cookies are essentially bits of data that a website sends to a browser (not stored by the website, only by the browser). The website can use and associate this with your browser as long as your browser keeps showing the same data every time you view a page on that website.

Once the browser stops showing the website this data, the cookie effectively no longer exists. It is possible that the browser is storing them somewhere internally for some amount of time, but that is beyond our control. Perhaps the browser is keeping them because of a "development" mode being enabled (to be able to monitor older cookies). Or perhaps the browser hasn't deleted them yet to save power and hard-drive usage. Either way, this is not something we can control.

Can you inspect the Network tab to see if these expired cookies are still in use? Here is an example of my browser showing cookies to the server on a page view.

capture.png (816×1 px, 200 KB)

> How are you viewing this data exactly?

I used the Storage tab in Firefox. If I instead use the Network tab, cpPosIndex, UseCDNCache and UseDC are not present.

There is also svwiki-mw-tour, set to expire 1969-12-31T23:59:59.000Z (and yes, I actually can see this under Network → Cookies → Request Cookies in Chrome, not just under Application → Cookies).

Edit: To clarify, I can see the cookie in both places. The 1969-12-31T23:59:59.000Z expiry is only under Application → Cookies. Under Network → Cookies → Request Cookies, the expiry is N/A.

This has become a tracking task which isn't particularly useful or closeable

There are still issues described here that I think are worth addressing. I'm moving it to potential goals, to consider turning it into a quarterly theme with a limited scope (just the currently known issues, potentially even less than that) that we can schedule as our focus for a given future quarter and close out afterwards.

The current focus theme for Q1 and Q2 is T202154. This could be a candidate for Q3, for example.

Krinkle claimed this task.

Considering this resolved. T121646 has been updated to reflect that for most things LocalStorage is actually unsuitable and undesirable in terms of perf characteristics.

For the purpose of this task, we've gotten rid of some cookies we learned were not needed. And we migrated some things to use HTTP-only cookies or sessionStorage instead.

For the rest, per T121646, cookies are now once again the recommended strategy.