There have been numerous discussions about logging client-side errors, all of which have stalled for a variety of reasons. This proposal describes a practical approach to logging client-side errors that uses technologies already deployed and actively maintained by WMF, and that leverages the work SRE is already doing on this infrastructure.
We propose that client-side errors be caught, normalised, encoded, and sent to a "beacon endpoint" (i.e. `/beacon/error`). Requests to that endpoint are tailed, and the associated information is formatted and added to Kafka on a well-known topic. Logstash then consumes that stream of information and transforms it as necessary.
Some well-known examples of this "requests to beacon endpoint to Kafka to `$consumer`" pipeline are [[ https://wikitech.wikimedia.org/wiki/Analytics/Systems/EventLogging | EventLogging ]] and [[ https://wikitech.wikimedia.org/wiki/Graphite#statsv | statsv ]].
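As a rough illustration of the client-side half of this pipeline (catch, normalise, encode, send), here is a minimal sketch. The function names and the `name`/`message`/`stack` fields are assumptions made for illustration; this proposal does not yet define the payload format, and the use of an `Image` to issue the GET request is just one possible transport.

```lang=js
// Path taken from the example in this proposal.
const BEACON_URL = '/beacon/error';

// Normalise any thrown value into a flat object with predictable string fields.
// The field names are hypothetical.
function normaliseError( error ) {
	return {
		name: String( ( error && error.name ) || 'Error' ),
		message: String( ( error && error.message ) || error ),
		stack: String( ( error && error.stack ) || '' )
	};
}

// Encode the normalised error as a query string on the beacon URL.
function encodeError( normalised ) {
	const query = Object.keys( normalised )
		.map( ( key ) => encodeURIComponent( key ) + '=' + encodeURIComponent( normalised[ key ] ) )
		.join( '&' );
	return BEACON_URL + '?' + query;
}

// Fire-and-forget GET request; a no-op outside a browser.
function sendError( url ) {
	if ( typeof Image !== 'undefined' ) {
		new Image().src = url;
	}
}

// Catch uncaught errors; only registered when running in a browser.
if ( typeof window !== 'undefined' ) {
	window.addEventListener( 'error', ( event ) => {
		// event.error can be null (e.g. cross-origin scripts), so fall back to the message.
		const error = event.error || { message: event.message };
		sendError( encodeError( normaliseError( error ) ) );
	} );
}
```

Keeping normalisation and encoding as pure functions, separate from the transport, would make them straightforward to unit test.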
== 1 Prior art
=== 1.1 Within WMF
==== 1.1.1 Readers Web's MinervaClientError metric
Readers Web began counting the client-side errors encountered by users of the Minerva skin in {T205582}, following a fairly isolated discussion of the problem (see T167699). We now have some sense of how many client-side errors occur and how that number varies over time: https://grafana.wikimedia.org/d/000000566/overview?orgId=1&from=now-30d&to=now&panelId=15&fullscreen
=== 1.2 Outside WMF
==== 1.2.1 Sentry
TBD
== 2 Considerations
=== 2.1 Limitations in URL length
Requests made to a beacon endpoint are currently expected to use the HTTP GET method, with the data carried in the URL's query string. This limits the amount of data we can include in a request, e.g. [[ https://phabricator.wikimedia.org/diffusion/EEVL/browse/master/modules/ext.eventLogging/core.js$44 | EventLogging has a maximum URL size of 2000 ]].
Fortunately, unless the user is browsing the site with [[ https://www.mediawiki.org/wiki/ResourceLoader/Features#Debug_mode | ResourceLoader's debug mode ]] enabled (or ResourceLoader is made to support source maps), most of the information in a stack trace can be safely discarded. Consider the following stack trace:
```lines=10
maybeLog @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:4
get @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:4
(anonymous) @ load.php?debug=false&lang=en&modules=ext.centralNotice.choiceData|ext.navigationTiming%2CwikimediaEvents|ext.quicksurveys.lib|ext.relatedArticles.readMore.bootstrap|ext.visualEditor.targetLoader|jquery%2Coojs-router|jquery.client|mediawiki.Title%2CUri%2Capi%2CjqueryMsg%2Clanguage%2Cuser%2Cutil|mediawiki.ui.anchor|mobile.init%2Csite%2Cstartup|mobile.messageBox.styles|mobile.pagelist.styles|mobile.pagesummary.styles|mobile.startup.images|mobile.startup.images.variants|skins.minerva.icons.images.scripts.misc|skins.minerva.icons.images.variants|skins.minerva.icons.page.issues.default.color|skins.minerva.icons.page.issues.medium.color|skins.minerva.icons.page.issues.uncolored|skins.minerva.mainMenu.icons%2Cstyles|skins.minerva.notifications%2Coptions%2Cscripts%2Ctalk%2Ctoggling|skins.minerva.notifications.badge|skins.minerva.options.share.icon&skin=minerva&version=0zmdnc2:409
mw.loader.implement.css @ load.php?debug=false&lang=en&modules=ext.centralNotice.choiceData|ext.navigationTiming%2CwikimediaEvents|ext.quicksurveys.lib|ext.relatedArticles.readMore.bootstrap|ext.visualEditor.targetLoader|jquery%2Coojs-router|jquery.client|mediawiki.Title%2CUri%2Capi%2CjqueryMsg%2Clanguage%2Cuser%2Cutil|mediawiki.ui.anchor|mobile.init%2Csite%2Cstartup|mobile.messageBox.styles|mobile.pagelist.styles|mobile.pagesummary.styles|mobile.startup.images|mobile.startup.images.variants|skins.minerva.icons.images.scripts.misc|skins.minerva.icons.images.variants|skins.minerva.icons.page.issues.default.color|skins.minerva.icons.page.issues.medium.color|skins.minerva.icons.page.issues.uncolored|skins.minerva.mainMenu.icons%2Cstyles|skins.minerva.notifications%2Coptions%2Cscripts%2Ctalk%2Ctoggling|skins.minerva.notifications.badge|skins.minerva.options.share.icon&skin=minerva&version=0zmdnc2:413
runScript @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:13
(anonymous) @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
flushCssBuffer @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
requestAnimationFrame (async)
addEmbeddedCSS @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
execute @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:15
doPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:7
requestIdleCallback (async)
requestPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
setAndPropagate @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
markModuleReady @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:13
runScript @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
(anonymous) @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
flushCssBuffer @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
requestAnimationFrame (async)
addEmbeddedCSS @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
execute @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:15
doPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:7
requestIdleCallback (async)
requestPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
setAndPropagate @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
markModuleReady @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:13
runScript @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
(anonymous) @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
flushCssBuffer @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
requestAnimationFrame (async)
addEmbeddedCSS @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
execute @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:15
doPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:7
requestIdleCallback (async)
requestPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
setAndPropagate @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
markModuleReady @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:13
runScript @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
(anonymous) @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:14
flushCssBuffer @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
requestAnimationFrame (async)
addEmbeddedCSS @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:6
execute @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:15
doPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:7
requestIdleCallback (async)
requestPropagation @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
setAndPropagate @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:8
implement @ load.php?debug=false&lang=en&modules=startup&only=scripts&skin=minerva&target=mobile:21
(anonymous) @ load.php?debug=false&lang=en&modules=ext.centralNotice.choiceData|ext.navigationTiming%2CwikimediaEvents|ext.quicksurveys.lib|ext.relatedArticles.readMore.bootstrap|ext.visualEditor.targetLoader|jquery%2Coojs-router|jquery.client|mediawiki.Title%2CUri%2Capi%2CjqueryMsg%2Clanguage%2Cuser%2Cutil|mediawiki.ui.anchor|mobile.init%2Csite%2Cstartup|mobile.messageBox.styles|mobile.pagelist.styles|mobile.pagesummary.styles|mobile.startup.images|mobile.startup.images.variants|skins.minerva.icons.images.scripts.misc|skins.minerva.icons.images.variants|skins.minerva.icons.page.issues.default.color|skins.minerva.icons.page.issues.medium.color|skins.minerva.icons.page.issues.uncolored|skins.minerva.mainMenu.icons%2Cstyles|skins.minerva.notifications%2Coptions%2Cscripts%2Ctalk%2Ctoggling|skins.minerva.notifications.badge|skins.minerva.options.share.icon&skin=minerva&version=0zmdnc2:1
```
The above could be reduced to the following:
```lang=js, lines=10
// const st = ...
// Strip everything from " @" to the end of each line, keeping the line breaks.
const st2 = st.replace( / @.+/g, '' );
console.log( st.length, st2.length ); // 6841 788
// For completeness:
console.log( st2 );
// =>
// maybeLog
// get
// (anonymous)
// mw.loader.implement.css
// runScript
// (anonymous)
// flushCssBuffer
// requestAnimationFrame (async)
// addEmbeddedCSS
// execute
// doPropagation
// requestIdleCallback (async)
// requestPropagation
// setAndPropagate
// markModuleReady
// runScript
// (anonymous)
// flushCssBuffer
// requestAnimationFrame (async)
// addEmbeddedCSS
// execute
// doPropagation
// requestIdleCallback (async)
// requestPropagation
// setAndPropagate
// markModuleReady
// runScript
// (anonymous)
// flushCssBuffer
// requestAnimationFrame (async)
// addEmbeddedCSS
// execute
// doPropagation
// requestIdleCallback (async)
// requestPropagation
// setAndPropagate
// markModuleReady
// runScript
// (anonymous)
// flushCssBuffer
// requestAnimationFrame (async)
// addEmbeddedCSS
// execute
// doPropagation
// requestIdleCallback (async)
// requestPropagation
// setAndPropagate
// implement
// (anonymous)
```
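Combining the two points in this section, the following sketch strips the locations from a stack trace and then drops the outermost frames until the resulting URL fits within a length limit. The 2000-character limit mirrors the EventLogging figure linked above and is only an assumption for this endpoint; `buildErrorUrl` and the `message`/`stack` parameter names are hypothetical.

```lang=js
// Assumed limit, borrowed from EventLogging; the real limit for the endpoint may differ.
const MAX_URL_LENGTH = 2000;

// Remove the " @ <location>" suffix from every frame, keeping line breaks.
function stripLocations( stack ) {
	return stack.replace( / @.+/g, '' );
}

// Build a beacon URL, discarding the outermost frames until it fits.
// Returns null if the URL is too long even with no frames at all.
function buildErrorUrl( baseUrl, message, stack ) {
	const frames = stripLocations( stack ).split( '\n' ).filter( Boolean );
	for ( ; ; ) {
		const url = baseUrl +
			'?message=' + encodeURIComponent( message ) +
			'&stack=' + encodeURIComponent( frames.join( '\n' ) );
		if ( url.length <= MAX_URL_LENGTH ) {
			return url;
		}
		if ( frames.length === 0 ) {
			return null;
		}
		// The last line is the outermost call, i.e. the furthest from the error.
		frames.pop();
	}
}
```

Dropping frames from the end keeps the calls closest to where the error was thrown, which are usually the most useful for triage.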
=== 2.2 Burstiness
If a syntax error were introduced in a JavaScript asset that's delivered to all clients, then we could see upwards of 5500 errors reported per second. The simplest way of dealing with this issue is to sample, i.e. to enable client-side error reporting for only 1% of all pageviews.
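A minimal sketch of such per-pageview sampling follows. It assumes the decision is made once per pageview with a uniform random number; the names `SAMPLE_RATE` and `isInSample` are hypothetical, and the proposal does not prescribe how the sample is chosen.

```lang=js
// 1% of pageviews, as suggested above.
const SAMPLE_RATE = 0.01;

// Returns true if this pageview should report errors.
// `random` is injectable for testing and defaults to Math.random().
function isInSample( rate, random ) {
	const value = random === undefined ? Math.random() : random;
	return value < rate;
}

// Decide once per pageview, then reuse the result for every error on the page.
const reportingEnabled = isInSample( SAMPLE_RATE );
```

Deciding once per pageview, rather than once per error, means a sampled page reports all of its errors, so the errors from a single pageview can still be looked at together.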
We might also consider creating a service that acts as:
* A [[ https://en.wikipedia.org/wiki/Leaky_bucket | leaky bucket ]] if we don't want to permit bursts, or a [[ https://en.wikipedia.org/wiki/Token_bucket | token bucket ]] if we do (while still enforcing a fixed average rate);
* A classful version of the above, if we want to "roll up" errors based on one or more normalized properties.
These services would allow us to maximise the number of clients that can report errors, thereby increasing the likelihood of capturing relatively low-rate errors. However, the introduction of any such service would require a long-term maintenance commitment from at least SRE.
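To make the two options concrete, here is a small in-process sketch of a token bucket and of a classful variant that keeps one bucket per error class. It illustrates the algorithm only and is not a design for the service itself; the class names, the parameters, and the use of a single string key to identify an error class are all assumptions.

```lang=js
// Token bucket: permits bursts of up to `capacity` reports, refilled at a fixed rate.
class TokenBucket {
	constructor( capacity, refillPerSecond, now ) {
		this.capacity = capacity;
		this.refillPerSecond = refillPerSecond;
		this.tokens = capacity;
		this.lastRefill = now;
	}

	// Returns true if a report is allowed at time `now` (in milliseconds).
	tryRemove( now ) {
		const elapsedSeconds = Math.max( 0, ( now - this.lastRefill ) / 1000 );
		this.tokens = Math.min( this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond );
		this.lastRefill = now;
		if ( this.tokens >= 1 ) {
			this.tokens -= 1;
			return true;
		}
		return false;
	}
}

// Classful variant: one bucket per normalised property value (e.g. the error name),
// so that one noisy class of error cannot starve the others.
class ClassfulLimiter {
	constructor( capacity, refillPerSecond ) {
		this.capacity = capacity;
		this.refillPerSecond = refillPerSecond;
		this.buckets = new Map();
	}

	allow( errorClass, now ) {
		if ( !this.buckets.has( errorClass ) ) {
			this.buckets.set( errorClass, new TokenBucket( this.capacity, this.refillPerSecond, now ) );
		}
		return this.buckets.get( errorClass ).tryRemove( now );
	}
}
```

A real service would also need to bound the number of classes it tracks and to expire idle buckets, both of which are omitted here.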
=== 2.3 Pre-existing tools for exploration
In 1.1.1, it was noted that Readers Web are already counting the number of client-side errors for users of the Minerva skin. We do so using [[ https://wikitech.wikimedia.org/wiki/Graphite#statsv | statsv ]]. Because [[ https://wikitech.wikimedia.org/wiki/Graphite#statsv | statsv ]] makes requests to a beacon endpoint, and because the Analytics team provides tools like [[ https://turnilo.wikimedia.org/ | Turnilo ]] that allow us to explore request data at a high level, we can already find trends in client-side errors, e.g. [[ https://turnilo.wikimedia.org/#webrequest_sampled_128/3/N4IgbglgzgrghgGwgLzgFwgewHYgFwhLYCmAtAMYAWcATmiADQgYC2xyOx+IAomuQHoAqgBUAwoxAAzCAjTEaUfAG1QaAJ4AHLgVZcmNYlO4B9E3sl6ACgqwATJXlUg7MGuiy4CVgOwARSy0dQnRiKHomcOJNfFIARgBfAF0EhjUg7nCaCGwAc0lDYwI3CBNNdEpJOHIMHG4cyTBEGDCVEAEAI2JqnAFw9CgwECSmbEx6PClEKGJU9O1MtGy8gqNuEpMARxaadSqaz25yHDQ4HKUmJoQWx2UQAHViDrEkYmw0HhoaTBph0fH8FMEDNkpFNEg0Ld5sEsjl8kw7BA2NgoIcCMd3jk3hEQFAfhNQIVuJQIJDJIjDAc6gQ7GFyG9EStUoQkaT8NgYAgEHNmBldEj9C4BSi0SAzBYmLl3ByELRSXtvCI4gAJSR4uj4QlrAjigXkiCU2peEBwKD07CM/LMpAsNl4GXckYgNimtytPCgaAAWU5GEB02IkThCGCJLJTBYvogbTDShSTE0ORIdj8wtROFuTsT2GTAGV8cTSRdCMRcgzNULkenjRiMCR3pIbXaAKzMkm5ShIDsTB0JIA== | a graph of client-side errors for users using the Minerva skin by continent ]]. It's worth noting that this facility would remain available if this proposal were implemented.