mw.track() is a frontend JS feature that collects metrics produced in browsers and forwards them to Graphite.
This task is complete when there is a migration strategy in place to migrate metrics produced by mw.track() into Prometheus.
mw.track() is a low-level JavaScript utility for connecting any two application components to each other via a topic string (loose coupling, no runtime dependency, no load order concern). The data is kept in an in-memory list for the duration of a page view, and can be consumed at any time by zero or more handlers via mw.trackSubscribe(topic, function).
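A minimal sketch of that pattern (not the actual MediaWiki implementation, just an illustration of the in-memory queue plus topic-prefix subscribers) might look like:

```javascript
// Sketch only: an in-memory list of tracked items, plus handlers keyed by
// topic prefix. Producer and consumer never reference each other directly,
// and load order doesn't matter because the queue is replayed on subscribe.
const queue = [];
const handlers = [];

function track( topic, data ) {
	queue.push( { topic, data } );
	handlers.forEach( ( h ) => {
		if ( topic.indexOf( h.prefix ) === 0 ) {
			h.fn( topic, data );
		}
	} );
}

function trackSubscribe( prefix, fn ) {
	handlers.push( { prefix, fn } );
	// Replay anything already queued, so late subscribers miss nothing.
	queue.forEach( ( item ) => {
		if ( item.topic.indexOf( prefix ) === 0 ) {
			fn( item.topic, item.data );
		}
	} );
}

// Producer fires before any consumer exists:
track( 'counter.MediaWiki.example.foo', 1 );

// Consumer attaches later and still sees the earlier item:
const seen = [];
trackSubscribe( 'counter.', ( topic, value ) => {
	seen.push( [ topic, value ] );
} );
```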
It's about 20 lines of code, contains no "business logic", and is not where migration facilities would reside (it also emits no stats, metrics, or events by itself).
mw.track is used, for example, to propagate errors from widgets in the user interface, and to expose event objects (some of which EventLogging then consumes and sends to EventGate/Kafka), and for various abstract APIs within the JavaScript runtime, some of which may eventually produce one or more statistics.
One user of mw.track() is the WikimediaEvents extension (specifically statsd.js), which comprises about 30 lines of code that define the counter.* and timing.* topics via mw.trackSubscribe(); the handler turns these into an HTTP request to a /beacon/statsv endpoint. You can think of this as the JavaScript analog to the PHP MediaWiki-libs-Stats, except that statsd.js is much, much simpler. It is an optional extension (WikimediaEvents) that directly dispatches an HTTP request.
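A sketch of what such a handler does (illustrative only; the real code lives in WikimediaEvents/statsd.js and will differ in detail): collect counter.*/timing.* values and combine them into a single query-string beacon.

```javascript
// Hypothetical sketch of a statsd.js-style handler: strip the topic kind,
// keep the dotted metric name, and suffix the value with a statsd-style
// unit ('c' for counters, 'ms' for timings).
const pending = {};

function handleStat( topic, value ) {
	const [ kind, ...rest ] = topic.split( '.' );
	const name = rest.join( '.' );
	const unit = kind === 'counter' ? 'c' : 'ms';
	pending[ name ] = value + unit;
}

function buildBeaconUrl() {
	// In the browser this URL would be sent fire-and-forget, e.g. via
	// navigator.sendBeacon() or an image request.
	const query = Object.keys( pending )
		.map( ( name ) => name + '=' + pending[ name ] )
		.join( '&' );
	return '/beacon/statsv?' + query;
}

handleStat( 'counter.MediaWiki.example_this.foo.bar', 1 );
handleStat( 'timing.MediaWiki.example_that.foo', 42 );
// buildBeaconUrl() →
// '/beacon/statsv?MediaWiki.example_this.foo.bar=1c&MediaWiki.example_that.foo=42ms'
```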
https://wikitech.wikimedia.org/wiki/Graphite#statsv
This endpoint in turn is received by varnishkafka, and eventually consumed by statsv.py, which runs on the Webperf host (now operated by SRE Observability). It consumes from Kafka and forwards to Statsd/Graphite.
In 2017 we started discussing at T180105 how this might work in a secure way with Prometheus, such that we don't allow the world to create, muddy, or pollute data in unrelated metrics. I imagine this could take the form of a dedicated Prometheus instance (prometheus/ext, this part is already completed by SRE) and something else to avoid abuse causing visible conflicts in Grafana/Thanos. Perhaps something as simple as an enforced prefix like mwjs_.
Similar to the migration of server-side stats from MediaWiki PHP, I believe the JS side would be handled per-component as well. This task is specifically for ensuring we have an alternative in the platform - not the migration of individual stats themselves.
Thank you for the detailed write up on this @Krinkle ! See below for my take:
Of course the proper solution is to have statsd.js / mw.track support for Prometheus metrics as you mentioned. I'll try and answer some of the open questions from my POV (some braindump ahead)
HTH
Using the path as a signal to statsv.py which decoder to use feels reasonable. Effectively: /beacon/<format> which statsv.py (or something) reads and decides which decoder to use and/or what to do next.
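Sketched in JavaScript for illustration (the real dispatch would happen server-side in statsv.py; the decoder names and the regex-based format are assumptions, not existing code):

```javascript
// Hypothetical /beacon/<format> dispatch: pick a decoder by path segment.
// Decoder for the current statsv format, e.g. 'Name=1c&Other=42ms'.
function decodeStatsv( qs ) {
	return qs.split( '&' ).map( ( pair ) => {
		const [ name, raw ] = pair.split( '=' );
		const m = /^(\d+)(c|ms)$/.exec( raw );
		return { name, value: Number( m[ 1 ] ), unit: m[ 2 ] };
	} );
}

const decoders = {
	statsv: decodeStatsv
	// future: a decoder for a labelled, Prometheus-oriented format
};

function dispatch( path, queryString ) {
	const format = path.replace( /^\/beacon\//, '' );
	const decode = decoders[ format ];
	if ( !decode ) {
		throw new Error( 'Unknown beacon format: ' + format );
	}
	return decode( queryString );
}
```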
- Export. Do we want statsv.py to be its own prometheus producer, like navtiming.py, or do we want it to merely pass things on (after prefix enforcement) to a statsd-exporter? Or something different?
We'd have more control and a simpler pipeline if statsv.py did the decoding and Prometheus exporting itself; however, it saves us a fair bit of work if we "outsource" the Prometheus client and state management to a separate exporter. I see pros and cons to both solutions.
I did a bit of looking into OpenMetrics. AFAICT, it's a wire format that would be produced by a /metrics endpoint and decoded by a Prometheus server. AFAICT, it's not intended to be emitted as events.
Have I missed something?
For awareness, see also https://phabricator.wikimedia.org/T359178#9640223 re: statsv in the context of varnishkafka deprecation/removal.
Indeed. We need two things:
This JS signature could be something like this:
// Old
mw.track( 'counter.MediaWiki.example_this.foo.bar', 1 );
mw.track( 'timing.MediaWiki.example_that.foo', 42 );

// New?
mw.track( 'stats_counter.mediawiki_example_this', [ 1, { x: 'foo', y: 'bar' } ] );
mw.track( 'stats_timing.mediawiki_example_that', [ 42, { x: 'foo' } ] );
The idea being:
Optionally, one could add an abstraction layer on top. However, its code would have to be very tiny so as to be shippable in the critical path. The reason we use mw.track() in the first place is so that producing stats is dependency-free. The code to wire it up and send it to /beacon can load asynchronously with low priority. That's where we can (within reason) do a lot more. That code is still loaded on every pageview as well, but it's loaded later and strongly cached behind its own versioned URL, instead of baked into the base payload.
mw.stats.counter( name, increment, labels = {} );
mw.stats.counter( 'mediawiki_example_this', 1, { x: 'foo', y: 'bar' } );
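Such a wrapper could be wired to mw.track() roughly like this (hypothetical sketch, assuming the stats_counter.*/stats_timing.* topics proposed above; mw.track is stubbed here to record calls instead of queueing):

```javascript
// Sketch of a tiny mw.stats layer dispatching to mw.track(). The real
// mw.track() queues and notifies subscribers; this stub just records.
const mw = {
	trackCalls: [],
	track( topic, data ) {
		mw.trackCalls.push( [ topic, data ] );
	},
	stats: {
		counter( name, increment, labels = {} ) {
			mw.track( 'stats_counter.' + name, [ increment, labels ] );
		},
		timing( name, value, labels = {} ) {
			mw.track( 'stats_timing.' + name, [ value, labels ] );
		}
	}
};

mw.stats.counter( 'mediawiki_example_this', 1, { x: 'foo', y: 'bar' } );
```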
The HTTP query string format will need to be something fairly simple that we can trivially create in JavaScript (WikimediaEvents/statsd.js, where we consume the subset of mw.track topics that relate to stats), then encode, fit within, and transmit over an HTTP beacon's query string (so ideally fairly short and without chars that require verbose encoding). It then needs to be easily decoded in statsv.py (potentially discarding invalid stuff) and turned into calls to the prometheus_client object buffer in the Python process, which then offers it up for scraping in the real Prometheus/OpenMetrics format.
The OpenMetrics line format might be suitable for the transport as well, but it would indeed be arbitrary. It wouldn't actually be passed to Prometheus by statsv.py as-is. We'd only pretend to :)
We currently use this format:
/beacon/statsv?MediaWiki.example_this.foo.bar=1c&MediaWiki.example_that.foo…
/beacon/smth?mediawiki_example_this{x="foo",y="bar"} 1\nmediawiki_example_that

# actual fetch(), confirm via browser DevTools/Network/Request/Headers
# /beacon/smth?mediawiki_example_this{x=%22foo%22,y=%22bar%22}%201%0Amediawiki_example_that
Both double quotes and spaces are illegal in URLs and thus forcefully encoded, even if you set a raw query string. Parsing quoted values would also require a less trivial parser on the other end. Using this protocol might give the wrong impression, as we'd presumably only support a very narrow portion of that spec. Note that line breaks are legal in query strings; they can be encoded as encodeURIComponent('\n') == "%0A".
/beacon/smth?mediawiki_example_this=1;x=foo;y=bar&mediawiki_example_that
This would be fairly trivial to parse (split by ampersand, then by semicolon and equals sign). The user-generated values could be encoded via encodeURIComponent, which naturally percent-encodes any actual = equals sign or ; semicolon, if there were one.
Feel free to use as starting point :)