Statsd implementation of suggested edits task pool
Closed, ResolvedPublic

Description

Another option would be pushing the counts into statsd. AIUI pushing a huge number of metrics to statsd is not nice, but we could 1) use it as a temporary solution while there are only 4 target wikis, 2) instead of a separate metric for each of the 64 ORES topics, have some sort of buckets (topics with 500+ tasks, topics with 400-499 tasks etc). Instead of an API we'd have to use a cronjob, or maybe push the numbers whenever someone actually searches for them (although that is more fragile). Graphing/dashboarding options would be more constrained but less work (via Grafana), and we'd get alerts for free.
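A minimal sketch of the bucketing idea from option 2, assuming fixed-width buckets of 100 tasks below a 500+ top bucket. The description only names the "500+" and "400-499" buckets; the bucket width, the label format and the function names here are illustrative assumptions, not what was implemented.

```python
from collections import Counter
from typing import Dict

# Assumed bucket layout: 0-99, 100-199, ..., 400-499, 500+.
BUCKET_SIZE = 100
TOP_BUCKET_FLOOR = 500


def bucket_label(task_count: int) -> str:
    """Return the bucket name for a single topic's task count."""
    if task_count >= TOP_BUCKET_FLOOR:
        return f"{TOP_BUCKET_FLOOR}_plus"
    low = (task_count // BUCKET_SIZE) * BUCKET_SIZE
    return f"{low}_{low + BUCKET_SIZE - 1}"


def bucket_topics(counts_by_topic: Dict[str, int]) -> Dict[str, int]:
    """Count how many topics fall into each bucket."""
    return dict(Counter(bucket_label(c) for c in counts_by_topic.values()))
```

With this layout each wiki reports at most six metrics (one per bucket) instead of 64, at the cost of no longer seeing which topic is running low.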

Prometheus supports multi-dimensional data points so maybe that could be a longer term option? I don't know anything about Prometheus though so that's just a blind guess.
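To illustrate what "multi-dimensional" could mean here, the sketch below formats one sample in the Prometheus text exposition style, with wiki and topic as labels rather than as parts of the metric name. The metric name and label names are hypothetical, and label-value escaping is deliberately left out.

```python
from typing import Dict


def prometheus_sample(name: str, labels: Dict[str, str], value: int) -> str:
    """Format one sample as 'name{label="value",...} value'.

    Labels are sorted for stable output. Escaping of quotes or
    backslashes inside label values is not handled in this sketch.
    """
    label_str = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"
```

A single metric name then covers every wiki and topic, and queries can filter or aggregate on either label.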

  • Check with Ops about feasibility of this approach
  • implement server-side code
  • set up dashboard

Event Timeline

17:33 < tgr_> is there a rule of thumb on how many statsd metrics are too many?
17:36 < tgr_> background: I want to monitor an extension that maintains task pools for the 64 ORES topics. Is 64 new metrics per wiki excessive? (For now it's 4 wikis, I imagine we'll need to find something better when we deploy to lots of wikis, but that's another quarter or two.)
17:37 < tgr_> AIUI Prometheus would be ideal for this kind of thing as data points could be tagged by wiki and topic, but there's no Prometheus integration in MediaWiki, right?
17:48 < shdubsh> tgr_: the rule of thumb AIUI is knowing the cardinality of the metrics.  Unlimited cardinality (like username, ip, or request id in the metric name) causes problems.
17:49 < shdubsh> You are correct, a MediaWiki Prometheus solution is a WIP.
17:50 < tgr_> does a fixed number per wiki count as limited?
17:50 < tgr_> (ie, we'd need to split by wiki)
17:52 < shdubsh> Yes, I think that counts.  We split by wiki many places.
17:52 < shdubsh> An upper bound of 64 per wiki seems reasonable to me.
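The bounded-cardinality scheme discussed above (one gauge per wiki and topic) could look roughly like the sketch below. It uses the plain statsd line format `<name>:<value>|g` sent over UDP. The metric prefix, host and function names are assumptions for illustration; the actual names used by listTaskCounts.php are not given in this thread.

```python
import socket
from typing import Dict, List

# Hypothetical prefix, not taken from the real patch.
METRIC_PREFIX = "growthexperiments.taskcounts"


def gauge_lines(wiki: str, counts_by_topic: Dict[str, int]) -> List[str]:
    """Build one statsd gauge line per topic, sorted by topic name."""
    return [
        f"{METRIC_PREFIX}.{wiki}.{topic}:{count}|g"
        for topic, count in sorted(counts_by_topic.items())
    ]


def send(lines: List[str], host: str = "localhost", port: int = 8125) -> None:
    """Send each line as its own UDP datagram to a statsd daemon."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for line in lines:
            sock.sendto(line.encode("utf-8"), (host, port))
```

The number of distinct metric names is the number of wikis times the number of topics, so it stays fixed as long as nothing unbounded (user name, IP, request ID) ends up in the name. Topic identifiers that themselves contain dots would add extra levels to the statsd hierarchy and may need sanitizing.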

Note, we want this to happen before we start populating the task pools for the remaining target wikis, so we can monitor how long it takes.

Change 674992 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Add listTaskCounts.php option to report counts to statsd

https://gerrit.wikimedia.org/r/674992

Patch is merged, next we need to add a visualization in Grafana. We could add some panels to https://grafana.wikimedia.org/d/vGq7hbnMz/special-homepage-and-suggested-edits or make a new dashboard, I don't feel too strongly about it either way.

Change 674992 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Add listTaskCounts.php option to report counts to statsd or as JSON

https://gerrit.wikimedia.org/r/674992

Change 675528 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):
[mediawiki/extensions/GrowthExperiments@master] Skip statsd in listTaskCounts.php when there's no link recommendation data

https://gerrit.wikimedia.org/r/675528

Change 675528 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Skip statsd in listTaskCounts.php when there's no link recommendation data

https://gerrit.wikimedia.org/r/675528

Change 675544 had a related patch set uploaded (by Gergő Tisza; author: Gergő Tisza):
[operations/puppet@production] Run GrowthExperiments listTaskCounts.php script every hour

https://gerrit.wikimedia.org/r/675544

Change 675544 merged by Alexandros Kosiaris:
[operations/puppet@production] Run GrowthExperiments listTaskCounts.php script every hour

https://gerrit.wikimedia.org/r/675544

Doesn't seem to work on beta, but then most things don't. Should start running on testwiki in an hour or two.

Dashboard: https://grafana.wikimedia.org/d/vGq7hbnMz/special-homepage-and-suggested-edits?orgId=1 (at the bottom).

Given the large number of metrics (one task count for each of the 64 ORES topics on every wiki), I'm not sure which visualisation option would make sense; currently it shows the total number of tasks on each wiki plus the number of topics per wiki with fewer than 250 tasks.
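The two aggregates shown on the dashboard can be sketched as follows, assuming the per-topic counts are available as a nested mapping of wiki to topic to count. The function and key names are hypothetical. Note that summing per-topic counts only equals the total number of tasks if no task is counted under more than one topic, which this thread does not confirm.

```python
from typing import Dict

LOW_TASK_THRESHOLD = 250  # threshold mentioned for the dashboard panel


def summarize(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, int]]:
    """Per wiki: sum of per-topic counts and number of topics below the threshold."""
    summary = {}
    for wiki, by_topic in counts.items():
        summary[wiki] = {
            "sum_of_topic_counts": sum(by_topic.values()),
            "low_topics": sum(1 for c in by_topic.values() if c < LOW_TASK_THRESHOLD),
        }
    return summary
```

This reduces 64 series per wiki to two, which is easier to read on a single panel.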

Nice! That looks good for now. @MMiller_WMF @Trizek-WMF is there a logical place on mediawiki.org/wiki/Growth or subpage of that where we could place a link to this dashboard?

I think this deserves a specific page about monitoring. :)

You mean a user-facing one? I think that would be the special page and Toolforge tool from T249987 (Scale: GrowthExperiments wiki monitoring dashboard); this one is mostly for developers.

The request from Kosta is to know where to put it on our set of pages on mediawikiwiki. So I think we should have a page like Growth/Monitoring that would present the Grafana graphs and the monitoring dashboard. They are both tools communities can use.

FWIW the Grafana board is linked from https://wikitech.wikimedia.org/wiki/Add_Link which is the more common wiki to link Grafana boards from. No reason not to add the information elsewhere, of course.