Scale: GrowthExperiments wiki monitoring dashboard
Open, Low, Public

Description

As we scale to more wikis, it may be helpful for us to monitor several health indicators for each wiki. Because Growth features have a lot of project-specific dependencies, like mentors and templates, we want to notice when a specific project is having a problem.

This could take the form of a dashboard that could be publicly facing. It could be a wiki page or a spreadsheet or something else.
Here are some of the things we would want to know for each wiki:

  • Number of mentors signed up
  • Number of mentors who are answering questions at a lower rate or slower pace than some threshold (@Urbanecm has a script that parses talk pages to count responses. Spreadsheet output here).
  • Number of tasks available for each task type, for each topic
  • Number of suggested edits completed per newcomer
  • Revert rate of suggested edits
  • Number of mentor questions per newcomer
  • Which features are enabled or disabled
  • Number of errors, or whether parts of the feature are broken
  • The page could expose the current configuration, so we can visually check if it seems wrong

Event Timeline

Two (not necessarily exclusive) ways come to mind: a per-wiki special page, and a Toolforge tool showing data for all wikis in a single view. It would be nice to have an overview of all wikis, so maybe Toolforge is the better option.

I don't think there's a nice way to find out which wikis have suggested edits enabled, but we could fetch the growthexperiments dblist from some git mirror, and see if the API endpoint exists for each.
After that, it's just 6 API queries per site, or a few hundred if we want to get the full (task type × topic) matrix. That's a lot, so some form of caching would be needed.
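To make the query-count and caching point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the dblist format (one wiki ID per line, `#` starting a comment), the reading of "6 queries" as 6 task types, and the `TTLCache` helper itself, which is hypothetical and not part of GrowthExperiments. The 64-topic figure comes from the ORES topic count mentioned elsewhere in this thread.

```python
import time
from typing import Callable, Dict, List, Tuple


def parse_dblist(text: str) -> List[str]:
    """Return wiki IDs from dblist-style text (assumed: one per line, '#' comments)."""
    wikis = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            wikis.append(line)
    return wikis


def matrix_query_count(task_types: int, topics: int) -> int:
    """Queries needed per wiki for the full (task type x topic) matrix."""
    return task_types * topics


class TTLCache:
    """Tiny in-memory cache (hypothetical helper) so repeat views skip the queries."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.time):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store: Dict[Tuple, Tuple[float, object]] = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self._store.get(key)
        # Reuse the stored value only while it is younger than the TTL.
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = compute()
        self._store[key] = (now, value)
        return value
```

With 6 task types and 64 topics, `matrix_query_count(6, 64)` gives 384 queries per wiki, which matches the "few hundred" estimate and is why each (wiki, task type, topic) result would be worth caching.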

MMiller_WMF renamed this task from GrowthExperiments task count dashboard to Scaling: GrowthExperiments wiki monitoring dashboard. Apr 21 2020, 3:57 PM
MMiller_WMF updated the task description.
MMiller_WMF added subscribers: Catrope, kostajh, marcella and 5 others.
MMiller_WMF renamed this task from Scaling: GrowthExperiments wiki monitoring dashboard to Scale: GrowthExperiments wiki monitoring dashboard. Apr 21 2020, 3:57 PM
MMiller_WMF updated the task description.
MMiller_WMF added a subscriber: Urbanecm.

Potentially we could build on some initial exploration done for a similar project, the Community Health Metrics Kit. For that project, @alexhollender also created a prototype showing how the information could be imported into a dashboard in a Google spreadsheet, as one option.

[Image: example dashboard sheet]
[Image: example of comparing projects]
[Image: example of comparing metrics]

Idea from retro today (@kostajh @Catrope): perhaps the first version of a dashboard could allow users to input templates and see how many tasks will be available per topic with those templates. Then they could export that as a start to their on-wiki configuration for newcomer tasks. Naturally, we would want this for wikis that do not have our features enabled yet.

I'd like to propose a barebones version of this so that we can get T276795: Monitoring for GrowthExperiments link recommendation task pool done sooner rather than later:

  • create a new PHP project on toolforge, growthmonitoring or something like that
  • the project has a config file where we can list the IDs of the wikis we're interested in monitoring
  • add an API endpoint (with 24 hour caching) to GrowthExperiments that
  • the growthmonitoring project has a script that runs each day, calls the API endpoint for each wiki we are monitoring, and writes the totals into a per-wiki JSON file with a new row for each day
  • we implement some basic chart / graph displays to show current status and trends with HTML + JS + CSS
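The daily collection step in the list above could look roughly like this. It is sketched in Python rather than PHP purely for brevity, and it is only an illustration: `record_daily_totals`, the `fetch_totals` callback (standing in for the real API call) and the row layout (`date` plus `totals`) are all hypothetical names, not anything defined in this task.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable


def record_daily_totals(
    wiki_ids: Iterable[str],
    fetch_totals: Callable[[str], Dict[str, int]],
    data_dir: Path,
    day: str,
) -> None:
    """Append one row per wiki for `day` (YYYY-MM-DD) to <data_dir>/<wiki>.json."""
    data_dir.mkdir(parents=True, exist_ok=True)
    for wiki in wiki_ids:
        path = data_dir / f"{wiki}.json"
        rows = json.loads(path.read_text()) if path.exists() else []
        # Drop any existing row for the same day so a rerun does not duplicate it.
        rows = [row for row in rows if row["date"] != day]
        rows.append({"date": day, "totals": fetch_totals(wiki)})
        rows.sort(key=lambda row: row["date"])
        path.write_text(json.dumps(rows, indent=2))
```

Passing the fetcher in as a callback keeps the file handling testable without network access. The chart page would then only need to read one static JSON file per wiki.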

EDIT: this is pretty similar to T249987#6048491, sorry I just scrolled back and saw that.

Change 674579 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Monitoring: Add endpoint for metadata about suggested edit task pool

https://gerrit.wikimedia.org/r/674579

Another option would be pushing the counts into statsd. AIUI pushing a huge number of metrics to statsd is not nice, but we could 1) use it as a temporary solution while there are only 4 target wikis, 2) instead of a separate metric for each of the 64 ORES topics, have some sort of buckets (topics with 500+ tasks, topics with 400-499 tasks etc). Instead of an API we'd have to use a cronjob, or maybe push the numbers whenever someone actually searches for them (although that is more fragile). Graphing/dashboarding options would be more constrained but less work (via Grafana), and we'd get alerts for free.

Prometheus supports multi-dimensional data points so maybe that could be a longer term option? I don't know anything about Prometheus though so that's just a blind guess.
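For the bucketing idea above, a small Python sketch of how per-topic task counts could be collapsed into a handful of bucket metrics instead of 64 per-topic ones. The bucket width of 100, the `500plus` label and the metric name prefix are assumptions made up for this example. Actually sending the numbers to statsd is left out.

```python
from collections import Counter
from typing import Dict


def bucket_label(task_count: int, width: int = 100, top: int = 500) -> str:
    """Name of the bucket a topic falls into, e.g. '400-499' or '500plus'."""
    if task_count >= top:
        return f"{top}plus"
    low = (task_count // width) * width
    return f"{low}-{low + width - 1}"


def bucketed_metrics(wiki: str, topic_counts: Dict[str, int]) -> Dict[str, int]:
    """Map (hypothetical) metric names to the number of topics in each bucket."""
    buckets = Counter(bucket_label(count) for count in topic_counts.values())
    return {
        f"growthexperiments.taskpool.{wiki}.topics.{label}": n
        for label, n in buckets.items()
    }
```

With these defaults there are at most six metrics per wiki (five 100-wide buckets plus `500plus`), at the cost of no longer knowing which topic is running low.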

We discussed this yesterday. @Tgr will pursue the statsd approach, and in parallel we'll also seek to expose some stats via an API endpoint that could be consumed by an HTML/JS app that produces some charts. I'll make subtasks.

"Number of tasks available for each task type, for each topic"

I'm making a subtask for this one.

Change 674579 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Monitoring: Add endpoint for metadata about suggested edit task pool

https://gerrit.wikimedia.org/r/674579

"This could take the form of a dashboard that could be publicly facing. It could be a wiki page or a spreadsheet or something"

I've started a static site that would live on Toolforge, using Vue + Chart.js to consume the output of the /growthexperiments/v0/suggestions/info API endpoint. We could add other endpoints to expose the other data mentioned in this task.
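As a rough illustration of the consuming side, here is a Python sketch (the real site would do the equivalent in JS). The endpoint's response format is not described in this thread, so the example assumes the data has already been reduced to a flat `{name: count}` mapping. The `/w/rest.php` prefix is also an assumption about how the wiki exposes its REST API, and both helper functions are hypothetical.

```python
from typing import Dict, List, Tuple

INFO_PATH = "/growthexperiments/v0/suggestions/info"


def info_url(domain: str, rest_root: str = "/w/rest.php") -> str:
    """Build the endpoint URL; the '/w/rest.php' prefix is an assumption."""
    return f"https://{domain}{rest_root}{INFO_PATH}"


def chart_series(counts: Dict[str, int]) -> Tuple[List[str], List[int]]:
    """Turn {name: count} into parallel label/value lists, largest count first."""
    items = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in items], [value for _, value in items]
```

Parallel label and value lists are the shape most charting libraries take for a bar chart, so the sorting and flattening can happen once, before the data reaches the chart component.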