
Deeper metrics for categories
Open, Needs Triage, Public

Description

In addition to getting baselines for categories as part of T383138:

  • Get data on the percentage of added categories that are low-level (filtering out root categories), month over month
  • Get data on newly created one-page categories
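
A minimal sketch of the month-over-month percentage, assuming we already have one record per added category with a timestamp and a low-level flag. The record shape and the function name `monthly_low_level_share` are hypothetical, and how "low-level" gets decided is still an open question in this task.

```python
from collections import defaultdict


def monthly_low_level_share(records):
    """Return {"YYYY-MM": percent of added categories flagged low-level}.

    `records` is an iterable of (iso_timestamp, is_low_level) pairs.
    This shape is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    low_level = defaultdict(int)
    for timestamp, is_low_level in records:
        month = timestamp[:7]  # "2025-01-03T10:00:00Z" -> "2025-01"
        totals[month] += 1
        if is_low_level:
            low_level[month] += 1
    return {m: 100.0 * low_level[m] / totals[m] for m in sorted(totals)}


# Made-up sample data, not real measurements.
sample = [
    ("2025-01-03T10:00:00Z", True),
    ("2025-01-15T08:30:00Z", False),
    ("2025-02-01T12:00:00Z", True),
]
shares = monthly_low_level_share(sample)
```

Grouping on the first seven characters of an ISO 8601 timestamp keeps the sketch dependency-free; a real script would probably parse the timestamps properly.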

Probable approach:

  • Compile a list of hidden, maintenance, and non-diffusing categories
  • Write a script that listens to the event stream, then uses the API to gather (and record) the non-hidden, non-maintenance categories for each new UW upload
  • Keep only file uploads that have edit tag ID = 12, i.e., UW
  • Open question: does any non-hidden, non-maintenance category that has a Wikidata item count as a non-low-level category? Or is there a way of telling whether a category has no parent?
  • Instead of low-level categories, we might look at a measure like the average depth in the category tree
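
The average-depth idea could be sketched roughly as follows, assuming the category-to-parent links have already been collected into a dictionary (for example, from the API calls mentioned above). The names `category_depth`, `average_depth`, `parents`, and `roots`, and the toy graph itself, are hypothetical. "Depth" here means the shortest number of parent hops up to any root category, which is only one possible definition. The category graph is not a strict tree and can contain cycles, so the walk tracks visited nodes.

```python
from collections import deque


def category_depth(category, parents, roots):
    """Shortest number of parent hops from `category` up to any root.

    Returns None when no root is reachable (orphans or closed loops).
    """
    seen = {category}
    queue = deque([(category, 0)])
    while queue:
        current, depth = queue.popleft()
        if current in roots:
            return depth
        for parent in parents.get(current, ()):
            if parent not in seen:  # guard against cycles
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None


def average_depth(categories, parents, roots):
    """Mean depth over the categories that can reach a root, else None."""
    depths = [category_depth(c, parents, roots) for c in categories]
    depths = [d for d in depths if d is not None]
    return sum(depths) / len(depths) if depths else None


# Toy graph for illustration only; not real category data.
parents = {
    "Category:Tabby cats": ["Category:Cats"],
    "Category:Cats": ["Category:Felidae", "Category:Pets"],
    "Category:Felidae": ["Category:Animals"],
    "Category:Pets": ["Category:Animals"],
    "Category:Animals": [],
    "Category:Loop A": ["Category:Loop B"],
    "Category:Loop B": ["Category:Loop A"],
}
roots = {"Category:Animals"}

tabby_depth = category_depth("Category:Tabby cats", parents, roots)
mean_depth = average_depth(
    ["Category:Tabby cats", "Category:Cats", "Category:Loop A"], parents, roots
)
```

A category with an empty parent list that is not in `roots` comes back as `None`, which may also be a cheap way to answer the "has no parent" question above.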