WE3.3.7 Year in Review and Activity Tab Services
If we leverage the data platform’s processing capabilities to aggregate tailored editor metrics and impact data and serve the aggregated data through suitable services with defined SLOs, we can enhance future iterations of Year in Review WE3.3.1 and Activity Tab WE3.3.2.
Year in Review (YiR), the Mobile Apps Activity Tab, and the Growth Impact Module are three projects that require global editor statistics.
The metrics required for each of these projects are very similar. The differences are mostly about time spans and granularities.
This will be the parent task for:
- Designing the datasets to meet the requirements
- Creating the data pipelines to generate the datasets: T405039: Global Editor Metrics - Data Pipeline
- Designing and creating the storage used for serving: T401260: Global Editor Metrics - Data Persistence Design Review
- Designing and deploying the API endpoints: T405041: Global Editor Metrics - HTTP API endpoints
Working document: FY25-26 Year in Review and Impact Module - Notes & Product Requirements
Metric requirements summary
Canonical requirements are in FY25-26 Year in Review and Impact Module - Notes & Product Requirements
- Total global edit count per user
- Total number of days edited per user
- Longest daily edit streak per user
- List of edited articles per user
- Total number of pageviews on all articles edited by a user
- Top K most-viewed articles edited by a user, per month
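As a rough illustration of how the per-user metrics above could be derived, here is a minimal Python sketch. It is not the pipeline's actual implementation: the function names, the input shape (one `date` per edit, and a mapping of page ID to view count), and the output keys are all assumptions made for this example.

```python
from datetime import date, timedelta
from heapq import nlargest


def editor_metrics(edit_dates):
    """Summarise one user's edits, given one date per edit (hypothetical input shape)."""
    days = sorted(set(edit_dates))
    longest = current = 0
    previous = None
    for day in days:
        # Extend the streak only when this day directly follows the previous one.
        if previous is not None and day - previous == timedelta(days=1):
            current += 1
        else:
            current = 1
        longest = max(longest, current)
        previous = day
    return {
        "edit_count": len(edit_dates),
        "days_edited": len(days),
        "longest_streak": longest,
    }


def top_k_viewed(pageviews_by_page_id, k):
    """Return the k (page_id, views) pairs with the most views (hypothetical helper)."""
    return nlargest(k, pageviews_by_page_id.items(), key=lambda item: item[1])
```

The streak is computed over distinct calendar days, so several edits on the same day count once toward days edited and toward the streak.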
Year in Review needs these metrics rolled up over an entire calendar year. The Impact Module and the App Activity Tab would like daily roll-ups, ideally computed daily.
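One way to see how the two granularities relate: additive metrics such as edit count and days edited can be summed from daily roll-ups into calendar-year totals. The sketch below assumes a hypothetical row shape of `(user_id, day, edit_count)` with at most one row per user per day; it does not reflect the actual dataset design. Non-additive metrics such as the longest streak would need the ordered days themselves, not just per-year sums.

```python
from collections import defaultdict
from datetime import date


def roll_up_by_year(daily_rows):
    """Sum (user_id, day, edit_count) daily rows into per-user, per-year totals."""
    totals = defaultdict(lambda: {"edit_count": 0, "days_edited": 0})
    for user_id, day, edit_count in daily_rows:
        if edit_count <= 0:
            continue  # a day with no edits adds nothing to either metric
        bucket = totals[(user_id, day.year)]
        bucket["edit_count"] += edit_count
        bucket["days_edited"] += 1
    return dict(totals)
```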
Privacy review
Most metrics are clearly public. However, the list of edited articles per user, if not historically updated, could expose the page IDs or page titles of deleted (and possibly privacy-sensitive) pages.
The LCS3 review for this has been completed. Summary:
Exposing only the MediaWiki internal page_id via the public API, and not the page_title, satisfactorily mitigates the privacy risks.
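A minimal sketch of that mitigation, assuming a hypothetical internal record shape with `page_id` and `page_title` keys (the real storage schema and API response format are not specified here): the public response is built by copying only `page_id`, so a title cannot leak by omission.

```python
def to_public_article_list(edited_pages):
    """Keep only page_id from each internal record; page_title is never copied."""
    return [{"page_id": page["page_id"]} for page in edited_pages]
```

Building the response from an explicit allow-list of fields, rather than deleting `page_title` from a copy, means any field added to the internal record later stays private by default.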