WMDE Analytics Request
This task was generated using the WMDE Analytics request form. Please use the task templates linked on our project page to create tasks for the team. Thank you!
Why (Context & Decision)
What problem are you trying to solve, and what decision will this analysis inform? Briefly explain the organisational or strategic context, why this matters now, and what action will be taken based on the outcome.
As part of T427512: [EPIC] [WDQM] Wikidata Quality Metric, we want to measure how complete the references on Wikidata's data are.
What (Scope & Output)
Describe the specific question(s), metrics, segments, or deliverables (e.g., dashboard, deep dive, experiment analysis), including any relevant definitions or constraints.
- A measure of how often references are included on statements for Wikidata properties that require them
- A monthly pipeline that uses the most up-to-date weekly wmf.wikidata_entity snapshot
- A chart for this metric over time on Superset
  - Will be at Superset:wd_quality_metrics
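To make the metric in the first bullet concrete, here is a minimal sketch of one possible definition: the share of statements on in-scope properties that carry at least one reference. The ticket does not fix the exact formula or how the set of properties that require citations is determined, so the function name, the input shape, and the property IDs below are illustrative assumptions rather than the agreed implementation.

```python
from typing import Iterable, Mapping, Set


def reference_completeness(
    statements: Iterable[Mapping[str, object]],
    properties_requiring_refs: Set[str],
) -> float:
    """Share of in-scope statements that carry at least one reference.

    Assumption: each statement is a mapping with a "property" key and a
    "references" list. This is a simplified stand-in, not the actual
    wmf.wikidata_entity schema.
    """
    in_scope = 0
    referenced = 0
    for statement in statements:
        # Skip statements on properties that do not require references.
        if statement["property"] not in properties_requiring_refs:
            continue
        in_scope += 1
        # An empty or missing references list counts as unreferenced.
        if statement.get("references"):
            referenced += 1
    # Assumption: report 0.0 when nothing is in scope.
    return referenced / in_scope if in_scope else 0.0


# Hypothetical sample data; the property IDs are for illustration only.
sample_statements = [
    {"property": "P569", "references": [{"P248": "Q36578"}]},
    {"property": "P569", "references": []},
    {"property": "P31", "references": []},
    {"property": "P570", "references": [{"P854": "https://example.org"}]},
]
completeness = reference_completeness(sample_statements, {"P569", "P570"})
print(f"{completeness:.2%}")
```

In this toy sample, three statements are in scope and two of them are referenced, so the metric is two thirds. The production version would presumably be a query over wmf.wikidata_entity rather than Python over in-memory dicts.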
By When (Timing & Priority)
Provide a clear deadline, any key milestones (e.g. launch, leadership review), and note if timing is flexible or fixed.
05.06.2026
Information below this point is filled out by the analyst.
Sub Tasks
A breakdown of the steps to complete this task.
- Write the CREATE TABLE query
- Write the query to derive the metric
- Test table creation and the metric query
- Write DAG for metric computation
- Test DAG for metric computation (with T427513)
- Deploy DAG
- Create chart on Superset (with T427513)
  - Will be at Superset:wd_quality_metrics
- Finalize metric computation for all available weeks (data is currently available only for May 2026)
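For the DAG and backfill steps, one way a monthly run could pick the most recent weekly snapshot is sketched below. It assumes snapshot partitions are named as ISO dates (YYYY-MM-DD); the helper name and the sample dates are hypothetical and not taken from the actual pipeline.

```python
from datetime import date
from typing import Dict, Iterable


def latest_snapshot_per_month(snapshots: Iterable[str]) -> Dict[str, str]:
    """Map each month (YYYY-MM) to its most recent weekly snapshot.

    Assumption: snapshot names are ISO dates such as "2026-05-25".
    """
    latest: Dict[str, str] = {}
    for snapshot in snapshots:
        parsed = date.fromisoformat(snapshot)
        month = f"{parsed.year:04d}-{parsed.month:02d}"
        # Keep the snapshot only if it is newer than the one stored so far.
        if month not in latest or date.fromisoformat(latest[month]) < parsed:
            latest[month] = snapshot
    return latest


# Hypothetical weekly snapshots for a single month.
weekly_snapshots = ["2026-05-04", "2026-05-11", "2026-05-18", "2026-05-25"]
monthly = latest_snapshot_per_month(weekly_snapshots)
print(monthly)
```

The same helper could drive a backfill by running the metric query once per returned snapshot.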
Estimation
Estimate: 1d
Actual:
Data
The tables that will be referenced in this task and the sample sizes from them that will be used.
- wmf.wikidata_entity
- discovery.wikibase_rdf