Wikidata Analytics Request
This task was generated using the Wikidata Analytics request form. Please use the task template linked on our project page to create issues for the team. Thank you!
Purpose
We need the following baseline metrics for data dumps per month:
- # downloads (full v partial across the different types)
- # downloads per unique ip x user agent
- # unique user agents
- Size of a full data dump in GB
Desired Outputs
Please list the desired outputs of this task.
Baseline metrics for Wikidata data dumps
Deadline
Please make the time sensitivity of this task clear with a date that it should be completed by. If there is no specific date, then the task will be triaged based on its priority.
04.08.2025
Information below this point is filled out by the task assignee.
Assignee Planning
Sub Tasks
A full breakdown of the steps to complete this task.
- Explicitly define the metrics needed
- Write all create table queries for the DAG jobs
- Write all the computation queries for the DAGs
- Test query processes in local DB schema
- Write DAG(s)
- Test DAG(s)
- Deploy DAG(s)
Estimation
Estimate: 2-3 days
Actual:
Data
The tables that will be referenced in this task.
- link_to_table
Notes
Things that came up during the completion of this task, questions to be answered and follow up tasks.
- Note