=== Purpose ===
{T370416}
=== Scope ===
We would like to continuously monitor (e.g. daily or weekly) the following metrics for WDQS:
* number of SPARQL queries that only retrieve data for a single, known entity (based on T370848)
* number of all other SPARQL queries
=== Desired output ===
[ ] Airflow pipeline to monitor the above metrics
[ ] Output published as CSV to https://analytics.wikimedia.org/published/datasets/wmde/analytics/
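To make the expected CSV shape concrete, here is a minimal sketch in Python using only the standard library. The column names (`date`, `single_entity_queries`, `other_queries`) and the helper names are assumptions for illustration, not an agreed schema; the actual pipeline step and file name are still to be decided.

```python
import csv
import io
from datetime import date

# Hypothetical column names; the real schema is not defined in this task yet.
FIELDNAMES = ["date", "single_entity_queries", "other_queries"]


def counts_row(day, single_entity, other):
    """Build one CSV row holding the two counts for a given day."""
    return {
        "date": day.isoformat(),
        "single_entity_queries": single_entity,
        "other_queries": other,
    }


def write_counts_csv(rows, out):
    """Write a header line followed by one line per row to a text stream."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Usage example with made-up numbers, writing to an in-memory buffer.
buffer = io.StringIO()
write_counts_csv([counts_row(date(2024, 1, 1), 12, 34)], buffer)
csv_text = buffer.getvalue()
```

One row per reporting period keeps the file append-friendly, whichever frequency is chosen.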
=== Notes ===
* We do not need exact numbers, so it is okay to estimate the counts from a sample (e.g. a random sample, or every 100th query).
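The sampling idea can be sketched as follows, assuming the queries are available as an iterable of strings. The function names `systematic_sample` and `estimate_total` are hypothetical, and taking every 100th query is a systematic sample rather than a strictly random one, which should be fine as long as the query order has no period that lines up with the step.

```python
from typing import Iterable, Iterator


def systematic_sample(
    queries: Iterable[str], step: int = 100, offset: int = 0
) -> Iterator[str]:
    """Yield every `step`-th query, starting at position `offset`."""
    if step < 1:
        raise ValueError("step must be at least 1")
    if not 0 <= offset < step:
        raise ValueError("offset must satisfy 0 <= offset < step")
    for index, query in enumerate(queries):
        if index % step == offset:
            yield query


def estimate_total(sample_count: int, step: int = 100) -> int:
    """Scale a count observed in the sample back up to the full population."""
    return sample_count * step


# Usage example with placeholder strings standing in for real queries.
all_queries = [f"query-{i}" for i in range(1000)]
sample = list(systematic_sample(all_queries, step=100))
estimated = estimate_total(len(sample), step=100)
```

Counts for each query category would be computed on the sample and then multiplied by the step, so the published numbers are estimates rather than exact totals.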
=== Open questions ===
* What is the frequency we need (e.g. daily, weekly)?
=== Urgency ===
When this task should be completed by. If this task is time sensitive then please make this clear. Please also provide the date when the output will be used if there is a specific meeting or event, for example.
DD.MM.YYYY
---
**Information below this point is filled out by the Wikidata Analytics team.**
== General Planning ==
Information is filled out by the analytics product manager.
== Assignee Planning ==
Information is filled out by the assignee of this task.
=== Estimation ===
Estimate:
Actual:
=== Sub Tasks ===
Full breakdown of the steps to complete this task:
[ ] subtask
=== Data to be used ===
See [Analytics/Data_Lake](https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake) for the breakdown of the data lake databases and tables.
The following tables will be referenced in this task:
- link_to_table
=== Notes and Questions ===
Things that came up during the completion of this task, questions to be answered and follow up tasks:
- Note