
=== Purpose ===

In the context of the WDQS Graph Splitting initiative we need to understand the consequences of selected splits. As a basis for the respective evaluations we need a set of representative queries.

=== Scope ===

The goal of this task is to extract a representative sample of SPARQL queries from the Blazegraph query logs. The query set should be representative of the following characteristics:

* Query size
* Query time
* Status code (HTTP return status)

=== Open questions ===

* Data source
* Timeframe
* Sample size
* Output format
* Urgency

=== Desired output ===

Description of the desired output for this task.

> replace with the desired output

=== Urgency ===

When this task should be completed by. If this task is time-sensitive then please make this clear. Please also provide the date when the output will be used if there is a specific meeting or event, for example.

DD.MM.YYYY

---

**Information below this point is filled out by the Wikidata Analytics team.**

== General Planning ==

Information is filled out by the analytics product manager.

== Assignee Planning ==

Information is filled out by the assignee of this task.

=== Estimation ===

Estimate:
Actual:

=== Sub Tasks ===

Full breakdown of the steps to complete this task:

[ ] subtask

=== Data to be used ===

See [Analytics/Data_Lake](https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake) for the breakdown of the data lake databases and tables.

The following tables will be referenced in this task:

- link_to_table

=== Notes and Questions ===

Things that came up during the completion of this task, questions to be answered and follow up tasks:

- Note
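
One possible reading of the Scope section is stratified sampling: bucket each logged query by size, run time, and status code, then draw up to a fixed number of queries from every bucket. The sketch below is only an illustration under assumptions. The record fields (`query`, `query_time_ms`, `http_status`), the bucket edges, and the function names are hypothetical, because the data source, sample size, and output format are still open questions.

```python
import random
from collections import defaultdict


def bucket(value, edges):
    """Return the index of the first edge with value <= edge, else len(edges)."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


def stratified_sample(records, per_stratum, size_edges, time_edges, seed=0):
    """Draw up to `per_stratum` records from each (size, time, status) stratum.

    `records` is a list of dicts with the hypothetical keys
    "query", "query_time_ms" and "http_status".
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible

    # Group records by their stratum key.
    strata = defaultdict(list)
    for rec in records:
        key = (
            bucket(len(rec["query"]), size_edges),      # query size in characters
            bucket(rec["query_time_ms"], time_edges),   # query time
            rec["http_status"],                         # status code
        )
        strata[key].append(rec)

    # Sample without replacement inside each stratum, in a stable key order.
    sample = []
    for key in sorted(strata):
        group = strata[key]
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample
```

Taking the same number of queries per stratum keeps rare combinations (for example slow queries with an error status) visible in the sample. If the sample should instead mirror the overall traffic mix, the per-stratum count could be made proportional to the stratum size.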