The goal is to publish a publicly available dump of Quarry queries that could be used to fine-tune AI models to generate a Wikimedia replica query from a natural-language description.
The dataset would include three fields:
- Description/title of the query
- Database
- SQL syntax
The dataset would be released under CC0, the same license under which the SQL syntax is currently released.
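
As a rough illustration of the three-field layout described above, one record could be serialized as a line of JSON Lines. This is only a sketch: the field names (`title`, `database`, `sql`), the `QuarryRecord` class, the `to_jsonl` helper, and the JSON Lines format are assumptions made for the example, and the sample query is invented, not taken from Quarry.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class QuarryRecord:
    """One dataset row. Field names are hypothetical, not a fixed schema."""
    title: str      # description/title of the query
    database: str   # database the query targets, e.g. "enwiki_p"
    sql: str        # SQL text of the query


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


# Invented sample record, for illustration only.
example = QuarryRecord(
    title="Ten most recently created pages",
    database="enwiki_p",
    sql="SELECT page_title FROM page ORDER BY page_id DESC LIMIT 10;",
)

line = to_jsonl([example])
print(line)
```

For fine-tuning, `title` and `database` would form the prompt and `sql` the target, but that split is left to whoever consumes the dump.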