Superset is SQL query interface to Wikimedia database replica. It has API which allows to run SQL queries over https and it would be nice to have local SQL query access from pywikibot without need of wikimedia cloud accounts. With superset user needs only give OAUTH permission.
* https://wikitech.wikimedia.org/wiki/Superset
* https://superset.wmcloud.org/
* https://phabricator.wikimedia.org/project/profile/6584/
Howto use
# add mediawiki test user for meta.wikimedia.org to user-config.py
# login to meta.wikimedia.org
# login to superset https://superset.wmcloud.org/login and give OAUTH permissions
# python -m unittest -v tests.superset_tests
**mybot.py**
```
import pywikibot
from pywikibot import pagegenerators
from pywikibot.bot import ExistingPageBot
class MyBot(ExistingPageBot):
def treat_page(self):
"""Load the given page, do some changes, and save it."""
print(self)
def main():
"""Parse command line arguments and invoke bot."""
options = {}
gen_factory = pagegenerators.GeneratorFactory()
# Option parsing
local_args = pywikibot.handle_args() # global options
local_args = gen_factory.handle_args(local_args) # generators options
MyBot(generator=gen_factory.getCombinedGenerator(), **options).run()
if __name__ == '__main__':
main()
```
`python pwb.py mybot.py -family:commons -lang:commons -supersetquery:"SELECT page_namespace, page_title FROM page LIMIT 10"`
Example SQL queries
- mandatory return columns for creating are: `page_id` OR `page_namespace` and `page_title`
- if `page_wikidb` is defined then it is used as site. Example values are `enwiki` or `enwiki_p` or `commonswiki` or `commonswiki_p`.
-- Target wiki is resolved from string using [[ https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.site.html#pywikibot.site._apisite.APISite.fromDBName | pywikibot.site.APISite.fromDBName ]]()
```
SELECT gil_wiki AS page_wikidb, gil_page AS page_id FROM globalimagelinks GROUP BY gil_wiki LIMIT 10
```
```
SELECT page_id FROM page LIMIT 10
```
```
SELECT page_namespace, page_title FROM page LIMIT 10
```
Future work
- T367393 - In future Superset should be able to query ToolsDB public databases also and this needs to be added.