Some time ago, I wrote a bot that allows users to provide an SQL query and have the output neatly formatted on a wiki page. It was discontinued after a few days of activity when it was pointed out that it is contrary to Toolforge rule #6: "Do not provide direct access to Cloud Services resources to unauthenticated users".
I am requesting an exception from this rule, as suggested in this discussion (cc @Legoktm @zhuyifei1999). Thanks for your consideration.
While this tool indeed allows anyone to trigger arbitrary queries on the enwiki replica db, the following measures are in place:
- It uses a maximum of 5 db connections at any time. Because the bot is asynchronous, more connections aren't necessary: if there are more than 5 concurrent users, the bot can still output all reports, albeit more slowly.
- A timeout of 10 minutes is applied on every query, using MariaDB's max_statement_time directive.
- No db connections are kept idle for more than 5 seconds (per policy).
I believe these measures rule out any harm to the Toolforge infrastructure.
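To make the first two measures concrete, here is a minimal sketch in Python (the bot itself is not written this way; see the source link below for the real code). It assumes a hypothetical coroutine `execute` that runs one SQL string on the replica. The names `with_timeout` and `run_reports` are made up for illustration. The sketch caps concurrency with a semaphore and applies the time limit with MariaDB's `SET STATEMENT max_statement_time=... FOR ...` syntax. It does not cover closing idle connections.

```python
import asyncio

MAX_CONNECTIONS = 5          # cap on concurrent db connections
QUERY_TIMEOUT_SECONDS = 600  # 10 minutes


def with_timeout(sql: str, seconds: int = QUERY_TIMEOUT_SECONDS) -> str:
    """Prefix a query with MariaDB's per-statement time limit."""
    body = sql.strip().rstrip(";")
    return f"SET STATEMENT max_statement_time={seconds} FOR {body}"


async def run_reports(queries, execute, limit: int = MAX_CONNECTIONS):
    """Run every query, but never more than `limit` at the same time.

    `execute` is a hypothetical coroutine taking one SQL string; extra
    requests simply wait for a free slot instead of opening a connection.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run_one(sql: str):
        async with semaphore:
            return await execute(with_timeout(sql))

    return await asyncio.gather(*(run_one(q) for q in queries))
```

With more than 5 pending reports, the extra ones queue on the semaphore, which is why additional users only make the output slower rather than adding load.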
Full function details:
- User places a template on a userspace/project page containing the SQL query and output formatting options. The bot detects new transclusions and writes the output. Periodic updates (with min interval of 1 day) can also be configured.
- An "Update the table now" link is provided in the template, which triggers a manual update (intended for testing purposes, say after making changes to the formatting options).
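The update rules above could be sketched roughly as follows. This is an illustration only, not the bot's actual logic: the function name `is_update_due` and its parameters are hypothetical, and it assumes that a manual trigger bypasses the interval check and that a configured interval shorter than 1 day is raised to 1 day.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_INTERVAL = timedelta(days=1)  # minimum interval between periodic updates


def is_update_due(
    last_run: Optional[datetime],
    interval_days: float,
    now: datetime,
    manual: bool = False,
) -> bool:
    """Decide whether a report should be regenerated.

    Assumptions (not confirmed by the bot's source): a manual trigger
    always runs, a never-run report always runs, and intervals below
    one day are clamped up to one day.
    """
    if manual or last_run is None:
        return True
    interval = max(timedelta(days=interval_days), MIN_INTERVAL)
    return now - last_run >= interval
```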
What are the advantages of having this in addition to Quarry / existing database reports?
- Quarry doesn't provide clickable links to pages. Using something like CONCAT('[[', page_title, ']]'), then exporting Quarry result as wikitable and pasting it on-wiki is tedious.
- Even the above only works if all pages are in the same namespace. Quarry doesn't provide any way to map namespace numbers to namespace names, so users are forced to resort to ugly hacks like https://quarry.wmflabs.org/query/55915 and https://www.mediawiki.org/wiki/Topic:Ulxr3i1tc8uzgtgh to get readable results.
- For reports requiring periodic updates, currently different botops set up custom bot code for each. Having different code and job run setups for each report makes maintenance tedious and less than ideal.
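For illustration, the link formatting that is awkward to do in Quarry looks roughly like this when done in code. This is a sketch, not the bot's implementation: the function name `wikilink` is hypothetical, and the namespace table is a hand-written subset of enwiki's namespaces rather than something fetched from the site.

```python
# Partial map of enwiki namespace numbers to names (illustrative subset).
NAMESPACES = {
    0: "",
    1: "Talk",
    2: "User",
    3: "User talk",
    4: "Wikipedia",
    5: "Wikipedia talk",
    6: "File",
    10: "Template",
    14: "Category",
}

# File and Category links need a leading colon; otherwise the wikitext
# embeds the file or categorizes the page instead of linking to it.
NEEDS_LEADING_COLON = {6, 14}


def wikilink(namespace: int, title: str) -> str:
    """Build a clickable wikilink from page_namespace and page_title."""
    if namespace not in NAMESPACES:
        raise ValueError(f"namespace {namespace} is not in the sample table")
    readable = title.replace("_", " ")  # db titles use underscores
    name = NAMESPACES[namespace]
    prefix = f"{name}:" if name else ""
    if namespace in NEEDS_LEADING_COLON:
        prefix = ":" + prefix
    return f"[[{prefix}{readable}]]"
```

For example, `wikilink(3, "Example_user")` gives `[[User talk:Example user]]`, which a plain `CONCAT` on `page_title` cannot produce without hard-coding the namespace.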
Source code: https://github.com/siddharthvp/SDZeroBot/tree/master/db-tabulator
Enwiki approval request: https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval/SDZeroBot_10