Until recently, I could run USE nds_nlwiki_p; to access the database for nds-nl.wikipedia.org. Now, when I enter "nds_nlwiki" in the unlabeled text box and click 'Submit Query', I get a pop-up saying "Bad database name". I guess the new code rejects the underscore?
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Expand dbname validation regex | analytics/quarry/web | master | +76 -6
Event Timeline
I think I've found the code location of the problem: the regex at quarry/web/app.py#241 is too restrictive. There are also database names with numbers, so simply adding an underscore won't be enough.
Also, is there any harm in trying to run a query on a validly named but non-existent database? Users currently get a "Can't connect" message. If there is no harm, I propose allowing any (semi-)valid hostname label to prevent future problems, so something like: r"^[\-0-9_a-z]+$".
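For illustration, here is a minimal sketch of how the permissive pattern proposed above would behave. The pattern is quoted from the comment; the helper name is_plausible_dbname and the sample inputs are hypothetical and not taken from the Quarry code base.

```python
import re

# Pattern quoted from the proposal above: lowercase letters, digits,
# underscores and hyphens only.
PERMISSIVE_DBNAME = re.compile(r"^[\-0-9_a-z]+$")


def is_plausible_dbname(name: str) -> bool:
    """Hypothetical helper: True if the name passes the permissive check."""
    return PERMISSIVE_DBNAME.match(name) is not None


if __name__ == "__main__":
    candidates = [
        "nds_nlwiki_p",
        "test2wiki",
        "zh_min_nanwiki",
        "EnWiki",
        "enwiki; DROP TABLE x",
        "",
    ]
    for candidate in candidates:
        print(repr(candidate), is_plausible_dbname(candidate))
```

One caveat with this kind of pattern in Python: `$` also matches just before a trailing newline, so "enwiki\n" would pass. Using re.fullmatch, or \Z instead of $, would close that gap if it matters.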
The current regex excludes these dbnames from all.dblist:
arbcom_cswiki arbcom_dewiki arbcom_enwiki arbcom_fiwiki arbcom_nlwiki arbcom_ruwiki bat_smgwiki be_x_oldwiki cbk_zamwiki fiu_vrowiki id_internalwikimedia map_bmswiki nds_nlwiki noboard_chapterswikimedia otrs_wikiwiki pa_uswikimedia roa_rupwiki roa_rupwiktionary roa_tarawiki sysop_itwiki test2wiki wg_enwiki zh_classicalwiki zh_min_nanwiki zh_min_nanwikibooks zh_min_nanwikiquote zh_min_nanwikisource zh_min_nanwiktionary zh_yuewiki
Adding _ to the allowed prefix characters reduces this list to:
test2wiki
All current contents of all.dblist are matched with:
r"^(?:(?:centralauth|meta|[0-9a-z_]*wik[a-z]+)(?:_p)?)|quarry?$"
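As a rough check, the pattern above can be exercised against a few of the names listed earlier. Note that, as written, the top-level | splits the pattern into "^..." and "...$" halves, so with re.match the first half is not anchored at the end, and "quarry?" makes the final y optional. Whether either was intended is not stated here. The STRICT variant below is only an assumption for comparison, not the merged patch.

```python
import re

# Pattern exactly as quoted in the comment above.
QUOTED = re.compile(
    r"^(?:(?:centralauth|meta|[0-9a-z_]*wik[a-z]+)(?:_p)?)|quarry?$"
)

# Hypothetical stricter variant (an assumption, not the merged code):
# same alternatives, but intended for use with fullmatch so that the
# entire string must match, and with a mandatory "y" in "quarry".
STRICT = re.compile(
    r"(?:centralauth|meta|[0-9a-z_]*wik[a-z]+)(?:_p)?|quarry"
)

# A few names taken from the list above, plus the special cases.
SAMPLES = [
    "nds_nlwiki",
    "nds_nlwiki_p",
    "test2wiki",
    "zh_min_nanwiktionary",
    "centralauth_p",
    "meta_p",
    "quarry",
]

if __name__ == "__main__":
    for name in SAMPLES + ["enwiki; DROP TABLE x", "quarr"]:
        quoted_ok = QUOTED.match(name) is not None
        strict_ok = STRICT.fullmatch(name) is not None
        print(f"{name!r}: quoted={quoted_ok} strict={strict_ok}")
```

Both patterns accept every sample name; they differ only on inputs with trailing junk after a valid prefix, and on "quarr".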
Change 676846 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):
[analytics/quarry/web@master] Expand dbname validation regex
Change 676846 merged by jenkins-bot:
[analytics/quarry/web@master] Expand dbname validation regex
Mentioned in SAL (#wikimedia-cloud) [2021-04-07T21:06:15Z] <bstorm> deploying regex fixes T278715