Preamble
Ilario Valdelli reported that the Argo WMCH jobs are no longer generated.
Problem
It seems this tool was designed around an undocumented feature of the Wiki Replicas: executing generic MySQL queries against arbitrary databases (e.g. dewiki_p, itwiki_p) over a single database connection to the itwiki cluster. This is no longer possible, since that cluster does not host all of our relevant wikis anymore.
Specifically, itwiki_p is available via itwiki.analytics.db.svc.wikimedia.cloud, dewiki_p via dewiki.analytics.db.svc.wikimedia.cloud, etc. AFAIK other shares in the current 6 physical clusters are not reliable, so relying on s2.analytics.db.svc.wikimedia.cloud should not be suggested either.
https://wikitech.wikimedia.org/wiki/News/Wiki_Replicas_2020_Redesign#New_host_names
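The naming rule described above can be sketched in Python as follows. This is only an illustration: the function name `replica_host` is hypothetical and does not come from the tool's source code, and it assumes every relevant database follows the `<wiki>_p` pattern shown in the examples.

```python
REPLICA_DOMAIN = "analytics.db.svc.wikimedia.cloud"


def replica_host(db_name: str) -> str:
    """Return the per-wiki host name for a replica database such as 'itwiki_p'.

    Hypothetical helper: assumes the '<wiki>_p' naming used in the examples above.
    """
    # The database name carries a '_p' suffix; the host name uses the bare wiki name.
    wiki = db_name[:-2] if db_name.endswith("_p") else db_name
    return f"{wiki}.{REPLICA_DOMAIN}"


if __name__ == "__main__":
    for db in ("itwiki_p", "dewiki_p"):
        print(db, "->", replica_host(db))
```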
Plan
- 2022-07-25 tool exploration (https://meta.wikimedia.org/wiki/Wikimedia_CH/Project/Argo_Wikimetrics)
- 2022-07-25 server access
- 2022-07-25 documenting application startup
- 2022-07-25 documenting database connections to Wikitech
- 2022-07-25 fix bastion access after its SSH fingerprint changed
- 2022-07-25 fix deprecated usage of <project>.analytics.db.svc.eqiad.wmflabs and adopt <project>.analytics.db.svc.wikimedia.cloud
- 2022-09-23 understand the best fix strategy (rewrite database connection logic)
- import source code from Synapta's GitHub to Wikimedia GitLab
Proposed Solutions
- Manual patch (2 points): open a lot of SSH tunnels from our server to the Wiki Replicas, one for each required DB connection
- Semi-manual patch (4 points): same as above, but with a script generating the tunnels
- Rewrite (32 points): rewrite the tool so that it does not rely on a single connection for all databases, but opens the right connection for the right database (this requires a handover of the project)
- Improve Wikimedia Cloud for the benefit of all its users (hoping that our workaround will not be needed anymore): T318191: Evaluate opening the readonly Wiki Replicas to the WAN (since we already have user authentication)
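For reference, the semi-manual patch could look roughly like the Python sketch below, which only builds the `ssh` command lines and does not run them. Everything specific here is an assumption: the bastion address is a placeholder, the starting local port 3307 is arbitrary, and `tunnel_commands` is a hypothetical name. The only fixed parts are the standard `ssh -N -L` port-forwarding syntax and the MySQL port 3306.

```python
import shlex

REPLICA_DOMAIN = "analytics.db.svc.wikimedia.cloud"
BASTION = "user@bastion.example.org"  # placeholder, not the real bastion
FIRST_LOCAL_PORT = 3307               # arbitrary starting port (assumption)


def tunnel_commands(wikis, bastion=BASTION, first_port=FIRST_LOCAL_PORT):
    """Return {wiki: (local_port, ssh_command)} with one tunnel per wiki."""
    plan = {}
    # Sort and deduplicate so each wiki always gets the same local port.
    for offset, wiki in enumerate(sorted(set(wikis))):
        port = first_port + offset
        remote = f"{wiki}.{REPLICA_DOMAIN}"
        # -N: no remote command; -L: forward local port to remote host:3306.
        argv = ["ssh", "-N", "-L", f"{port}:{remote}:3306", bastion]
        plan[wiki] = (port, shlex.join(argv))
    return plan


if __name__ == "__main__":
    for wiki, (port, command) in tunnel_commands(["itwiki", "dewiki"]).items():
        print(f"{wiki}: local port {port}: {command}")
```

The tool would then have to connect to `127.0.0.1` on the matching local port for each wiki, which is why this remains a workaround rather than a fix.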
I presented the first 3 solutions to Ilario Valdelli and he opted for solution no. 3 (probably because he loves things done right).
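As a starting point for the rewrite, the core idea (one connection per database, opened on demand against that database's own host) might be sketched in Python like this. The class name `ReplicaConnections` and the injected `connect` callable are hypothetical and not taken from the existing code base; the sketch also assumes the `<wiki>_p` naming for databases. The real database driver is deliberately left out, so the example runs with a fake connection.

```python
from typing import Any, Callable, Dict

REPLICA_DOMAIN = "analytics.db.svc.wikimedia.cloud"


class ReplicaConnections:
    """Open one connection per database, lazily, and reuse it afterwards."""

    def __init__(self, connect: Callable[[str, str], Any]):
        # connect(host, db_name) is supplied by the caller (hypothetical interface).
        self._connect = connect
        self._open: Dict[str, Any] = {}

    def get(self, db_name: str) -> Any:
        if db_name not in self._open:
            # Assumes '<wiki>_p' database names, e.g. 'itwiki_p' -> 'itwiki'.
            wiki = db_name[:-2] if db_name.endswith("_p") else db_name
            host = f"{wiki}.{REPLICA_DOMAIN}"
            self._open[db_name] = self._connect(host, db_name)
        return self._open[db_name]


if __name__ == "__main__":
    def fake_connect(host: str, db_name: str) -> dict:
        # Stand-in for a real MySQL client call, for demonstration only.
        print(f"connecting to {host} for {db_name}")
        return {"host": host, "db": db_name}

    pool = ReplicaConnections(fake_connect)
    pool.get("itwiki_p")
    pool.get("dewiki_p")
    pool.get("itwiki_p")  # reused, no second connection is opened
```

In the actual tool, `connect` would wrap whatever MySQL client library the project already uses; passing it in keeps the routing logic testable without database access.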