
Change database access code to work with replicas redesign
Open, Medium, Public

Description

Currently, database queries rely on establishing a connection to one toolsdb and one wikidb on analytics. This scheme will soon stop working, so in February we have to migrate from the current setup to the updated one.

This also affects the command-line arguments and how connections are established. It might make sense to create a shell script, or to use Python plus a config file, to open all the ports instead of typing so many arguments.

Subtasks

  • Create an updatable mapping of current shard locations (it looks like we have to use noc, as meta_p is not updated when shards move)
  • Write a helper script that opens the necessary number of ports directly from Python
  • How should the ports to use be specified? Maybe a user config file will do?
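
One possible shape for the user config file idea, as a rough sketch only: an INI file lists a local port per database section, and a helper turns it into the argument list for a single ssh call. The config layout, the `example-user` name, the login host and the remote host name pattern are all assumptions for illustration and should be checked against the new connection scheme.

```python
import configparser
from typing import List

# Hypothetical config layout: SSH details plus one local port per database section.
EXAMPLE_CONFIG = """
[ssh]
login_host = login.toolforge.org
user = example-user

[ports]
s1 = 4711
s2 = 4712
"""

# Assumed host name pattern, not taken from this task; verify before use.
REMOTE_HOST_TEMPLATE = "{section}.analytics.db.svc.wikimedia.cloud"
REMOTE_PORT = 3306  # default MySQL/MariaDB port


def build_ssh_command(config_text: str) -> List[str]:
    """Build an `ssh -N -L ...` argument list from the config text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    cmd = ["ssh", "-N"]  # -N: forward ports only, run no remote command
    for section, local_port in parser["ports"].items():
        remote_host = REMOTE_HOST_TEMPLATE.format(section=section)
        cmd += ["-L", f"{int(local_port)}:{remote_host}:{REMOTE_PORT}"]

    cmd.append(f"{parser['ssh']['user']}@{parser['ssh']['login_host']}")
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_ssh_command(EXAMPLE_CONFIG)))
```

The resulting list could be passed to `subprocess.Popen` to keep the tunnels open while the scripts run, which would replace typing all the `-L` arguments by hand.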

Event Timeline

LostEnchanter created this task.
LostEnchanter updated the task description.

The update is live! See T272523 for the new connection scheme.

To find out which databases are stored on which cluster, there are a few options:

  1. Connect to meta_p and read the slice column of the wiki table
  2. Connect to each cluster and run SHOW DATABASES
  3. Get the cluster assignments from https://noc.wikimedia.org/conf/, where they are stored in dblists/s<number>.dblist
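
For option 3, a minimal sketch of how the dblist files could be turned into a database-to-cluster mapping. It assumes each dblist file holds one database name per line, possibly with `#` comments, and that the files are reachable at the dblists/s<number>.dblist path under the noc URL above; the function names are made up for this example.

```python
import urllib.request
from typing import Dict, Iterable, List

# URL pattern built from the noc location mentioned above.
DBLIST_URL = "https://noc.wikimedia.org/conf/dblists/{section}.dblist"


def parse_dblist(text: str) -> List[str]:
    """Return the database names found in the text of one dblist file."""
    names = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line)
    return names


def build_shard_map(dblists: Dict[str, str]) -> Dict[str, str]:
    """Map each database name to its section, given {section: file text}."""
    mapping = {}
    for section, text in dblists.items():
        for dbname in parse_dblist(text):
            mapping[dbname] = section
    return mapping


def fetch_dblists(sections: Iterable[str]) -> Dict[str, str]:
    """Download the dblist text for each section name (needs network access)."""
    result = {}
    for section in sections:
        with urllib.request.urlopen(DBLIST_URL.format(section=section)) as resp:
            result[section] = resp.read().decode("utf-8")
    return result
```

Calling `build_shard_map(fetch_dblists(["s1", "s2"]))` would then give a dictionary that can be refreshed whenever shards move and stored wherever the mapping ends up living.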

As we are querying meta either way, the easy solution is to store the cluster mapping in Sources, but it might be better to submit a patch to the Toolforge Python library... or at least talk to its developer about this update.

@LostEnchanter That's great! I believe we connect to the shards locally, and our scripts match each database to the appropriate shard and connect to it? I am not sure what we would ask the toolforge library devs, since when we connect locally we use pymysql anyway. I see they have made some changes related to this recently; those may be worth a look.
Storing it in the Sources table is brilliant if we want to do it ourselves!

Update status:

Port forwarding from Python is far less trivial than I expected, partly because of my lack of experience with pure ssh, and partly because I could find few examples. If anyone can get port forwarding to the databases working, for example using Fabric or sshtunnel, and could share their code with me, I would be grateful: searching for the errors I get when trying to forward ports returns nothing.

For now, work on this feature is postponed, as it has lower priority than the web service.
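
For whoever picks this up, a rough, untested-against-production sketch of forwarding several ports with sshtunnel's `SSHTunnelForwarder`. The remote host name pattern, the login host, the port numbers and the function names are assumptions for illustration, not values confirmed in this task.

```python
from typing import Dict, List, Tuple

# Assumed values, not taken from this task; adjust to the new connection scheme.
REMOTE_HOST_TEMPLATE = "{section}.analytics.db.svc.wikimedia.cloud"
REMOTE_PORT = 3306  # default MySQL/MariaDB port

Address = Tuple[str, int]


def tunnel_binds(ports: Dict[str, int]) -> Tuple[List[Address], List[Address]]:
    """Return (remote_binds, local_binds) in matching order for {section: local_port}."""
    remote = [(REMOTE_HOST_TEMPLATE.format(section=s), REMOTE_PORT) for s in ports]
    local = [("127.0.0.1", port) for port in ports.values()]
    return remote, local


def open_tunnels(ports: Dict[str, int], ssh_user: str, ssh_pkey: str,
                 login_host: str = "login.toolforge.org"):
    """Start one SSH connection that forwards every configured port."""
    # Imported lazily so the helper above works without the dependency.
    from sshtunnel import SSHTunnelForwarder  # third-party: pip install sshtunnel

    remote, local = tunnel_binds(ports)
    server = SSHTunnelForwarder(
        (login_host, 22),
        ssh_username=ssh_user,
        ssh_pkey=ssh_pkey,              # path to the private key file
        remote_bind_addresses=remote,   # where each tunnel ends on the far side
        local_bind_addresses=local,     # where each tunnel listens locally
    )
    server.start()
    return server  # the caller is responsible for calling server.stop()
```

With the tunnels started, pymysql could connect to `127.0.0.1` on the local port chosen for the cluster that holds the wanted database; calling `stop()` on the returned object closes all forwards.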

LostEnchanter lowered the priority of this task from High to Low. Feb 15 2021, 1:46 PM
LostEnchanter raised the priority of this task from Low to Medium.

@LostEnchanter: Hi! This task has been assigned to you a while ago. Could you maybe share an update? Do you still plan to work on this task, or do you need any help?

If this task has been resolved in the meantime: Please update the task status (via Add Action...Change Status in the dropdown menu).
If this task is not resolved and only if you do not plan to work on this task anymore: Please consider removing yourself as assignee (via Add Action...Assign / Claim in the dropdown menu): That would allow others to work on this (in theory), as others won't think that someone is already working on this. Thanks! :)