Page MenuHomePhabricator

Ensure all ToolsDB databases comply with current naming conventions
Open, MediumPublic

Description

In T269513: Leaked enwp10 replica.my.cnf/WP 1.0 bot botpassword credentials to Github we discovered that the enwp10 has a legacy database on the ToolsDB servers that only has a single underscore rather than the expected double underscore between the db username and the dbname suffix.

There are other dbs with legacy naming, and they should be fixed.


Below you can find a list of the databases and db user names, as well as the owners/tool&maintainers.

We will reach out by email and send warnings until March 31, 2021, at which point the DBs will be dropped.

After the list you can find instructions on how to migrate your tables to a DB with a valid name.

If you have any concerns, problems, or special needs please reach out.


List of DBs with their owners/maintainers

Legend:

  • Should delete: (Y/N) based on owners' answers
  • Fixed/deleted: (Fixed/Deleted) based on actions taken on the DB
Should deleteFixed/DeletedDB userDB nameUserMaintainers
s51067s51067_cerabotcerabotCeradon
Ys51081s51081_limesrender-testsDaniel Kinzler, Jkroll, Kai Nissen (WMDE)
Ys51092s51092_countiluvatarbot1997kB, Iluvatar
Ys51092s51092_rsourcesiluvatarbot1997kB, Iluvatar
Ys51092s51092_raportsiluvatarbot1997kB, Iluvatar
Ys51100s51100_mass_translate_pdexbotLadsgroup
Ys51114s51114_enwp10enwp10Audiodude, Kelson
Deleteds51154s51154_hkmphotosfiwiki-toolsZache-tool
Deleteds51174s51174_maintuscMagnus Manske
Deleteds51187s51187_xtoolsxtoolsCyberpower678, Matthewrbowker, MusikAnimal, Samwilson
Deleteds51201s51201_wikiyifeibotEatcha, Multichill, Steinsplitter, Zhuyifei1999
Deleteds51223s51223_dblistsAlessio, Candalua, Fale, Incola, Mess
Ys51270s51270_wikidatastatsaudetoolsAude
Ys51375s51375_wikigabrielchihonglee-botGabrielchihonglee, Zhuyifei1999
s51381s51381_cropbotcropbotLuxo
Ys51441s51441_krdbotkrdbotKrd
Deleteds51541s51541_stewardbotstewardbotsBryanDavis, DerHexer, HakanIST, Hoo man, Jyothis, Linedwell, Matanya, Mat, Melos, Revi, Rxy, Shanmugamp7, Urbanecm
Deleteds51541s51541_sulwatcherstewardbotsBryanDavis, DerHexer, HakanIST, Hoo man, Jyothis, Linedwell, Matanya, Mat, Melos, Revi, Rxy, Shanmugamp7, Urbanecm
Ys51780s51780_supercountsupercountCyberpower678, Ladsgroup
NFixeds51835s51835_usagestatsawbMagioladitis, Reedy
NFixeds51835s51835_typoscanawbMagioladitis, Reedy
s51884s51884_wikibotciting-botKaribekov Vladislav Y.
Fixeds51892s51892_toolserverdb_ptoolserverdbJkroll, Nosy, Tobias Gritschacher
Fixeds51892s51892_toolserverdbtoolserverdbJkroll, Nosy, Tobias Gritschacher
Fixeds51922s51922_wlewleBase, RLuts
s51964s51964_denkmallistedenkmallisteAleXXw, Braveheart, Jean-Frédéric, Ruben Demus, Thomas Ledl
Ys52082s52082_limeslimesmapJkroll
Ys52452s52452_wikijourneywikijourneyPyguerder
Fixeds52462s52462_pswikidata-primary-sourcesAddshore, Denny Vrandecic, Hjfocs, SebastianSchaffert, Tpt
Should deleteFixed/DeletedDB userDB nameUser
u10570u10570_gapfindertoolsBmansurov
Deletedu1092u1092_epsg_pMerlijn van Deen
Deletedu1092u1092_testMerlijn van Deen
u2138u2138_wikiCoren
u2264u2264_cx_dbRyuch
u2264u2264_contenttranslationRyuch
Yu4529u4529_srwikiDungodung

How do I rename my database?

Approach 1: rename your tables into a new DB

  1. SSH to toolforge, become tool if necessary
  2. Make a backup of the DB in case something goes wrong
  3. cat replica.my.cnf to look at your user id
  4. sql tools to connect to ToolsDB
  5. CREATE DATABASE MYUSERORTOOLID__MYDBNAME; make your new DB with the new name
  6. USE MYUSERORTOOLID_MYOLDDBNAME; select the old DB
  7. SHOW TABLES; to see the tables and data you have in the old DB
  8. RENAME TABLE MYUSERORTOOLID_MYOLDDBNAME.table_name to MYUSERORTOOLID__MYDBNAME.table_name; rename each table form the old DB to the new one
  9. Finally feel free to use the new DB and poke around, all your data should be there.
  10. Update your tool / code to use the right DB name
  11. Once everything is working, download or remove the backup if you put it in Toolforge

Aproach n : ___

If you know of other safe correct ways of doing this please edit here.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I'd do something like: try to contact maintainers giving a deadline (2months?) -> delete non answered and agreed to delete dbs, the rest move to good named ones, unless it messes up with something else

Jhernandez renamed this task from How many ToolsDB databases violate current naming conventions? to Ensure all ToolsDB databases comply with current naming conventions.Jan 21 2021, 4:03 PM
Jhernandez updated the task description. (Show Details)

s51114_enwp10 is definitely still in use. I don't know how to rename it safely, so would need help with that.

Reasoning why the convention is suddenly enforced are missing. Instructions on how to fix are also missing.

I see some developers in the list who are a bit less active. Deleting their data while they are away doesn't sound like a very nice thing to do. I would suggest to just rename their database or mysqldump + gz it to their homedir. That way the tool might be broken, but the data is not lost.

Mentioned in SAL (#wikimedia-cloud) [2021-01-21T18:14:08Z] <Urbanecm> Stopping StewardBot and SULWatcher for maintenance (T269609)

Mentioned in SAL (#wikimedia-cloud) [2021-01-21T18:20:29Z] <Urbanecm> Start StewardBot and SULWatcher again, maintenance over (T269609)

@Audiodude and me are the maintainers. I got an email to inform me about this ticket.

As far as I understand you want to get rid of the DB s51114_enwp10, either by renaming it to s51114__enwp10 or by deleting it.

This database being actively used by hundreds of people on a daily base, I would really appreciate if this would not be deleted ;)

Therefore, we need to organize the renaming. What would be the best approach? Would you be able for example to allow our MySQL user to do it?

s51114_enwp10 is definitely still in use. I don't know how to rename it safely, so would need help with that.

! In T269609#6766246, @Multichill wrote:

[...] Instructions on how to fix are also missing.

I'm going to be adding these instructions to the description, here is a way to move your data from one DB to another:

  1. SSH to toolforge, become tool if necessary
  2. Make a backup of the DB in case something goes wrong
  3. cat replica.my.cnf to look at your user id
  4. sql tools to connect to ToolsDB
  5. CREATE TABLE MYUSERORTOOLID__MYDBNAME; make your new DB with the new name
  6. USE MYUSERORTOOLID_MYOLDDBNAME; select the old DB
  7. SHOW TABLES; to see the tables and data you have in the old DB
  8. RENAME TABLE MYUSERORTOOLID_MYOLDDBNAME.table_name to MYUSERORTOOLID__MYDBNAME.table_name; rename each table form the old DB to the new one
  9. Finally feel free to use the new DB and poke around, all your data should be there.
  10. Update your tool / code to use the right DB name
  11. Once everything is working, download or remove the backup if you put it in Toolforge

There is probably other ways too, this is just one I know about. Let me know if you need more help

Reasoning why the convention is suddenly enforced are missing. Instructions on how to fix are also missing.

The reasoning is just to keep things consistent and maintainable in the long term.

I see some developers in the list who are a bit less active. Deleting their data while they are away doesn't sound like a very nice thing to do.

Of course not, we are aware and we aren't going to delete any data any time soon. There are 2.5 months until the deadline, I've reached out by email and will do so a few more times before anything else happens. My hope is that we can get everything cleaned up and not have to take any action without a user being aware.

I'm also here to help you with this, please check the instructions, and make a backup of the DB before doing anything. And let me know if you have any issues. There is a lot of time.

I would suggest to just rename their database or mysqldump + gz it to their homedir. That way the tool might be broken, but the data is not lost.

Definitely, we don't want to lose any user data as much as we can. We also want people to maintain their software deployed in the platform and clean unused resources that can be useful to others for active use. Let's see how the process goes and if we can clean up the list.

@Audiodude and me are the maintainers. I got an email to inform me about this ticket.

As far as I understand you want to get rid of the DB s51114_enwp10, either by renaming it to s51114__enwp10 or by deleting it.

That's right! Moving the data to s51114__enwp10 is the way to go in your case then.

This database being actively used by hundreds of people on a daily base, I would really appreciate if this would not be deleted ;)

No worries, it won't. The deadline is 2.5 months away, and I'm sure we are going to figure it out a lot sooner.

Therefore, we need to organize the renaming. What would be the best approach? Would you be able for example to allow our MySQL user to do it?

I've posted an approach that works in the description, but I'm poking to see if there is an easier way to do it. I'll let you know with what I find out.

@Audiodude @Kelson Your DB seems to have a considerable size and 25 tables, so while you move the tables and update the code there will pass some time. I'd recommend you warn your users that there will be maintenance and the tools using the data won't be available for some time.

Hi, database s52452_wikijourney can be deleted. Regards

Hi, please delete u4529_srwiki, it's empty and useless.

Hey, please delete the dexbot one. For supercount, I'm maintainer as part of my duties as tools standard committee (context: T182341#4419472) Cyberpower should decide for that tool.

Database s51441_krdbot can be immediately deleted.

Jhernandez updated the task description. (Show Details)
Jhernandez updated the task description. (Show Details)

I've marked s52452_wikijourney, u4529_srwiki, s51100_mass_translate_p (dexbot) and s51441_krdbot for future deletion. Thanks for chiming in.

Hi, I have deleted the legacy database s51223_db.

Rerunning the above query:

MariaDB [information_schema]> SELECT SUBSTRING_INDEX(schema_name, '_', 1) AS dbuser, schema_name AS dbname FROM schemata WHERE schema_name REGEXP '^[su][0-9]+_[^_].+$' ORDER BY dbuser;
+--------+--------------------------+
| dbuser | dbname                   |
+--------+--------------------------+
| s51067 | s51067_cerabot           |
| s51081 | s51081_limes             |
| s51092 | s51092_count             |
| s51092 | s51092_rsources          |
| s51092 | s51092_raports           |
| s51100 | s51100_mass_translate_p  |
| s51114 | s51114_enwp10            |
| s51201 | s51201_wiki              |
| s51270 | s51270_wikidatastats     |
| s51375 | s51375_wiki              |
| s51381 | s51381_cropbot           |
| s51441 | s51441_krdbot            |
| s51780 | s51780_supercount        |
| s51884 | s51884_wikibot           |
| s51892 | s51892_toolserverdb      |
| s51892 | s51892_toolserverdb_p    |
| s51922 | s51922_wle               |
| s51964 | s51964_denkmalliste      |
| s52082 | s52082_limes             |
| s52452 | s52452_wikijourney       |
| s52462 | s52462_ps                |
| u10570 | u10570_gapfindertools    |
| u2138  | u2138_wiki               |
| u2264  | u2264_cx_db              |
| u2264  | u2264_contenttranslation |
| u4529  | u4529_srwiki             |
+--------+--------------------------+
26 rows in set (0.01 sec)

Shouldn't step 5 be CREATE DATABASE not CREATE TABLE?

Okay I've migrated s51114_enwp10 to s51114__enwp10. There are a few tables left in the old database, but they are leftovers from the tool migration process and can be deleted.

Indeed, thanks @Audiodude for catching the mistake and for migrating the DB.

I've fixed the instructions in the description and marked s51114_enwp10 for future deletion in the above table.

updated query:

MariaDB [information_schema]> SELECT SUBSTRING_INDEX(schema_name, '_', 1) AS dbuser, schema_name AS dbname FROM schemata WHERE schema_name REGEXP '^[su][0-9]+_[^_].+$' ORDER BY dbuser;
+--------+--------------------------+
| dbuser | dbname                   |
+--------+--------------------------+
| s51067 | s51067_cerabot           |
| s51081 | s51081_limes             |
| s51092 | s51092_count             |
| s51092 | s51092_rsources          |
| s51092 | s51092_raports           |
| s51100 | s51100_mass_translate_p  |
| s51114 | s51114_enwp10            |
| s51201 | s51201_wiki              |
| s51270 | s51270_wikidatastats     |
| s51375 | s51375_wiki              |
| s51381 | s51381_cropbot           |
| s51441 | s51441_krdbot            |
| s51780 | s51780_supercount        |
| s51884 | s51884_wikibot           |
| s51892 | s51892_toolserverdb      |
| s51892 | s51892_toolserverdb_p    |
| s51922 | s51922_wle               |
| s51964 | s51964_denkmalliste      |
| s52082 | s52082_limes             |
| s52452 | s52452_wikijourney       |
| s52462 | s52462_ps                |
| u10570 | u10570_gapfindertools    |
| u2138  | u2138_wiki               |
| u2264  | u2264_contenttranslation |
| u2264  | u2264_cx_db              |
| u4529  | u4529_srwiki             |
+--------+--------------------------+
26 rows in set (0.01 sec)

I've sent another batch of email reminders for DBs not marked on the description tables still showing up in the updated mysql query.

Re s51081_limes: just delete it, I'd be surprised if anyone has done anything with it in the past 8 years. Probably the same for s52082_limes.

I've marked s51081_limes and s51270_wikidatastats (email response) to be deleted.

I didn't s52082_limes since JKroll is the maintainer and daniel isn't there.

@daniel do you know if @jkroll is active? Or who I could ping to ask? His phabricator account is disabled so if he isn't around the DB is orphan.

Marked s51780_supercount to be deleted (email response from Ladsgroup).

@Jhernandez I renamed s51892_toolserverdb and s51892_toolserverdb_p.

Thanks for the reminders!

I didn't s52082_limes since JKroll is the maintainer and daniel isn't there.

@daniel do you know if @jkroll is active? Or who I could ping to ask? His phabricator account is disabled so if he isn't around the DB is orphan.

@jkroll isn't active anymore. I think the limes tool can be sunsetted, it's defunct for quite some while.

Thanks @Tobi_WMDE_SW, I've marked s52082_limes to be deleted (given Daniel and your comment), and s51892_toolserverdb and s51892_toolserverdb_p as fixed.

MariaDB [(none)]> use information_schema;
MariaDB [information_schema]> SELECT SUBSTRING_INDEX(schema_name, '_', 1) AS dbuser, schema_name AS dbname FROM schemata WHERE schema_name REGEXP '^[su][0-9]+_[^_].+$' ORDER BY dbuser;
+--------+--------------------------+
| dbuser | dbname                   |
+--------+--------------------------+
| s51067 | s51067_cerabot           |
| s51081 | s51081_limes             |
| s51092 | s51092_count             |
| s51092 | s51092_rsources          |
| s51092 | s51092_raports           |
| s51100 | s51100_mass_translate_p  |
| s51114 | s51114_enwp10            |
| s51270 | s51270_wikidatastats     |
| s51375 | s51375_wiki              |
| s51381 | s51381_cropbot           |
| s51441 | s51441_krdbot            |
| s51780 | s51780_supercount        |
| s51884 | s51884_wikibot           |
| s51892 | s51892_toolserverdb      |
| s51892 | s51892_toolserverdb_p    |
| s51922 | s51922_wle               |
| s51964 | s51964_denkmalliste      |
| s52082 | s52082_limes             |
| s52452 | s52452_wikijourney       |
| s52462 | s52462_ps                |
| u10570 | u10570_gapfindertools    |
| u2138  | u2138_wiki               |
| u2264  | u2264_contenttranslation |
| u2264  | u2264_cx_db              |
| u4529  | u4529_srwiki             |
+--------+--------------------------+
25 rows in set (0.01 sec)

Sent another batch of reminders, since there are 2 weeks left to the deadline.

These are the DBs with wrong names that haven't been marked to be deleted, renamed, or deleted.

| s51067 | s51067_cerabot           | cerabot                  | Ceradon                                                     |
| s51092 | s51092_count             | iluvatarbot              | 1997kB, Iluvatar                                            |
| s51092 | s51092_rsources          | iluvatarbot              | 1997kB, Iluvatar                                            |
| s51092 | s51092_raports           | iluvatarbot              | 1997kB, Iluvatar                                            |
| s51381 | s51381_cropbot           | cropbot                  | Luxo                                                        |
| s51922 | s51922_wle               | wle                      | Base, RLuts                                                 |
| s51964 | s51964_denkmalliste      | denkmalliste             | AleXXw, Braveheart, Jean-Frédéric, Ruben Demus, Thomas Ledl |
| s52462 | s52462_ps                | wikidata-primary-sources | Addshore, Denny Vrandecic, Hjfocs, SebastianSchaffert, Tpt  |
| u10570 | u10570_gapfindertools    | Bmansurov                |
| u2138  | u2138_wiki               | Coren                    |
| u2264  | u2264_cx_db              | Ryuch                    |
| u2264  | u2264_contenttranslation | Ryuch                    |

Any help contacting the maintainers would be appreciated.

Any help contacting the maintainers would be appreciated.

Ceradon, Ruben Demus, and Coren appear inactive.

@1997kB / @Iluvatar
@Base / @RLuts
@AleXXw / @Braveheart / @JeanFred / @ThomasLedl
@Addshore / @DVrandecic / @Hjfocs / @SebastianSchaffert / @Tpt
@bmansurov
@coren
@Ryuch
Action needed from you.

There are only a few days left until the end of March. I'll be sending an email reminder tomorrow again.

Another run of the query:

MariaDB [(none)]> use information_schema;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [information_schema]> SELECT SUBSTRING_INDEX(schema_name, '_', 1) AS dbuser, schema_name AS dbname FROM schemata WHERE schema_name REGEXP '^[su][0-9]+_[^_].+$' ORDER BY dbuser;
+--------+--------------------------+
| dbuser | dbname                   |
+--------+--------------------------+
| s51067 | s51067_cerabot           |
| s51081 | s51081_limes             |
| s51092 | s51092_count             |
| s51092 | s51092_rsources          |
| s51092 | s51092_raports           |
| s51100 | s51100_mass_translate_p  |
| s51114 | s51114_enwp10            |
| s51270 | s51270_wikidatastats     |
| s51375 | s51375_wiki              |
| s51381 | s51381_cropbot           |
| s51441 | s51441_krdbot            |
| s51780 | s51780_supercount        |
| s51884 | s51884_wikibot           |
| s51892 | s51892_toolserverdb_p    |
| s51892 | s51892_toolserverdb      |
| s51964 | s51964_denkmalliste      |
| s52082 | s52082_limes             |
| s52452 | s52452_wikijourney       |
| u10570 | u10570_gapfindertools    |
| u2138  | u2138_wiki               |
| u2264  | u2264_cx_db              |
| u2264  | u2264_contenttranslation |
| u4529  | u4529_srwiki             |
+--------+--------------------------+
23 rows in set (0.01 sec)

Reminders sent, next Wednesday March 31 is the deadline to act.

Remaining DBs to rename, delete or fix

DB userDB nameUserMaintainers
s51067s51067_cerabotcerabotCeradon
s51381s51381_cropbotcropbotLuxo
s51884s51884_wikibotciting-botKaribekov Vladislav Y.
s51964s51964_denkmallistedenkmallisteAleXXw, Braveheart, Jean-Frédéric, Ruben Demus, Thomas Ledl
DB userDB nameUser
u10570u10570_gapfindertoolsBmansurov
u2138u2138_wikiCoren
u2264u2264_cx_dbRyuch
u2264u2264_contenttranslationRyuch

While I’m not maintaining denkmalliste, I wanted to give a quick shot at fixing this − renaming the DB and replace the old name with the new in all source files.

However, following the steps, I am stuck at step 6 (use OLD_DB): "Access denied for user 's51964'@'%' to database 's51964_denkmalliste'

Please advise how to proceed.

Thanks for trying @JeanFred. I'm not sure what is happening but I'm going to ask and we will let you know. No action will be taken on this DB while we sort it out.

I did the following:

aborrero@clouddb1001:~$ sudo mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 67241265
Server version: 10.1.39-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> GRANT ALL PRIVILEGES ON `s51964\_denkmalliste`.* TO 's51964'@'%';
Query OK, 0 rows affected (0.01 sec)

tools.denkmalliste@tools-sgebastion-08:~$ sql toolsdb
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 67238566
Server version: 10.1.39-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use s51964_denkmalliste;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [s51964_denkmalliste]> show tables;
+-------------------------------+
| Tables_in_s51964_denkmalliste |
+-------------------------------+
| DenkmalGemeinde               |
| Denkmalvorlage                |
| monuments_at_(de)             |
| wle_at_wp                     |
| wle_at_wp_objects             |
+-------------------------------+
5 rows in set (0.00 sec)

please @JeanFred let us know if that works.

I did the following:

aborrero@clouddb1001:~$ sudo mysql -u root
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 67241265
Server version: 10.1.39-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> GRANT ALL PRIVILEGES ON `s51964\_denkmalliste`.* TO 's51964'@'%';
Query OK, 0 rows affected (0.01 sec)

tools.denkmalliste@tools-sgebastion-08:~$ sql toolsdb
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 67238566
Server version: 10.1.39-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use s51964_denkmalliste;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [s51964_denkmalliste]> show tables;
+-------------------------------+
| Tables_in_s51964_denkmalliste |
+-------------------------------+
| DenkmalGemeinde               |
| Denkmalvorlage                |
| monuments_at_(de)             |
| wle_at_wp                     |
| wle_at_wp_objects             |
+-------------------------------+
5 rows in set (0.00 sec)

please @JeanFred let us know if that works.

Thanks @aborrero , that did the trick! All databases tables were renamed and occurrences replaced in code. https://denkmalliste.toolforge.org/ does not seem to be worse off than before so I’ll call it a success :-þ

Thanks for the support!

Thanks @JeanFred for looking into this, much appreciated.

As far as I understand the situation, the credentials for the DB on the tool denkmalliste didn't have access to the DB s51964_denkmalliste so I'm not sure if the tool isn't really using that DB or it was broken in some places, or maybe something else is happening. Anyway good that it is fixed.


Soon we'll be working on this by doing the following:

  • Deleting old DBs left over empty that were fixed/moved
  • Deleting DBs marked with Y in Should delete
  • For the other ones -just a few thankfully- we will be dumping the contents to a file in the home directory, and then proceed to delete the DBs

Thank you to all the maintainers who helped clean this up, very much appreciated!