Drop CookieBlock* tables from EventLogging DB
Closed, Resolved · Public · 5 Estimated Story Points

Description

These tables were used to verify that cookie blocks work, and they contain user IPs. Logging was removed in https://gerrit.wikimedia.org/r/#/c/365414/, and because the tables contain highly private data, I'm asking that they not be backed up before being dropped.

mysql:research@analytics-store.eqiad.wmnet [log]> show tables like 'Cookie%';
+-------------------------------+
| Tables_in_log (Cookie%)       |
+-------------------------------+
| CookieBlock_16046548          |
| CookieBlock_16241436          |
| CookieBlock_16241436_15423246 |
+-------------------------------+
3 rows in set (0.00 sec)

mysql:research@analytics-store.eqiad.wmnet [log]> select count(*) from CookieBlock_16046548 union select count(*) from CookieBlock_16241436 union select count(*) from CookieBlock_16241436_15423246;
+----------+
| count(*) |
+----------+
|      310 |
| 15771712 |
|      988 |
+----------+
3 rows in set (5.66 sec)
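
Note that plain UNION removes duplicate rows and does not say which count belongs to which table, so two tables with the same row count would collapse into a single line. A minimal sketch of a labeled variant follows; the helper name `labeled_count_query` and the column aliases are assumptions made for illustration, not something used in this task.

```python
def labeled_count_query(tables):
    """Build one query returning a (table_name, row_count) row per table.

    UNION ALL keeps every row, even when two tables have the same count.
    Table names are assumed to be trusted identifiers (no escaping is done).
    """
    parts = [
        f"SELECT '{name}' AS table_name, COUNT(*) AS row_count FROM `{name}`"
        for name in tables
    ]
    return "\nUNION ALL\n".join(parts) + ";"


tables = [
    "CookieBlock_16046548",
    "CookieBlock_16241436",
    "CookieBlock_16241436_15423246",
]
print(labeled_count_query(tables))
```

The generated statement can be pasted into the same mysql prompt as the query above.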

Event Timeline

Restricted Application added a subscriber: Aklapper.
Marostegui subscribed.

Will Analytics handle this?
I would suggest renaming the tables on all the hosts and leaving them under a different name for a while before dropping them, to make sure nothing breaks.

As for the backups, I am fine with whatever you think is the right way to proceed.

Marostegui triaged this task as Medium priority. Jul 31 2017, 7:57 AM

Sure, we can handle it; renaming sounds good. As far as I understand, dropping the renamed tables after a few days seems the best option.

Yep, renaming is a quick way of checking whether something is really using the tables before it is too late. If nothing breaks in 2-3 days, you can probably go ahead and drop the renamed tables.
If something breaks while they are renamed, they just need to be renamed back to their original names.
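
The rename-then-wait approach described above can be sketched as a small helper that produces both the forward renames and the matching rollback, so the way back is written down before anything is touched. This is only an illustration: the function name `rename_statements` is hypothetical, and the `_backup` suffix is the one later used in this task.

```python
def rename_statements(tables, suffix="_backup"):
    """Return (forward, rollback) lists of MariaDB RENAME TABLE statements.

    forward  : move each table out of the way under a suffixed name
    rollback : restore the original names if something turns out to break
    Table names are assumed to be trusted identifiers (no escaping is done).
    """
    forward = [f"RENAME TABLE `{t}` TO `{t}{suffix}`;" for t in tables]
    rollback = [f"RENAME TABLE `{t}{suffix}` TO `{t}`;" for t in tables]
    return forward, rollback


forward, rollback = rename_statements([
    "CookieBlock_16046548",
    "CookieBlock_16241436",
    "CookieBlock_16241436_15423246",
])
print("\n".join(forward))
print("-- rollback, only if needed:")
print("\n".join(rollback))
```

The same statements would need to be run on every host that holds the tables.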

mforns set the point value for this task to 5.

Mentioned in SAL (#wikimedia-operations) [2017-08-01T12:04:49Z] <elukey> stop eventlogging_sync on analytics-slaves && rename all CookieBlock* tables (log db) to CookieBlock*_backup - T171883

Action executed on db1046 (m4-master):

MariaDB [(none)]> use log;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [log]> rename table CookieBlock_16046548 to CookieBlock_16046548_backup;
Query OK, 0 rows affected (0.11 sec)

MariaDB [log]> rename table CookieBlock_16241436 to CookieBlock_16241436_backup;
Query OK, 0 rows affected (0.02 sec)

MariaDB [log]> rename table CookieBlock_16241436_15423246 to CookieBlock_16241436_15423246_backup;
Query OK, 0 rows affected (0.02 sec)

MariaDB [log]> show tables like 'Cookie%';
+--------------------------------------+
| Tables_in_log (Cookie%)              |
+--------------------------------------+
| CookieBlock_16046548_backup          |
| CookieBlock_16241436_15423246_backup |
| CookieBlock_16241436_backup          |
+--------------------------------------+
3 rows in set (0.00 sec)

MariaDB [log]> select count(*) from CookieBlock_16046548_backup union select count(*) from CookieBlock_16241436_backup union select count(*) from CookieBlock_16241436_15423246_backup;
+----------+
| count(*) |
+----------+
|      310 |
| 15771712 |
|      988 |
+----------+

The same was done on db047 and dbstore1002. If nothing explodes during the next two days, I'll drop them :)
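
For the final step, a sketch of how the drop statements could be generated with a guard that refuses anything not carrying the `_backup` suffix, so a table still under its original name cannot be dropped by accident. The helper `drop_statements` is hypothetical; the exact statements run for this task are not recorded here.

```python
def drop_statements(tables, suffix="_backup"):
    """Return MariaDB DROP TABLE statements for renamed backup tables only.

    Raises ValueError if any name lacks the expected suffix.
    Table names are assumed to be trusted identifiers (no escaping is done).
    """
    for t in tables:
        if not t.endswith(suffix):
            raise ValueError(f"refusing to drop {t!r}: missing {suffix!r} suffix")
    return [f"DROP TABLE IF EXISTS `{t}`;" for t in tables]


print("\n".join(drop_statements([
    "CookieBlock_16046548_backup",
    "CookieBlock_16241436_backup",
    "CookieBlock_16241436_15423246_backup",
])))
```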

Mentioned in SAL (#wikimedia-operations) [2017-08-03T13:28:43Z] <elukey> drop CookieBlock backup tables for T171883

elukey claimed this task.