
Jul 22 2019

Marostegui updated the task description for T226851: Drop abuse_filter_log.afl_log_id in production.
Jul 22 2019, 7:18 AM · AbuseFilter, DBA
Marostegui updated the task description for T196055: Remove table `math` from the database.
Jul 22 2019, 7:17 AM · Patch-For-Review, DBA, Math
Marostegui updated the task description for T196055: Remove table `math` from the database.
Jul 22 2019, 7:17 AM · Patch-For-Review, DBA, Math
Marostegui added a comment to T202367: Productionize dbproxy101[2-7].eqiad.wmnet and dbproxy200[1-4].

I have provisioned dbproxy2001 into m1 codfw - with notifications disabled as it is not an active proxy (or even service)

Jul 22 2019, 7:05 AM · Patch-For-Review, DBA
Marostegui updated the task description for T202367: Productionize dbproxy101[2-7].eqiad.wmnet and dbproxy200[1-4].
Jul 22 2019, 7:05 AM · Patch-For-Review, DBA
Marostegui added a comment to T196055: Remove table `math` from the database.

So, for the table drop, this is what I will do:
Drop the table from an enwiki replica in codfw (passive DC) to make sure there are no writes (if there are, replication will break and we'll know).
I will leave it for a few days, and if nothing breaks, I will rename the table on an eqiad (active DC) enwiki replica and monitor the error log to make sure nothing reads from it.
If that also causes no issues, I will go ahead and start dropping it everywhere.

Jul 22 2019, 5:23 AM · Patch-For-Review, DBA, Math
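
The staged drop described in the comment above can be written down as an ordered plan. This is only a minimal sketch in Python, not the tooling actually used: the `Step` type, the `staged_drop_plan` function, and the `_to_drop` parking name for the renamed table are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    target: str  # which host(s) the statement is run on
    sql: str     # statement to run there
    watch: str   # what to monitor before moving to the next step


def staged_drop_plan(table: str, wiki: str = "enwiki") -> List[Step]:
    """Build the three-stage plan: drop on passive DC, rename on active DC, drop everywhere."""
    parked = f"{table}_to_drop"  # hypothetical name for the renamed table
    return [
        Step(
            target=f"one {wiki} replica in codfw (passive DC)",
            sql=f"DROP TABLE `{table}`;",
            watch="replication breaks if anything still writes to the table",
        ),
        Step(
            target=f"one {wiki} replica in eqiad (active DC)",
            sql=f"RENAME TABLE `{table}` TO `{parked}`;",
            watch="errors appear in the error log if anything still reads from the table",
        ),
        Step(
            target="all remaining hosts",
            sql=f"DROP TABLE IF EXISTS `{table}`, `{parked}`;",
            watch="only after the two earlier stages stayed quiet for a few days",
        ),
    ]


for number, step in enumerate(staged_drop_plan("math"), start=1):
    print(f"{number}. [{step.target}] {step.sql}  (watch: {step.watch})")
```

Renaming rather than dropping on the active DC keeps the second stage cheap to undo: if something turns out to read from the table, renaming it back restores the previous state.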
Marostegui moved T228613: Re-build db2097 s1 and s6 from Triage to Backlog on the DBA board.
Jul 22 2019, 5:21 AM · DBA
Marostegui renamed T228613: Re-build db2097 s1 and s6 from Re-build db2097 s1 and s6 to Re-build db2097 s1 and s6 with Debian Buster and 10.3.
Jul 22 2019, 5:21 AM · DBA
Marostegui created T228613: Re-build db2097 s1 and s6.
Jul 22 2019, 5:20 AM · DBA
Marostegui closed T225378: db2097 (codfw s1&s6 source backups) mariadb@s6 *process* (10.1.39) crashed on 2019-06-08, a subtask of T206203: Implement database binary backups into the production infrastructure, as Resolved.
Jul 22 2019, 5:04 AM · Goal, DBA
Marostegui closed T225378: db2097 (codfw s1&s6 source backups) mariadb@s6 *process* (10.1.39) crashed on 2019-06-08 as Resolved.

As spoken, I am going to close this as the scope of the ticket is done.
I will create a new one to re-image this host with buster+10.3 and rebuild its data

Jul 22 2019, 5:04 AM · SRE, DBA

Jul 19 2019

Marostegui removed a watcher for Schema-change: Marostegui.
Jul 19 2019, 2:01 PM
Marostegui updated the task description for T222978: Compress and defragment tables on labsdb hosts.
Jul 19 2019, 1:15 PM · Data-Services, DBA
Marostegui removed a project from T147148: Wikipedia requires a patch to load its data from the dumps with mwdumper: DBA.
Jul 19 2019, 8:40 AM · Dumps-Generation, Utilities-mwdumper
Marostegui removed a project from T228360: Narrow scope of MediaWiki-Database workboard: DBA.
Jul 19 2019, 5:01 AM · Performance-Team (Radar), Project-Admins, Platform Engineering
Marostegui changed the status of T60674: Drop page.page_restrictions column from Wikimedia wikis, a subtask of T51188: [DO NOT USE] Schema changes for Wikimedia wikis (tracking) [superseded by #Blocked-on-schema-change], from Stalled to Open.
Jul 19 2019, 5:01 AM · DBA, Tracking-Neverending, Schema-change
Marostegui changed the status of T60674: Drop page.page_restrictions column from Wikimedia wikis from Stalled to Open.
Jul 19 2019, 5:01 AM · DBA, Schema-change-in-production
Marostegui moved T60674: Drop page.page_restrictions column from Wikimedia wikis from Triage to Backlog on the DBA board.
Jul 19 2019, 5:01 AM · DBA, Schema-change-in-production
Marostegui added a subtask for T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking): T51195: Drop filejournal table from WMF.
Jul 19 2019, 4:58 AM · Epic, DBA, Tracking-Neverending
Marostegui added a parent task for T51195: Drop filejournal table from WMF: T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking).
Jul 19 2019, 4:58 AM · MW-1.34-notes (1.34.0-wmf.20; 2019-08-27), DBA, Performance-Team (Radar), MediaWiki-File-management

Jul 18 2019

Marostegui added a project to T227166: decommission db1069: ops-eqiad.
Jul 18 2019, 3:05 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
Marostegui added a project to T227560: decommission db1065: ops-eqiad.
Jul 18 2019, 3:05 PM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui added a project to T228281: decommission db2045.codfw.wmnet: ops-codfw.
Jul 18 2019, 3:04 PM · SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui added a comment to T228360: Narrow scope of MediaWiki-Database workboard.

So, from my side, this is the way I read those tags:

Jul 18 2019, 2:13 PM · Performance-Team (Radar), Project-Admins, Platform Engineering
Marostegui updated the task description for T226851: Drop abuse_filter_log.afl_log_id in production.
Jul 18 2019, 10:30 AM · AbuseFilter, DBA
Marostegui added a comment to T226851: Drop abuse_filter_log.afl_log_id in production.

I have altered db2116 for now, to make sure nothing writes to that column (if it does, it will break replication there, but won't impact the users). I will leave it for a few days before altering a slave in eqiad (the active DC), where we can monitor for another few days whether anything reads from it.

root@db2116.codfw.wmnet[enwiki]> show create table abuse_filter_log\G
*************************** 1. row ***************************
       Table: abuse_filter_log
Create Table: CREATE TABLE `abuse_filter_log` (
  `afl_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `afl_filter` varbinary(64) NOT NULL DEFAULT '',
  `afl_user` bigint(20) unsigned NOT NULL DEFAULT '0',
  `afl_user_text` varbinary(255) NOT NULL DEFAULT '',
  `afl_ip` varbinary(255) NOT NULL DEFAULT '',
  `afl_action` varbinary(255) NOT NULL DEFAULT '',
  `afl_actions` varbinary(255) NOT NULL DEFAULT '',
  `afl_var_dump` blob NOT NULL,
  `afl_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `afl_namespace` int(11) NOT NULL,
  `afl_title` varbinary(255) NOT NULL DEFAULT '',
  `afl_wiki` varbinary(64) DEFAULT NULL,
  `afl_deleted` tinyint(1) NOT NULL DEFAULT '0',
  `afl_patrolled_by` int(10) unsigned NOT NULL DEFAULT '0',
  `afl_rev_id` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`afl_id`),
  KEY `afl_timestamp` (`afl_timestamp`),
  KEY `afl_rev_id` (`afl_rev_id`),
  KEY `user_timestamp` (`afl_user`,`afl_user_text`,`afl_timestamp`),
  KEY `filter_timestamp` (`afl_filter`,`afl_timestamp`),
  KEY `page_timestamp` (`afl_namespace`,`afl_title`,`afl_timestamp`),
  KEY `ip_timestamp` (`afl_ip`,`afl_timestamp`),
  KEY `wiki_timestamp` (`afl_wiki`,`afl_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=24431867 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
1 row in set (0.04 sec)
Jul 18 2019, 10:30 AM · AbuseFilter, DBA
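
To confirm from a `SHOW CREATE TABLE` paste like the one above that a column is really gone, the column names can be extracted and checked. The following is a rough sketch, assuming the statement text has been captured as a string; the `column_names` helper is hypothetical, and the sample is a shortened copy of the output above.

```python
import re
from typing import List


def column_names(create_table_sql: str) -> List[str]:
    """Return the column names found in the body of a CREATE TABLE statement."""
    names = []
    for line in create_table_sql.splitlines():
        # Column definitions begin with a backquoted identifier. Index lines
        # begin with PRIMARY KEY / KEY, so this pattern does not match them.
        match = re.match(r"\s*`([^`]+)`\s+\S", line)
        if match:
            names.append(match.group(1))
    return names


# Shortened copy of the SHOW CREATE TABLE output above.
sample = """CREATE TABLE `abuse_filter_log` (
  `afl_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `afl_rev_id` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`afl_id`),
  KEY `afl_rev_id` (`afl_rev_id`)
) ENGINE=InnoDB"""

columns = column_names(sample)
print("afl_log_id present:", "afl_log_id" in columns)
```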
Marostegui updated the task description for T226851: Drop abuse_filter_log.afl_log_id in production.
Jul 18 2019, 5:48 AM · AbuseFilter, DBA
Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Jul 18 2019, 5:42 AM · SRE, DBA
Marostegui updated the task description for T228258: Decommission db2043-db2070.
Jul 18 2019, 5:25 AM · SRE, DBA
Marostegui reassigned T228281: decommission db2045.codfw.wmnet from Marostegui to RobH.

This host is ready for DC-Ops to start their decommissioning steps

Jul 18 2019, 5:25 AM · SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui updated the task description for T228281: decommission db2045.codfw.wmnet.
Jul 18 2019, 5:24 AM · SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui closed T225131: (OoW) Degraded RAID on es2003, a subtask of T222592: Decommission es2001, es2002, es2003, es2004, as Resolved.
Jul 18 2019, 5:02 AM · DC-Ops, ops-codfw, SRE, decommission-hardware
Marostegui closed T225131: (OoW) Degraded RAID on es2003 as Resolved.

All good - thanks!

root@es2003:/usr/local/lib/nagios/plugins# megacli -LDPDInfo -aAll
Jul 18 2019, 5:02 AM · SRE, ops-codfw
Marostegui closed T227829: Degraded RAID on db2044 as Resolved.

All good - thanks @Papaul!

root@db2044:~# hpssacli controller all show config
Jul 18 2019, 5:01 AM · SRE, ops-codfw

Jul 17 2019

Marostegui added a comment to T227829: Degraded RAID on db2044.

Thanks - I can see it rebuilding:

physicaldrive 1I:1:2 (port 1I:box 1:bay 2, SAS, 600 GB, Rebuilding)
Jul 17 2019, 2:59 PM · SRE, ops-codfw
Marostegui moved T228281: decommission db2045.codfw.wmnet from Triage to In progress on the DBA board.
Jul 17 2019, 2:01 PM · SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui updated the task description for T228258: Decommission db2043-db2070.
Jul 17 2019, 1:58 PM · SRE, DBA
Marostegui updated the task description for T228258: Decommission db2043-db2070.
Jul 17 2019, 1:57 PM · SRE, DBA
Marostegui added a parent task for T228281: decommission db2045.codfw.wmnet: T228258: Decommission db2043-db2070.
Jul 17 2019, 1:57 PM · SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui added a subtask for T228258: Decommission db2043-db2070: T228281: decommission db2045.codfw.wmnet.
Jul 17 2019, 1:57 PM · SRE, DBA
Marostegui closed T227862: (OoW) db2045 failed battery, a subtask of T228258: Decommission db2043-db2070, as Declined.
Jul 17 2019, 1:57 PM · SRE, DBA
Marostegui closed T227862: (OoW) db2045 failed battery as Declined.

Going to close this ticket as I have created the decommission one: T228281: decommission db2045.codfw.wmnet

Jul 17 2019, 1:57 PM · ops-codfw, SRE, DBA
Marostegui created T228281: decommission db2045.codfw.wmnet.
Jul 17 2019, 1:56 PM · SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui added a comment to T227862: (OoW) db2045 failed battery.

No point in spending time on this old host; I will start its decommissioning process.

Jul 17 2019, 9:22 AM · ops-codfw, SRE, DBA
Marostegui added a parent task for T228258: Decommission db2043-db2070: T208323: Predictive failures on disk S.M.A.R.T. status.
Jul 17 2019, 9:17 AM · SRE, DBA
Marostegui added a subtask for T208323: Predictive failures on disk S.M.A.R.T. status: T228258: Decommission db2043-db2070.
Jul 17 2019, 9:17 AM · SRE, DBA
Marostegui added a parent task for T227862: (OoW) db2045 failed battery: T228258: Decommission db2043-db2070.
Jul 17 2019, 9:16 AM · ops-codfw, SRE, DBA
Marostegui added a subtask for T228258: Decommission db2043-db2070: T227862: (OoW) db2045 failed battery.
Jul 17 2019, 9:16 AM · SRE, DBA
Marostegui triaged T228258: Decommission db2043-db2070 as Medium priority.
Jul 17 2019, 9:16 AM · SRE, DBA
Marostegui created T228258: Decommission db2043-db2070.
Jul 17 2019, 9:16 AM · SRE, DBA
Marostegui moved T228243: Switchover m3 (phabricator) master db1072 to db1128 from Triage to In progress on the DBA board.
Jul 17 2019, 8:39 AM · User-notice-archive, Phabricator, SRE, DBA
Marostegui added a comment to T228243: Switchover m3 (phabricator) master db1072 to db1128.

Window reserved on the deployments page: https://wikitech.wikimedia.org/w/index.php?title=Deployments&type=revision&diff=1832674&oldid=1832612
Email sent to ops and to wikitech-l: https://lists.wikimedia.org/pipermail/wikitech-l/2019-July/092308.html

Jul 17 2019, 8:39 AM · User-notice-archive, Phabricator, SRE, DBA
Marostegui added a comment to T227829: Degraded RAID on db2044.

We should have a bunch of disks from the decommissioned hosts, no?

Jul 17 2019, 8:38 AM · SRE, ops-codfw
Marostegui added a comment to T227829: Degraded RAID on db2044.

Let's replace it with a USED one for now; that host will go away "soonish"

Jul 17 2019, 7:56 AM · SRE, ops-codfw
Marostegui added a parent task for T227717: Drop DB tables for now-deleted zerowiki from production: T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking).
Jul 17 2019, 6:59 AM · Release-Engineering-Team-TODO, DBA, Product-Infrastructure-Team-Backlog-Deprecated
Marostegui added a subtask for T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking): T227717: Drop DB tables for now-deleted zerowiki from production.
Jul 17 2019, 6:58 AM · Epic, DBA, Tracking-Neverending
Marostegui claimed T226851: Drop abuse_filter_log.afl_log_id in production.

Thanks for confirming. I am removing the Schema-change-in-production tag as there is nothing blocked on this removal (please correct me if I am wrong). So this is part of our clean-up backlog.
What I will do is rename the column on an enwiki host and leave it for a few days to make sure nothing really uses it.

Jul 17 2019, 6:58 AM · AbuseFilter, DBA
Marostegui added a comment to T143896: MySQL metrics monitoring.

Great work, a lot fewer files to edit when provisioning/moving/decommissioning hosts, which was very error prone!
Thanks :)

Jul 17 2019, 6:43 AM · Data-Persistence, observability, Patch-For-Review, SRE, Prometheus-metrics-monitoring
Marostegui triaged T228243: Switchover m3 (phabricator) master db1072 to db1128 as Medium priority.
Jul 17 2019, 6:00 AM · User-notice-archive, Phabricator, SRE, DBA
Marostegui created T228243: Switchover m3 (phabricator) master db1072 to db1128.
Jul 17 2019, 5:59 AM · User-notice-archive, Phabricator, SRE, DBA
Marostegui updated the task description for T222978: Compress and defragment tables on labsdb hosts.
Jul 17 2019, 5:29 AM · Data-Services, DBA
Marostegui updated the task description for T217396: Decommission db1061-db1073.
Jul 17 2019, 5:28 AM · SRE, DBA
Marostegui added a comment to T227560: decommission db1065.

This host is ready for DC-Ops to decommission.

Jul 17 2019, 5:27 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui reassigned T227560: decommission db1065 from Marostegui to RobH.
Jul 17 2019, 5:27 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui updated the task description for T227560: decommission db1065.
Jul 17 2019, 5:24 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui added a comment to T226050: Wiki Replicas are very slow and timing out.

Glad to hear @MusikAnimal - we are trying a different approach whilst still compressing tables, which requires less depooling time. We will still require depooling, however, once it is time for the biggest wikis to be compressed (enwiki, commons, wikidata..), but hopefully for hours instead of days :)

Jul 17 2019, 5:00 AM · Data-Services

Jul 10 2019

Marostegui moved T220002: Decommission dbstore2001.codfw.wmnet and dbstore2002.codfw.wmnet from Blocked external/Not db team to Done on the DBA board.
Jul 10 2019, 5:07 AM · Patch-For-Review, SRE, ops-codfw, DC-Ops, decommission-hardware

Jul 9 2019

Elitre awarded T227063: Database primary master failover on s8 (wikidatawiki) a Love token.
Jul 9 2019, 12:38 PM · User-notice-archive, User-Johan, MoveComms-Support (Jul-Sep-2019), Wikidata
Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Jul 9 2019, 10:05 AM · SRE, DBA
Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Jul 9 2019, 10:04 AM · SRE, DBA
Marostegui added a comment to T225169: [4 hours] Investigate whether it's efficient to order by tag value (DBA input requested).

We've discussed this (and the wider implication of the entire feature) in the Engineering meeting, and agreed it's time to revisit whether the product is worth the significant effort here. We tried to estimate, generally, how long things may take.

Here is our general estimate of what it would take to be able to store pageviews in the database and sort the result by them, erring on the side of caution:

  • Add an indexed integer column to the PageTriage table. Since the table isn't extremely large, it will probably not take too long, but it does require DBA review and assistance. Estimated time: A couple of weeks (with some risk of this being months)
Jul 9 2019, 9:32 AM · Community-Tech (Kanban (Q1 2019-20)), Spike, PageTriage, Growth-Team
Marostegui updated the task description for T227565: decommission db2038.
Jul 9 2019, 8:59 AM · Patch-For-Review, SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui updated the task description for T221533: Decommission old coredb machines (<=db2042).
Jul 9 2019, 8:48 AM · DBA
Marostegui renamed T227565: decommission db2038 from decommission db2035 to decommission db2038.
Jul 9 2019, 8:48 AM · Patch-For-Review, SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui updated the task description for T221533: Decommission old coredb machines (<=db2042).
Jul 9 2019, 8:47 AM · DBA
Marostegui created T227565: decommission db2038.
Jul 9 2019, 8:47 AM · Patch-For-Review, SRE, ops-codfw, DC-Ops, decommission-hardware
Marostegui updated the task description for T217396: Decommission db1061-db1073.
Jul 9 2019, 8:46 AM · SRE, DBA
Marostegui updated the task description for T227560: decommission db1065.
Jul 9 2019, 8:04 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui added a comment to T227560: decommission db1065.

Let's wait a few days before actually starting to decommission it.
I have disabled notifications though

Jul 9 2019, 8:03 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui moved T227560: decommission db1065 from Triage to In progress on the DBA board.
Jul 9 2019, 8:02 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui created T227560: decommission db1065.
Jul 9 2019, 8:02 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
Marostegui updated the task description for T222978: Compress and defragment tables on labsdb hosts.
Jul 9 2019, 7:58 AM · Data-Services, DBA
Marostegui closed T226952: Failover m2 master db1065 to db1132, a subtask of T217396: Decommission db1061-db1073, as Resolved.
Jul 9 2019, 7:30 AM · SRE, DBA
Marostegui closed T226952: Failover m2 master db1065 to db1132, a subtask of T220170: Address Database hardware infrastructure blockers on datacenter switchover & multi-dc deployment, as Resolved.
Jul 9 2019, 7:30 AM · Goal, DBA
Marostegui closed T226952: Failover m2 master db1065 to db1132 as Resolved.
Jul 9 2019, 7:30 AM · SRE-tools, Znuny, Recommendation-API, SRE, DBA
Marostegui added a comment to T226952: Failover m2 master db1065 to db1132.

This was done successfully.

Jul 9 2019, 6:08 AM · SRE-tools, Znuny, Recommendation-API, SRE, DBA
Marostegui added a comment to P8728 (An Untitled Masterwork).
root@cumin1001:/home/marostegui# mysql.py -hdb1117:3322 -e "show slave status\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: db1065.eqiad.wmnet
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: db1065-bin.000251
          Read_Master_Log_Pos: 437347077
               Relay_Log_File: db1117-relay-bin.000002
                Relay_Log_Pos: 1278
        Relay_Master_Log_File: db1065-bin.000251
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 436828506
              Relay_Log_Space: 520148
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 171978772
               Master_SSL_Crl:
           Master_SSL_Crlpath:
                   Using_Gtid: No
                  Gtid_IO_Pos: 0-171970569-1006906062,171970636-171970636-23122305,171970569-171970569-156638323,171978772-171978772-139561525
      Replicate_Do_Domain_Ids:
  Replicate_Ignore_Domain_Ids:
                Parallel_Mode: conservative
Jul 9 2019, 5:33 AM
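
A paste like the one above can be checked mechanically. The sketch below parses the vertical (`\G`) output into a dictionary and reports whether both replication threads are running and how many bytes of the current master binlog have been read but not yet applied. The `parse_vertical` and `replication_summary` helpers are hypothetical names; the sample values are copied from the paste.

```python
from typing import Dict, Optional


def parse_vertical(output: str) -> Dict[str, str]:
    """Parse `key: value` lines from vertical (\\G) client output into a dict."""
    fields = {}
    for line in output.splitlines():
        # Skip the "*** 1. row ***" separator and anything without a colon.
        if line.lstrip().startswith("*") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_summary(fields: Dict[str, str]) -> Dict[str, object]:
    """Summarise thread state and the read-versus-applied gap in bytes."""
    unapplied: Optional[int] = None
    # The two positions are only comparable when they refer to the same binlog file.
    if fields.get("Master_Log_File") == fields.get("Relay_Master_Log_File"):
        unapplied = int(fields["Read_Master_Log_Pos"]) - int(fields["Exec_Master_Log_Pos"])
    return {
        "io_running": fields.get("Slave_IO_Running") == "Yes",
        "sql_running": fields.get("Slave_SQL_Running") == "Yes",
        "unapplied_bytes": unapplied,
    }


# Values copied from the paste above.
sample = """*************************** 1. row ***************************
              Master_Log_File: db1065-bin.000251
          Read_Master_Log_Pos: 437347077
        Relay_Master_Log_File: db1065-bin.000251
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
          Exec_Master_Log_Pos: 436828506
"""

print(replication_summary(parse_vertical(sample)))
```

With the SQL thread stopped, `Seconds_Behind_Master` is NULL, so the byte gap between the two positions is the more useful number here.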
Marostegui created P8728 (An Untitled Masterwork).
Jul 9 2019, 5:28 AM
Marostegui added a comment to T227552: pc2010 possibly broken memory.

@Papaul and myself chatted about this and the plan is to:

  • Clear logs (I just did)
  • Upgrade firmware, BIOS etc
  • Leave this task open for a week to see if it happens again and if not close it for now.
Jul 9 2019, 5:23 AM · SRE, ops-codfw, DBA
Marostegui added a comment to T227552: pc2010 possibly broken memory.

As per my chat with @Papaul I rebooted the host a second time and the previous error didn't show up.

Jul 9 2019, 5:17 AM · SRE, ops-codfw, DBA
Marostegui added a comment to T226952: Failover m2 master db1065 to db1132.

Mentioned in SAL (#wikimedia-operations) [2019-07-09T05:13:17Z] <marostegui> Rebooting pc2010 for a second time as per papaul's suggestion T226952

Jul 9 2019, 5:14 AM · SRE-tools, Znuny, Recommendation-API, SRE, DBA
Marostegui triaged T227552: pc2010 possibly broken memory as Medium priority.
Jul 9 2019, 5:06 AM · SRE, ops-codfw, DBA
Marostegui created T227552: pc2010 possibly broken memory.
Jul 9 2019, 5:05 AM · SRE, ops-codfw, DBA

Jul 8 2019

Marostegui created P8724 (An Untitled Masterwork).
Jul 8 2019, 3:13 PM
Marostegui added a comment to T227062: Failover s8 (wikidatawiki) db primary master db1071 to db1104 (read-only required).

I have restarted db1109 to pick up STATEMENT as a binlog format. db1109 will be the candidate master once db1104 (current candidate master) gets promoted to master.

Jul 8 2019, 5:48 AM · DBA
Marostegui added a comment to T222978: Compress and defragment tables on labsdb hosts.

After this big batch of wiki compression, only 3555 tables are left to be compressed. I am now trying to compress medium-sized wikis, between 20GB and 100GB (a total of 3000 tables). If this goes fine, only the bigger wikis (just one table for enwiki, ruwiki, wikidata) would be left to compress, so we would only need to depool for them. That would reduce the number of days we'd need to have labsdb1009 depooled, and the service won't be as degraded.
Will report back once this new batch is done.

Jul 8 2019, 5:34 AM · Data-Services, DBA
Marostegui closed T57385: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it, a subtask of T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking), as Resolved.
Jul 8 2019, 5:23 AM · Epic, DBA, Tracking-Neverending
Marostegui closed T57385: Investigate dropping "edit_page_tracking" database table from Wikimedia wikis after archiving it as Resolved.

Um, it has? I just found it on meta, though empty.

wikiadmin@10.64.48.153(metawiki)> select * from edit_page_tracking;
Empty set (0.00 sec)
Jul 8 2019, 5:23 AM · SRE, DBA
Marostegui removed a project from T184615: Once MCR is deployed, drop the rev_text_id, rev_content_model, and rev_content_format fields from the revision table: DBA.

I am going to remove the DBA tag from here as there is nothing for us to do yet.
Once this is ready to go, please follow the template to create a schema change request and we'll take care of it: https://wikitech.wikimedia.org/wiki/Schema_changes#Workflow_of_a_schema_change

Jul 8 2019, 4:51 AM · Analytics-Radar, Platform Team Initiatives (MCR), Multi-Content-Revisions (Tech Debt), Schema-change

Jul 5 2019

Marostegui added a comment to T222978: Compress and defragment tables on labsdb hosts.

In order to cause less disruption to the service, I am trying a different approach with labsdb1009.
I am compressing around 50k tables from almost 700 wikis which are smaller than 10GB in size without depooling the host, so the load won't be as high as we've seen on the other hosts. We'll need to depool the hosts only for the very big wikis.
Those tables are small enough that replication isn't a problem, and they are compressed so fast that metadata locking isn't an issue either (also because those wikis aren't used that much).

Jul 5 2019, 1:00 PM · Data-Services, DBA