Drop eu_touched in production
Closed, ResolvedPublic

Description

Schema change progress:

  • wikitech - doesn't apply, this table doesn't exist there.
    • labswiki
    • labtestwiki

Event Timeline

s2 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1122
  • db1105
  • db1103
  • db1090
  • db1076
  • db1074
  • db1066

Change 446817 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1103:3312

https://gerrit.wikimedia.org/r/446817

Change 446817 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1103:3312

https://gerrit.wikimedia.org/r/446817

Mentioned in SAL (#wikimedia-operations) [2018-07-19T13:43:27Z] <marostegui> Deploy schema change on db1103:3312 T144010 T51190 T199368

Change 446830 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1105:3312

https://gerrit.wikimedia.org/r/446830

Change 446830 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1105:3312

https://gerrit.wikimedia.org/r/446830

Mentioned in SAL (#wikimedia-operations) [2018-07-19T14:30:31Z] <marostegui> Deploy schema change on db1105:3312 T144010 T51190 T199368

Change 447020 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1076

https://gerrit.wikimedia.org/r/447020

Change 447020 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1076

https://gerrit.wikimedia.org/r/447020

Mentioned in SAL (#wikimedia-operations) [2018-07-20T05:49:30Z] <marostegui> Deploy schema change on db1076 T144010 T51190 T199368

Change 447026 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1074

https://gerrit.wikimedia.org/r/447026

Change 447026 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1074

https://gerrit.wikimedia.org/r/447026

Mentioned in SAL (#wikimedia-operations) [2018-07-20T06:13:08Z] <marostegui> Deploy schema change on db1074 with replication, this will generate lag on labsdb:s2 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-23T04:42:37Z] <marostegui> Deploy schema change on db1061 (s6 primary master) T144010 T51190 T199368

Change 447363 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1122 for alter table

https://gerrit.wikimedia.org/r/447363

Change 447363 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1122 for alter table

https://gerrit.wikimedia.org/r/447363

Mentioned in SAL (#wikimedia-operations) [2018-07-23T07:33:52Z] <marostegui> Deploy schema change on db1122 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-23T13:23:25Z] <marostegui> Deploy schema change on labswiki and labstestwiki T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-23T13:36:43Z] <marostegui> Deploy schema change on s4 codfw master (db2051), this will generate lag on s4 codfw T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-24T04:44:49Z] <marostegui> Deploy schema change on db1066 (s2 primary master) T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-24T05:05:55Z] <marostegui> Deploy schema change on db1081 T144010 T51190 T199368

s4 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1121
  • db1103
  • db1097
  • db1091
  • db1084
  • db1081
  • db1068

Mentioned in SAL (#wikimedia-operations) [2018-07-24T06:32:07Z] <marostegui> Deploy schema change on dbstore1002:s4 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-24T09:17:56Z] <marostegui> Deploy schema change on db1097:3314 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-24T13:40:34Z] <marostegui> Deploy schema change on db1081 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-24T13:43:34Z] <marostegui> Deploy schema change on db1084 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-24T14:09:29Z] <marostegui> Deploy schema change on db1103:3314 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-25T07:12:08Z] <marostegui> Deploy schema change on db1091 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-25T08:14:31Z] <marostegui> Deploy schema change on db1121 with replication, this will generate lag on labsdb hosts for s4 T144010 T51190 T199368

This table fails to get altered when there is high concurrency (I have had to stop replication on s4 slaves to be able to alter it), so I am not going to even attempt to alter it on the master. I will do that once eqiad is passive.

They were failing with a duplicate key error (for a key that didn't really exist), so I think it was a matter of heavy writes coming in and maybe a race condition when the alter was finishing and doing the final r

That is my theory, given that it failed with a different key every time and that, once replication was stopped, it worked straight away.
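
For readers following along, the workaround described above (pause replication on the replica, alter the table, resume replication) can be sketched roughly as below. This is only an illustration under assumptions: the exact commands run on the s4 hosts are not shown in this task, the helper name `schema_change_statements` is hypothetical, and `STOP SLAVE` / `START SLAVE` are the standard MariaDB statements assumed here for pausing and resuming replication.

```python
from typing import List

TABLE = "wbc_entity_usage"  # table named in this task
COLUMN = "eu_touched"       # column being dropped


def schema_change_statements(pause_replication: bool) -> List[str]:
    """Return, in order, the SQL statements for dropping the column on one replica.

    Hypothetical helper: it only builds the statement list, it does not
    connect to or modify any database.
    """
    alter = f"ALTER TABLE {TABLE} DROP COLUMN {COLUMN}"
    if not pause_replication:
        return [alter]
    # Pausing replication removes the incoming writes that were suspected
    # of making the ALTER fail; replication is resumed right afterwards.
    return ["STOP SLAVE", alter, "START SLAVE"]


if __name__ == "__main__":
    for statement in schema_change_statements(pause_replication=True):
        print(statement + ";")
```

The sketch leaves out depooling the host and waiting for replication lag to recover, both of which appear in the log entries on this task.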

Mentioned in SAL (#wikimedia-operations) [2018-07-25T10:07:08Z] <marostegui> Deploy schema change on db2040 (s7 codfw master) with replication, this will generate lag on s7 codfw T144010 T51190 T199368

s7 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1101
  • db1098
  • db1094
  • db1090
  • db1086
  • db1079
  • db1062

Mentioned in SAL (#wikimedia-operations) [2018-07-25T14:45:09Z] <marostegui> Deploy schema change on db1098:3317 T144010 T51190 T199368

@Marostegui Hi, could you please confirm the following: no other changes in the wbc_entity_usage schema will take place except for dropping the eu_touched field?

The WDCM operations rely on weekly Apache Sqoop runs across the wbc_entity_usage tables from stat1004. These runs produce the Hive tables from which we perform ETL for Wikidata usage statistics. To keep the WDCM running consistently, please let me know immediately if and when any changes to wbc_entity_usage take place, because I will need to adjust the related Sqoop/R scripts accordingly.

Thanks a lot!

There is no other schema change related to wbc_entity_usage on-going apart from this, yes :-)

@Marostegui Thank you very much! Can you estimate when the new schema will become operational?

That's hard to say, you can track the progress on the task description, but it will take a few more weeks at least.

Mentioned in SAL (#wikimedia-operations) [2018-07-26T05:02:40Z] <marostegui> Deploy schema change on db1101:3317 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-26T06:47:27Z] <marostegui> Deploy schema change on db1086 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-26T08:20:00Z] <marostegui> Deploy schema change on db1090:3317 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-26T09:22:53Z] <marostegui> Deploy schema change on db1079 with replication, this will generate lag on labsdb:s7 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-27T04:59:17Z] <marostegui> Deploy schema change on db1094 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-30T05:00:28Z] <marostegui> Deploy schema change on db1062 (s7 primary master) T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-30T07:32:15Z] <marostegui> Deploy schema change on db2045 (s8 codfw master) this will generate lag on s8 codfw T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-07-30T08:49:36Z] <marostegui> Deploy schema change on dbstore1002:s8 T144010 T51190 T199368

s8 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1124
  • db1109
  • db1104
  • db1101
  • db1099
  • db1092
  • db1087
  • db1071

Mentioned in SAL (#wikimedia-operations) [2018-07-31T05:00:04Z] <marostegui> Deploy schema change on dbstore1002:s8 T144010 T51190 T199368/script unload irssinotifier

Mentioned in SAL (#wikimedia-operations) [2018-07-31T05:01:05Z] <marostegui> Deploy schema change on db1099:3318 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T04:46:05Z] <marostegui> Deploy schema change on db1101:3318 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T06:47:53Z] <marostegui> Deploy schema change on db1087 with replication, this will generate lag on labs:s8 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T08:37:24Z] <marostegui> Deploy schema change on db1109:3318 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T09:19:03Z] <marostegui> Deploy schema change on db1092 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T09:50:15Z] <marostegui> Deploy schema change on db2048 (s1 codfw master) with replication, this will generate lag on codfw s1 T144010 T51190 T199368

s1 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • dbstore1001
  • db1124
  • db1119
  • db1118 (test host)
  • db1114
  • db1106
  • db1105
  • db1102 (test host)
  • db1099
  • db1095 (test host)
  • db1089
  • db1083
  • db1080
  • db1067

Mentioned in SAL (#wikimedia-operations) [2018-08-01T12:16:01Z] <marostegui> Deploy schema change on db1099:3311 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T12:35:20Z] <marostegui> Deploy schema change on db1105:3311 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-01T13:28:57Z] <marostegui> Deploy schema change on db1083 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-02T06:18:11Z] <marostegui> Deploy schema change on db1106 with replication, this will generate lag on labsdb:s1 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-02T08:14:50Z] <marostegui> Deploy schema change on db2043 (s3 codfw master) this will generate lag on codfw:s3 T144010 T51190 T199368

s3 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1124
  • db1123
  • db1078
  • db1077
  • db1075

Mentioned in SAL (#wikimedia-operations) [2018-08-06T06:49:22Z] <marostegui> Deploy schema change on db1077 with replication, this will generate lag on labsdb:s3 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-07T05:21:26Z] <marostegui> Deploy schema change on db1123 T144010 T51190 T199368

Mentioned in SAL (#wikimedia-operations) [2018-08-07T06:58:11Z] <marostegui> Deploy schema change on db1075 (s3 master) T144010 T51190 T199368

Marostegui changed the task status from Open to Stalled. Aug 7 2018, 7:36 AM
Marostegui updated the task description.
Marostegui moved this task from In progress to Blocked external/Not db team on the DBA board.

Blocking this until we have done DC failover so we can do s4 primary master (T144010#4449908)

@Marostegui I see that in enwiki there is no more eu_touched in the wbc_entity_usage table.

This caused one of my crucial scripts, which runs from the stat1004 statbox, to crash, so the whole WDCM system is not being updated.

Could you please confirm that these changes are in effect across all project databases that maintain the wbc_entity_usage tables?

Thanks a lot!
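
One way a downstream script could tolerate a dropped column like this is to build its query only from the columns that still exist and report the rest. The following is a minimal sketch, not the actual WDCM Sqoop/R code: `build_select` is a hypothetical helper, and the column names other than eu_touched are assumed for illustration. In practice the list of available columns would come from the database itself (for example from `information_schema.columns`).

```python
from typing import Iterable, List, Tuple


def build_select(
    table: str, wanted: Iterable[str], available: Iterable[str]
) -> Tuple[str, List[str]]:
    """Build a SELECT over the wanted columns that still exist in the table.

    Returns the query string and the list of wanted columns that are missing,
    so the caller can log them instead of crashing.
    """
    present = set(available)
    wanted = list(wanted)
    keep = [col for col in wanted if col in present]
    missing = [col for col in wanted if col not in present]
    if not keep:
        raise ValueError(f"none of the requested columns exist in {table}")
    return f"SELECT {', '.join(keep)} FROM {table}", missing


if __name__ == "__main__":
    query, missing = build_select(
        "wbc_entity_usage",
        # Column names other than eu_touched are illustrative assumptions.
        wanted=["eu_entity_id", "eu_aspect", "eu_page_id", "eu_touched"],
        available=["eu_entity_id", "eu_aspect", "eu_page_id"],
    )
    print(query)
    print("missing:", missing)
```

With this approach a schema change shows up as a logged list of missing columns rather than a failed run, although any later step that needs the dropped column still has to be adjusted by hand.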

Hello,

Everything is done except the s4 (commons) master, as specified in the task description.

Mentioned in SAL (#wikimedia-operations) [2018-09-13T08:16:31Z] <marostegui> Stop replication on s4 eqiad master (db1068) and deploy a schema change - this will generate lag on s4 eqiad - T144010

Marostegui changed the task status from Stalled to Open. Sep 13 2018, 8:17 AM
Marostegui raised the priority of this task from Low to Medium.
Marostegui updated the task description.

All done