Drop externallinks.el_from_namespace on wmf databases
Closed, ResolvedPublic

Description

After abandon of patch set https://gerrit.wikimedia.org/r/#/c/163470/ the schema change from T86415 should be rollbacked.

So please drop the column externallinks.el_from_namespace on all wmf wikis to bring the schema back to the current tables.sql. Thanks.

Feel free to WONTFIX this, if you think that the column should be part of the schema, because remove would also take cpu time, but it could free some space (for the column and for index).

ALTER to run: ALTER TABLE externallinks DROP COLUMN el_from_namespace, DROP KEY el_backlinks_to;

Progress:

s1

  • eqiad
  • codfw

s2

s3

s4

s5

s6

s7

s8

wikitech

  • eqiad
  • codfw
There are a very large number of changes, so older changes are hidden. Show Older Changes
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 29 2015, 2:51 PM

Ran into this one. Take for example https://commons.wikimedia.org/wiki/File:Armor_of_Emperor_Ferdinand_I_(1503%E2%80%931564)_MET_DP-12881-008.jpg . It incorrectly returns el_from_namespace=0 instead of 6:

MariaDB [commonswiki_p]> SELECT * FROM externallinks WHERE el_to LIKE 'http://www.metmuseum.org/art/collection/search/%' AND el_from=61213894 LIMIT 1;
+-----------+----------+-------------------+------------------------------------------------------+-------------------------------------------------------+

el_idel_fromel_from_namespaceel_toel_index

+-----------+----------+-------------------+------------------------------------------------------+-------------------------------------------------------+

+-----------+----------+-------------------+------------------------------------------------------+-------------------------------------------------------+
1 row in set (0.00 sec)

jcrespo moved this task from Triage to Backlog on the DBA board.
jcrespo added subscribers: Marostegui, jcrespo.

Removing blocked because as far as I can see, this is not blocking anything or anyone (people should just not use the column until it is deleted). This doesn't mean it shouldn't be done, just that it is not urgent- it is still a DBA task, but of "clean up" level of priority.

The fact that this was deployed to production but not merged reinforces the policy of only deploying already merged patches. CC @Marostegui

Marostegui moved this task from Backlog to Next on the DBA board.Jul 25 2018, 10:25 AM

This also includes the drop of KEY el_backlinks_to (el_from_namespace,el_to(60),el_from), which is not being used really:

s1:

root@db1089.eqiad.wmnet[sys]> select * from schema_unused_indexes where index_name='el_backlinks_to';
+---------------+---------------+-----------------+
| object_schema | object_name   | index_name      |
+---------------+---------------+-----------------+
| enwiki        | externallinks | el_backlinks_to |
+---------------+---------------+-----------------+
1 row in set (0.06 sec)

s4:

root@db1084.eqiad.wmnet[(none)]> select * from sys.schema_unused_indexes where index_name='el_backlinks_to';
+---------------+---------------+-----------------+
| object_schema | object_name   | index_name      |
+---------------+---------------+-----------------+
| commonswiki   | externallinks | el_backlinks_to |
+---------------+---------------+-----------------+
1 row in set (0.06 sec)

s8:

root@db1087.eqiad.wmnet[(none)]> select * from sys.schema_unused_indexes where index_name='el_backlinks_to';
+---------------+---------------+-----------------+
| object_schema | object_name   | index_name      |
+---------------+---------------+-----------------+
| wikidatawiki  | externallinks | el_backlinks_to |
+---------------+---------------+-----------------+
1 row in set (0.04 sec)

Ideally one slave will have all the stuff dropped and will be left for a few days to make sure no queries have any regression.

Marostegui updated the task description. (Show Details)Aug 7 2018, 8:38 AM
Marostegui moved this task from Next to In progress on the DBA board.Aug 7 2018, 8:43 AM
Stashbot added a subscriber: Stashbot.

Mentioned in SAL (#wikimedia-operations) [2018-08-07T09:57:44Z] <marostegui> Deploy schema change on db2075 - T67448 T114117 T5119

Marostegui added a comment.EditedAug 7 2018, 10:19 AM

I have removed the column from db2075 (codfw - s5) and I am going to leave it like that for a few days, to make sure nothing writes to that column. If that happens replication will break and we'll know.

After that, probably an eqiad host will be altered so we can see if any query shows some regressions.

Marostegui added a comment.EditedAug 20 2018, 8:35 AM
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore2001
  • dbstore1002
  • db2094
  • db2089
  • db2084
  • db2075 T114117#4483970
  • db2066
  • db2059
  • db2052
  • db2038
  • db1124
  • db1113
  • db1110
  • db1102
  • db1100
  • db1097
  • db1096
  • db1082
  • db1070

I have altered db1100 and I will leave it there and check if there are any regressions on queries as that host receives reads

IKhitron added a subscriber: IKhitron.
IKhitron added a comment.EditedAug 22 2018, 12:52 AM

I wasn't sure why query results looks weird, so I eventually found this. Maybe you should set this number to -1 or something in all wikis for now, I don't know.

Marostegui added a comment.EditedAug 22 2018, 1:48 AM

I wasn't sure why query results looks weird, so I eventually found this. Maybe younshould set this number to -1 or something in all wikis for now, I don't know.

Can you elaborate a bit more on this? I guess you are talking about the same thing as T114117#3492177 ?
What issue are you seeing?
What do you mean with -1 on all wikis?

IKhitron added a comment.EditedAug 22 2018, 10:38 AM

I did not work with this table a lot of time, so I run a simple query to remember:

select * from externallinks limit 10

to see the schema.
I saw that el_from_namespace was 0, seems like all 10 examples where articles. So I used this field in a new query, something like

#...
and el_from_namespace in (0, 6, 10, 14, 100, 118)
#...

Run the query, get the results. Everything looks ok. Exported to pagepile, started to work. And only then saw links to talk namespaces. If the value of this field was always -1, and not unfortunately fitting one of actually existing namespaces, I would (1) saw it in the first query and understood it's not working, and (2), the second query was empty in place of bringing wrong results.
Summary: for now, all entries marked as they were in articles namespace, and it's wrong.

Thanks for clarifying it!
So that problem will be gone once this task is completed and the column gets dropped everywhere :-)

Absolutely. But looks like it takes years, so I suggested something simple until then.

Absolutely. But looks like it takes years, so I suggested something simple until then.

We have it started now at least and should be completed in some weeks hopefully :-)
We have so many tasks that it
You approach could work, but we try to avoid updating things directly on the database and we'd prefer to use a script that uses MW framework, to make sure it is consistent and safe, and it is probably not worth the time anymore that we have started and almost dropped it in a whole section (s5).

Thanks again!

Mentioned in SAL (#wikimedia-operations) [2018-08-22T12:15:37Z] <marostegui> Deploy schema change on db1070 (s5 primary master) - T67448 T114117 T51191

Marostegui updated the task description. (Show Details)Aug 22 2018, 12:54 PM
Marostegui updated the task description. (Show Details)EditedAug 22 2018, 2:21 PM

s6 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1113
  • db1098
  • db1096
  • db1093
  • db1088
  • db1085
  • db1061
Marostegui updated the task description. (Show Details)Aug 23 2018, 4:46 AM
Marostegui updated the task description. (Show Details)Aug 23 2018, 10:54 AM
Marostegui added a comment.EditedAug 23 2018, 11:02 AM

s2 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1122
  • db1105
  • db1103
  • db1095
  • db1090
  • db1076
  • db1074
  • db1066
Marostegui updated the task description. (Show Details)Aug 23 2018, 11:58 AM
Bstorm added subscribers: Krenair, Bstorm.EditedAug 23 2018, 12:32 PM

This has broken views from dewiki (as an IRC user noticed this morning). That table is a full view, so when a column drops, the view is broken (and I confirmed the table is the same from s5-master through to the labsdb replicas). So if they will replicate down like that (and any s5 dbs now have the view broken I imagine?), then I can null or drop the column in a custom view (perhaps temporarily), or I can just do a rerun every time a shard is going to replicate the change down.

This has broken views from dewiki (as an IRC user noticed this morning). That table is a full view, so when a column drops, the view is broken (and I confirmed the table is the same from s5-master through to the labsdb replicas). So if they will replicate down like that (and any s5 dbs now have the view broken I imagine?), then I can null or drop the column in a custom view (perhaps temporarily), or I can just do a rerun every time a shard is going to replicate the change down.

So s6 is also probably broken as it was finished a few hours ago. So a re-run there is probably needed.
I would suggest I ping you everytime the change reaches labs and you can do a re-run and don't have to keep subscribed to this task?

That'd be great. I somehow worry that if I change the view, I'll forget about it later, even though it is a more straightforward thing.

Re-running that table on all replicas then.

ok I will ping you every time. I can also narrow it to the wikis affected, so you only have to run it for the given wikis.
Once s3 is reached (around 800 wikis) you can do a full re-run.

Up to you!

s5 - dewiki
s6 - frwiki jawiki ruwiki

A full re-run on just one table is fine when commonswiki isn't locked. rolls dice to see--yup, it is

I have to also do a full re-run on the table to change to a custom view anyway (which will require depooling if things are locked up). At least for now, I don't need to bother with that.

A full re-run on just one table is fine when commonswiki isn't locked. rolls dice to see--yup, it is

I have to also do a full re-run on the table to change to a custom view anyway (which will require depooling if things are locked up). At least for now, I don't need to bother with that.

Sure! You are the expert :)

@Bstorm s2 changes have been replicated to labs - whenever you can, please re-run the views there.
Thanks!

Marostegui updated the task description. (Show Details)EditedAug 24 2018, 3:05 PM

s8 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1124
  • db1109
  • db1104
  • db1101
  • db1099
  • db1092
  • db1087
  • db1071
Marostegui updated the task description. (Show Details)Aug 24 2018, 3:08 PM
Marostegui updated the task description. (Show Details)Aug 27 2018, 6:23 AM

@Bstorm s8 changes have been replicated to labs - whenever you can, please re-run the views there.
Thanks!

Marostegui added a comment.EditedAug 28 2018, 5:05 AM

s4 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1121
  • db1103
  • db1102
  • db1097
  • db1091
  • db1084
  • db1081
  • db1068
Marostegui updated the task description. (Show Details)Aug 28 2018, 5:05 AM
Marostegui updated the task description. (Show Details)Aug 28 2018, 7:43 AM

Mentioned in SAL (#wikimedia-operations) [2018-08-29T05:55:26Z] <marostegui> Deploy schema change on db1073 (labswiki) - T114117 T51191 T67448

Marostegui updated the task description. (Show Details)Aug 29 2018, 5:56 AM
Marostegui added a comment.EditedSep 3 2018, 1:19 PM

s7 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1101
  • db1098
  • db1094
  • db1090
  • db1086
  • db1079
  • db1062
Marostegui updated the task description. (Show Details)Sep 3 2018, 1:26 PM

@Bstorm these changes have been replicated to s4 and s7, please re-run the views when you get a chance. Thank you!

Marostegui updated the task description. (Show Details)EditedSep 4 2018, 12:39 PM

s3 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1124
  • db1123
  • db1095
  • db1078
  • db1077
  • db1075
Marostegui updated the task description. (Show Details)Sep 4 2018, 12:41 PM
Marostegui updated the task description. (Show Details)Sep 5 2018, 7:26 AM
Marostegui updated the task description. (Show Details)Sep 5 2018, 8:21 AM

@Bstorm s3 is finished. You can re-run the views.
Only s1 pending which most likely be done in a couple of weeks.

Thanks for the patience!

Done on my end.

Mentioned in SAL (#wikimedia-operations) [2018-09-14T07:49:20Z] <marostegui> Deploy schema change on s4 eqiad master (db1068) - T67448 T114117 T51191

Marostegui updated the task description. (Show Details)Sep 17 2018, 5:25 AM

Mentioned in SAL (#wikimedia-operations) [2018-09-17T05:28:00Z] <marostegui> Deploy schema change on s1 eqiad master (db1067) - T67448 T114117 T51191

Marostegui updated the task description. (Show Details)Sep 17 2018, 9:05 AM

@Bstorm can you run the views again? This is the last time, as s1 was the only pending wiki to alter in eqiad.
Thank you!

Done re-running that view.

Thanks! So if I understand correctly, externallinks isn't going anywhere, correct? Am I safe to start using it again on the replicas?

Yeah, it was just a column drop :-)

Marostegui moved this task from In progress to Next on the DBA board.Wed, Oct 24, 9:55 AM
Marostegui changed the task status from Open to Stalled.
Marostegui changed the task status from Stalled to Open.Mon, Nov 12, 9:09 AM

Mentioned in SAL (#wikimedia-operations) [2018-11-13T05:39:27Z] <marostegui> Deploy schema change on db2048 (s1 codfw master), this will create lag on s1 codfw - T114117

Marostegui updated the task description. (Show Details)Tue, Nov 13, 7:45 AM
Marostegui closed this task as Resolved.