
Feb 12 2019

Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 12 2019, 8:07 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T174802: Archive and drop education program (ep_*) tables on all wikis.
Feb 12 2019, 7:16 AM · User-notice-archive, Datasets-General-or-Unknown, Data-Services, DBA
Marostegui claimed T174802: Archive and drop education program (ep_*) tables on all wikis.
Feb 12 2019, 7:13 AM · User-notice-archive, Datasets-General-or-Unknown, Data-Services, DBA
Marostegui added a comment to T174802: Archive and drop education program (ep_*) tables on all wikis.

I have renamed all the tables on enwiki on db1089:

root@db1089.eqiad.wmnet[enwiki]> show tables like 'T174%';
+-----------------------------+
| Tables_in_enwiki (T174%)    |
+-----------------------------+
| T174802_ep_articles         |
| T174802_ep_cas              |
| T174802_ep_courses          |
| T174802_ep_events           |
| T174802_ep_instructors      |
| T174802_ep_oas              |
| T174802_ep_orgs             |
| T174802_ep_revisions        |
| T174802_ep_students         |
| T174802_ep_users_per_course |
+-----------------------------+
10 rows in set (0.00 sec)
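
A minimal sketch of how rename statements matching the listing above could be generated. The comment does not say how the rename was actually performed, so the use of plain `RENAME TABLE` and the helper name `rename_statements` are assumptions for illustration only.

```python
# Table names are taken from the listing above with the T174802_ prefix removed.
EP_TABLES = [
    "ep_articles", "ep_cas", "ep_courses", "ep_events", "ep_instructors",
    "ep_oas", "ep_orgs", "ep_revisions", "ep_students", "ep_users_per_course",
]


def rename_statements(tables, prefix="T174802_"):
    """Build one RENAME TABLE statement per table (hypothetical helper)."""
    return [f"RENAME TABLE `{name}` TO `{prefix}{name}`;" for name in tables]


if __name__ == "__main__":
    for statement in rename_statements(EP_TABLES):
        print(statement)
```

Prefixing with the task number keeps the data available for archiving while making it obvious which task owns the pending drop.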
Feb 12 2019, 7:12 AM · User-notice-archive, Datasets-General-or-Unknown, Data-Services, DBA
Marostegui updated the task description for T174802: Archive and drop education program (ep_*) tables on all wikis.
Feb 12 2019, 7:05 AM · User-notice-archive, Datasets-General-or-Unknown, Data-Services, DBA
Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Feb 12 2019, 6:40 AM · SRE, DBA

Feb 11 2019

Marostegui updated subscribers of T89741: Expose ar_content_format and ar_content_model columns of archive table on Labs replicas.
Feb 11 2019, 8:46 PM · cloud-services-team (Kanban), WMF-Legal, Data-Services, DBA
Marostegui added a comment to T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.

@paravoid gave us some food for thought:

stuck at "loading ramdisk" is sometimes an indication of misconfigured serial redirection after boot
basically when Linux and the BIOS are fighting over control of the serial port
Feb 11 2019, 5:46 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui updated the task description for T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.
Feb 11 2019, 5:17 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui renamed T214840: db2085/db1106 don't boot with 4.9.0-8-amd64 from db2085 doesn't boot with 4.9.0-8-amd64 to db2085/db1106 don't boot with 4.9.0-8-amd64.
Feb 11 2019, 5:07 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui added a comment to T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.

Same thing just happened with db1106 (PowerEdge R630 - same chassis as db2085)
@MoritzMuehlenhoff can you help us with the approach you mentioned at T214840#4918369 ?

Feb 11 2019, 5:01 PM · ops-codfw, Patch-For-Review, SRE, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 4:56 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a subtask for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5]: Unknown Object (Task).
Feb 11 2019, 4:46 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 4:45 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 4:44 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui claimed T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 4:43 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T211613: rack/setup/install db11[26-38].eqiad.wmnet.

@Cmjohnson I can take care of the installations once you've done the RAID and added DNS and pxeboot entries with the MACs :-)

Feb 11 2019, 4:07 PM · Goal, DBA, ops-eqiad, User-Marostegui, SRE
Addshore awarded T215611: MediaWiki errors overloading logstash a Baby Tequila token.
Feb 11 2019, 3:18 PM · Platform Team Workboards (Done with CPT), Platform Engineering (Needs Cleaning - Security, stability, performance, and scalability (TEC1)), Performance-Team, Wikimedia-production-error, Wikimedia-Logstash, SRE, MediaWiki-libs-Rdbms, observability
Marostegui added a project to T215616: Improve interlingual links across wikis through Wikidata IDs: MediaWiki-libs-Rdbms.
Feb 11 2019, 2:11 PM · Data-Engineering-Icebox, Analytics-Radar, Research-Freezer, MediaWiki-General, Wikidata
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

After having all the sections ready and compressed on all hosts, there is one thought I had: where to leave the staging database, either dbstore1003 or dbstore1005.

This is the current situation:

dbstore1003: s1, s5, s7, staging

root@dbstore1003:/srv# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   4.4T  2.7T  1.8T  61% /srv
root@dbstore1003:/srv# du -sh *
926G	sqldata.s1
695G	sqldata.s5
914G	sqldata.s7
144G	sqldata.staging

dbstore1005: s6, s8, x1

root@dbstore1005:/srv# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   4.4T  1.7T  2.7T  39% /srv
root@dbstore1005:/srv# du -sh *
479G	sqldata.s6
1.1T	sqldata.s8
172G	sqldata.x1

I think it would make sense to move staging to dbstore1005
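
A rough sketch of the arithmetic behind that suggestion, using the `du` figures above. It assumes "1.1T" and "4.4T" mean 1.1 and 4.4 times 1024 GiB, and the helper names are made up for illustration. The `du` totals differ slightly from the `df` "Used" column, so the percentages are approximate.

```python
# Per-section sizes in GiB, copied from the du output above.
CAPACITY_GIB = 4.4 * 1024

HOSTS = {
    "dbstore1003": {"s1": 926, "s5": 695, "s7": 914, "staging": 144},
    "dbstore1005": {"s6": 479, "s8": 1.1 * 1024, "x1": 172},
}


def usage_percent(sections):
    """Percentage of /srv used by the given sections, rounded to one decimal."""
    return round(100 * sum(sections.values()) / CAPACITY_GIB, 1)


def move_section(hosts, section, src, dst):
    """Return a copy of hosts with one section moved from src to dst."""
    moved = {host: dict(sections) for host, sections in hosts.items()}
    moved[dst][section] = moved[src].pop(section)
    return moved


if __name__ == "__main__":
    after = move_section(HOSTS, "staging", "dbstore1003", "dbstore1005")
    for host in HOSTS:
        print(host, usage_percent(HOSTS[host]), "->", usage_percent(after[host]))
```

Under these assumptions the move narrows the gap between the two hosts without pushing dbstore1005 past dbstore1003.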

Feb 11 2019, 10:17 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T210713: Drop change_tag.ct_tag column in production.

s8 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1005
  • dbstore1002
  • db1124
  • db1116
  • db1109
  • db1104
  • db1101
  • db1099
  • db1092
  • db1087
  • db1071
Feb 11 2019, 10:14 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 10:14 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 10:13 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 11 2019, 10:12 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 11 2019, 10:12 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui closed T92739: Remove AFT tables from the analytics slaves, a subtask of T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking), as Resolved.
Feb 11 2019, 9:43 AM · Epic, DBA, Tracking-Neverending
Marostegui closed T92739: Remove AFT tables from the analytics slaves, a subtask of T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5], as Resolved.
Feb 11 2019, 9:43 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui closed T92739: Remove AFT tables from the analytics slaves as Resolved.

The clicktracking tables no longer exist on either s1 master or dbstore1002:

root@db1067.eqiad.wmnet[enwiki]> show tables like '%click%';
Empty set (0.00 sec)
Feb 11 2019, 9:43 AM · DBA
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

After having all the sections ready and compressed on all hosts, there is one thought I had: where to leave the staging database, either dbstore1003 or dbstore1005.

Feb 11 2019, 8:27 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 11 2019, 8:06 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

Now that we know that the biggest and most painful table (it was Aria and huge, around 180GB) has been dropped (T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002),
I have tested a migration process for the staging database, to see how long we'd need to keep it read-only.

Feb 11 2019, 7:06 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T196336: Icinga passive checks go awol and downtime stops working.

I restarted icinga and they are recovering and downtimes are working again

Feb 11 2019, 7:03 AM · Observability-Alerting, SRE, Icinga, observability
Marostegui added a comment to T196336: Icinga passive checks go awol and downtime stops working.

All the passive checks went awol just now.
I tested a downtime to db1100 and it didn't work (either using icinga-downtime or the icinga web ui)
While tailing the logs during the downtimes test I got:

[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
[1549868178] External command error: Malformed command
Feb 11 2019, 6:56 AM · Observability-Alerting, SRE, Icinga, observability
Marostegui edited projects for T62962: The primary key of recentchanges (rc_id) table should be unsigned, added: User-Marostegui; removed DBA.

I am fine going to unsigned for rc_id . Once tables.sql is merged with this change, just add Schema-change-in-production as a tag (https://wikitech.wikimedia.org/wiki/Schema_changes#Workflow_of_a_schema_change) and we can take care of it :)
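
For context, a small sketch of what switching a key from signed to unsigned buys in terms of range. It assumes rc_id is a 32-bit INT, which this comment does not state; `int_range` is a hypothetical helper.

```python
def int_range(bits, signed):
    """Return the (min, max) values representable by an integer column."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


if __name__ == "__main__":
    print("signed 32-bit:  ", int_range(32, signed=True))
    print("unsigned 32-bit:", int_range(32, signed=False))
```

Since auto-increment keys never use the negative half, going unsigned roughly doubles the usable ID space at the same storage size.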

Feb 11 2019, 6:47 AM · MW-1.36-notes (1.36.0-wmf.33; 2021-03-02), Patch-For-Review, User-Ladsgroup, Growth-Team, MediaWiki-Recent-changes, User-Marostegui, Schema-change
Marostegui moved T63111: Convert primary key integers and references thereto from int to bigint (unsigned) from Triage to Backlog on the DBA board.

Re-opening so as to let the DBAs triage this. One question I wasn't able to answer quickly is: What units and signed-ness do all our current integer fields use in core's default schema, and in WMF production?

Feb 11 2019, 6:35 AM · MW-1.43-notes (1.43.0-wmf.4; 2024-05-07), MW-1.42-notes (1.42.0-wmf.15; 2024-01-23), MediaWiki-General, Schema-change, DBA
Marostegui added a comment to T214760: icinga1001 crashed.

Thanks for clarifying that @Volans!
Then probably failing over to another host is a good idea so we can debug icinga1001 without having service interruptions

Feb 11 2019, 6:18 AM · Patch-For-Review, ops-eqiad, observability, SRE
Marostegui added a comment to T215107: Global rename of The_Photographer → Wilfredor: supervision needed.

Sorry, I wasn't available during the weekend.
I am normally available from Monday to Friday from 7:00 UTC to 16:00 UTC

Feb 11 2019, 6:08 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests

Feb 10 2019

Marostegui added a comment to T214760: icinga1001 crashed.

I propose to failover to icinga2001 until we find out what's wrong with this one and we fix it.
Thoughts?

Feb 10 2019, 8:32 PM · Patch-For-Review, ops-eqiad, observability, SRE
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

Another crash just happened:

InnoDB: We intentionally crash the server, because it appears to be hung.
2019-02-10 09:54:08 7fa2cfffe700  InnoDB: Assertion failure in thread 140337251084032 in file srv0srv.cc line 2200
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
190210  9:54:08 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Feb 10 2019, 9:57 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

I have disabled notifications for lag checks (only for lag ones) for dbstore1002, as they are very noisy. They will show up on icinga, but not on IRC.

Feb 10 2019, 9:25 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA

Feb 9 2019

Marostegui added a comment to T215107: Global rename of The_Photographer → Wilfredor: supervision needed.

@Marostegui Well, they're the one being renamed, so I think they just responded without realising that it has to be done by a steward or renamer.

Feb 9 2019, 9:06 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests
Marostegui triaged T215107: Global rename of The_Photographer → Wilfredor: supervision needed as Medium priority.
Feb 9 2019, 8:56 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests
Marostegui added a comment to T215107: Global rename of The_Photographer → Wilfredor: supervision needed.

It could be done at any moment, preferably a Friday. Thanks

Feb 9 2019, 8:56 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests

Feb 8 2019

Marostegui updated the task description for T215611: MediaWiki errors overloading logstash.
Feb 8 2019, 7:49 PM · Platform Team Workboards (Done with CPT), Platform Engineering (Needs Cleaning - Security, stability, performance, and scalability (TEC1)), Performance-Team, Wikimedia-production-error, Wikimedia-Logstash, SRE, MediaWiki-libs-Rdbms, observability
Marostegui created T215611: MediaWiki errors overloading logstash.
Feb 8 2019, 1:47 PM · Platform Team Workboards (Done with CPT), Platform Engineering (Needs Cleaning - Security, stability, performance, and scalability (TEC1)), Performance-Team, Wikimedia-production-error, Wikimedia-Logstash, SRE, MediaWiki-libs-Rdbms, observability
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 8 2019, 11:04 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 8 2019, 11:03 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui edited projects for T215589: Migrate users to dbstore100[3-5], added: User-Marostegui; removed DBA.
Feb 8 2019, 8:56 AM · User-Marostegui, Analytics-Kanban, Analytics
Marostegui added a comment to T215569: mw1299 is down (jobrunner-canary, now up but depooled).

And it crashed again with the same error:

/admin1/system1/logs1/log1-> show record13
Feb 8 2019, 7:19 AM · ops-eqiad, SRE
Marostegui assigned T215569: mw1299 is down (jobrunner-canary, now up but depooled) to RobH.

This host is under warranty until April 14, 2019 so we might want to try to debug this before it expires in case we need some replacement CPU or mainboard.

Feb 8 2019, 6:49 AM · ops-eqiad, SRE
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 8 2019, 6:35 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T215569: mw1299 is down (jobrunner-canary, now up but depooled).
/admin1/system1/logs1/log1-> show record27
Feb 8 2019, 6:28 AM · ops-eqiad, SRE
Marostegui updated the task description for T208323: Predictive failures on disk S.M.A.R.T. status.
Feb 8 2019, 6:11 AM · SRE, DBA
Marostegui closed T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002 as Resolved.

This can be closed as there are no more Aria tables on the staging database:

root@dbstore1002.eqiad.wmnet[(none)]> select TABLE_NAME from information_schema.tables where ENGINE='Aria' and TABLE_SCHEMA='staging';
Empty set (0.00 sec)
Feb 8 2019, 6:10 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui closed T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002, a subtask of T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5], as Resolved.
Feb 8 2019, 6:10 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.
root@DBSTORE[staging]> drop table mep_word_persistence;
Query OK, 0 rows affected (5.53 sec)
Feb 8 2019, 6:09 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.
root@DBSTORE[staging]> drop table mep_word_persistence;
Query OK, 0 rows affected (5.53 sec)
Feb 8 2019, 6:08 AM · Analytics-Kanban, User-Elukey, Analytics

Feb 7 2019

Marostegui closed T215050: Degraded RAID on db1073 as Resolved.

Thanks @Cmjohnson for replacing disk #6!

17:31 <+icinga-wm> RECOVERY - MegaRAID on db1073 is OK: OK: optimal, 1 logical, 2 physical, WriteBack policy
Feb 7 2019, 5:34 PM · DBA, ops-eqiad, SRE
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

Yes. Thanks for your patience.

Feb 7 2019, 4:56 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T213706: Convert Aria/Tokudb tables to InnoDB on dbstore1002.

I finished converting mep_word_persistence to InnoDB on dbstore1003:

(1 day 9 hours 52 min 45.49 sec)
Feb 7 2019, 4:50 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

Thanks @Halfak, can we drop the table from dbstore1002 then?

Feb 7 2019, 3:57 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

Done - thank you

Feb 7 2019, 3:02 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

Thanks @elukey - ok to delete the .sql file I created from dbstore1003?

Feb 7 2019, 3:00 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

@elukey the .sql file is at: dbstore1003:/srv/tmp.staging/staging.sql
Can you grab it and store it somewhere else, so we have two copies, the HIVE one and this?

Feb 7 2019, 12:07 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

@elukey @jcrespo Any objection to setting dbstore1002 to IDEMPOTENT?
This host crashes every single day, the data has already drifted a lot from production, and this host will be depooled in a matter of weeks, so it doesn't make any sense to keep spending time fixing it, especially on x1.
If we feel x1 is absolutely necessary, we should re-import it (although I am not 100% sure dbstore1002 wouldn't crash during the re-import)

Feb 7 2019, 11:59 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

I am also mysqldumping that table, just in case.

Feb 7 2019, 10:43 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 7 2019, 9:15 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T210713: Drop change_tag.ct_tag column in production.

s7 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • dbstore1002
  • db1125
  • db1116
  • db1101
  • db1098
  • db1094
  • db1090
  • db1086
  • db1079
  • db1062
Feb 7 2019, 9:14 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 7 2019, 9:13 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 7 2019, 9:07 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui closed T213664: correctable memory errors db1068 (commons primary master database) as Resolved.

And back again: RECOVERY - EDAC syslog messages on db1068 is OK: (C)4 ge (W)2 ge 1
As Jaime said (T213664#4924636), this won't be fully gone until the host is fully decommissioned.

Feb 7 2019, 9:02 AM · Patch-For-Review, DBA, SRE
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

Great - so the crash happened before. Thanks a lot for helping out here :)
Let's wait for @Halfak to verify he's got everything he needs.

Feb 7 2019, 7:37 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

@JAllemandou at what time did you start the job?
From what I can see it crashed at around 18:32, so it crashed before then (if we use your comment at T215450#4932773 as a reference for time).

Feb 7 2019, 7:24 AM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

dbstore1002 crashed, possibly due to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002

Feb 7 2019, 6:10 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a comment to T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.

@JAllemandou dbstore1002 crashed; I would suggest starting the same thing again but with fewer jobs.

Feb 7 2019, 6:10 AM · Analytics-Kanban, User-Elukey, Analytics

Feb 6 2019

Marostegui added a comment to T215050: Degraded RAID on db1073.

@Cmjohnson you can proceed with the one on slot 6.
The one on slot #1 finished correctly

Feb 6 2019, 9:09 PM · DBA, ops-eqiad, SRE
Marostegui added a comment to T215445: comment and actor view challenges for Cloud Services.

To get it all in one place, here's that proposal again. It relies on the fact that https://dev.mysql.com/doc/refman/5.5/en/replication-features-differing-tables.html says the slave can have extra columns as long as they come after all the normal columns and have a default value (and that we're not using the "different data types" thing also described on that page).

  1. Add an extra column to the actor and comment tables on the WMCS copies of the tables (on sanitarium?), something like
ALTER TABLE actor ADD COLUMN wmcs_is_visible TINYINT NOT NULL DEFAULT 0;
ALTER TABLE comment ADD COLUMN wmcs_is_visible TINYINT NOT NULL DEFAULT 0;

We'll also want to create copies of all the indexes on actor and comment with wmcs_is_visible prefixed.

  2. Create a bunch of functions and triggers. Note $LOG_TYPE_LIST$ is supposed to be replaced with the list of types from allowed_logtypes in maintain-views.yaml.
    delimiter //
    CREATE FUNCTION wmcsCommentShow(id BIGINT) RETURNS INT NOT DETERMINISTIC READS SQL DATA
    BEGIN
      RETURN COALESCE(
        ( SELECT 1 FROM image WHERE img_description_id = id LIMIT 1 ) OR
        ( SELECT 1 FROM filearchive WHERE fa_deleted_reason_id = id LIMIT 1 ) OR
        ( SELECT 1 FROM filearchive WHERE fa_description_id = id AND fa_deleted&2 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM ipblocks WHERE ipb_reason_id = id and ipb_deleted = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM oldimage WHERE oi_description_id = id AND oi_deleted&2 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM protected_titles WHERE pt_reason_id = id LIMIT 1 ) OR
        ( SELECT 1 FROM recentchanges WHERE rc_comment_id = id AND rc_deleted&2 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM revision JOIN revision_comment_temp ON(revcomment_rev = rev_id) WHERE revcomment_comment_id = id AND rev_deleted&2 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM logging WHERE log_comment_id = id AND log_deleted&2 = 0 AND log_type IN ($LOG_TYPE_LIST$) LIMIT 1 )
      , 0);
    END //
    delimiter ;

    delimiter //
    CREATE PROCEDURE wmcsCommentOnInsert(id BIGINT, del TINYINT) NOT DETERMINISTIC MODIFIES SQL DATA
    BEGIN
      IF NOT del THEN
        UPDATE comment SET wmcs_is_visible = 1 WHERE comment_id = id;
      END IF;
    END //
    delimiter ;

    delimiter //
    CREATE PROCEDURE wmcsCommentOnDelete(id BIGINT, del TINYINT) NOT DETERMINISTIC MODIFIES SQL DATA
    BEGIN
      IF NOT del AND NOT wmcsCommentShow(id) THEN
        UPDATE comment SET wmcs_is_visible = 0 WHERE comment_id = id;
      END IF;
    END //
    delimiter ;

    delimiter //
    CREATE PROCEDURE wmcsCommentOnUpdate(oldId BIGINT, oldDel TINYINT, newId BIGINT, newDel TINYINT) NOT DETERMINISTIC MODIFIES SQL DATA
    BEGIN
      IF oldId != newId OR (NOT oldDel) != (NOT newDel) THEN
        CALL wmcsCommentOnDelete(oldId, oldDel);
        CALL wmcsCommentOnInsert(newId, newDel);
      END IF;
    END //
    delimiter ;

    CREATE TRIGGER image_wmcsCommentOnInsert AFTER INSERT ON image
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.img_description_id, 0 );
    CREATE TRIGGER image_wmcsCommentOnUpdate AFTER UPDATE ON image
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.img_description_id, 0, NEW.img_description_id, 0 );
    CREATE TRIGGER image_wmcsCommentOnDelete AFTER DELETE ON image
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.img_description_id, 0 );

    CREATE TRIGGER filearchive_wmcsCommentOnInsert_reason AFTER INSERT ON filearchive
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.fa_deleted_reason_id, 0 );
    CREATE TRIGGER filearchive_wmcsCommentOnUpdate_reason AFTER UPDATE ON filearchive
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.fa_deleted_reason_id, 0, NEW.fa_deleted_reason_id, 0 );
    CREATE TRIGGER filearchive_wmcsCommentOnDelete_reason AFTER DELETE ON filearchive
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.fa_deleted_reason_id, 0 );
    CREATE TRIGGER filearchive_wmcsCommentOnInsert_description AFTER INSERT ON filearchive
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.fa_description_id, NEW.fa_deleted&2 );
    CREATE TRIGGER filearchive_wmcsCommentOnUpdate_description AFTER UPDATE ON filearchive
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.fa_description_id, OLD.fa_deleted&2, NEW.fa_description_id, NEW.fa_deleted&2 );
    CREATE TRIGGER filearchive_wmcsCommentOnDelete_description AFTER DELETE ON filearchive
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.fa_description_id, OLD.fa_deleted&2 );

    CREATE TRIGGER ipblocks_wmcsCommentOnInsert AFTER INSERT ON ipblocks
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.ipb_reason_id, NEW.ipb_deleted );
    CREATE TRIGGER ipblocks_wmcsCommentOnUpdate AFTER UPDATE ON ipblocks
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.ipb_reason_id, OLD.ipb_deleted, NEW.ipb_reason_id, NEW.ipb_deleted );
    CREATE TRIGGER ipblocks_wmcsCommentOnDelete AFTER DELETE ON ipblocks
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.ipb_reason_id, OLD.ipb_deleted );

    CREATE TRIGGER oldimage_wmcsCommentOnInsert AFTER INSERT ON oldimage
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.oi_description_id, NEW.oi_deleted&2 );
    CREATE TRIGGER oldimage_wmcsCommentOnUpdate AFTER UPDATE ON oldimage
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.oi_description_id, OLD.oi_deleted&2, NEW.oi_description_id, NEW.oi_deleted&2 );
    CREATE TRIGGER oldimage_wmcsCommentOnDelete AFTER DELETE ON oldimage
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.oi_description_id, OLD.oi_deleted&2 );

    CREATE TRIGGER protected_titles_wmcsCommentOnInsert AFTER INSERT ON protected_titles
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.pt_reason_id, 0 );
    CREATE TRIGGER protected_titles_wmcsCommentOnUpdate AFTER UPDATE ON protected_titles
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.pt_reason_id, 0, NEW.pt_reason_id, 0 );
    CREATE TRIGGER protected_titles_wmcsCommentOnDelete AFTER DELETE ON protected_titles
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.pt_reason_id, 0 );

    CREATE TRIGGER recentchanges_wmcsCommentOnInsert AFTER INSERT ON recentchanges
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.rc_comment_id, NEW.rc_deleted&2 );
    CREATE TRIGGER recentchanges_wmcsCommentOnUpdate AFTER UPDATE ON recentchanges
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.rc_comment_id, OLD.rc_deleted&2, NEW.rc_comment_id, NEW.rc_deleted&2 );
    CREATE TRIGGER recentchanges_wmcsCommentOnDelete AFTER DELETE ON recentchanges
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.rc_comment_id, OLD.rc_deleted&2 );

    CREATE TRIGGER revision_comment_temp_wmcsCommentOnInsert AFTER INSERT ON revision_comment_temp
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.revcomment_comment_id, (SELECT rev_deleted&2 FROM revision WHERE rev_id=NEW.revcomment_rev) );
    CREATE TRIGGER revision_comment_temp_wmcsCommentOnUpdate AFTER UPDATE ON revision_comment_temp
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.revcomment_comment_id, (SELECT rev_deleted&2 FROM revision WHERE rev_id=OLD.revcomment_rev), NEW.revcomment_comment_id, (SELECT rev_deleted&2 FROM revision WHERE rev_id=NEW.revcomment_rev) );
    CREATE TRIGGER revision_comment_temp_wmcsCommentOnDelete AFTER DELETE ON revision_comment_temp
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.revcomment_comment_id, (SELECT rev_deleted&2 FROM revision WHERE rev_id=OLD.revcomment_rev) );
    CREATE TRIGGER revision_wmcsCommentOnInsert AFTER INSERT ON revision
      FOR EACH ROW CALL wmcsCommentOnInsert( (SELECT revcomment_comment_id FROM revision_comment_temp WHERE revcomment_rev=NEW.rev_id), NEW.rev_deleted&2 );
    CREATE TRIGGER revision_wmcsCommentOnUpdate AFTER UPDATE ON revision
      FOR EACH ROW CALL wmcsCommentOnUpdate( (SELECT revcomment_comment_id FROM revision_comment_temp WHERE revcomment_rev=OLD.rev_id), OLD.rev_deleted&2, (SELECT revcomment_comment_id FROM revision_comment_temp WHERE revcomment_rev=NEW.rev_id), NEW.rev_deleted&2 );
    CREATE TRIGGER revision_wmcsCommentOnDelete AFTER DELETE ON revision
      FOR EACH ROW CALL wmcsCommentOnDelete( (SELECT revcomment_comment_id FROM revision_comment_temp WHERE revcomment_rev=OLD.rev_id), OLD.rev_deleted&2 );

    CREATE TRIGGER logging_wmcsCommentOnInsert AFTER INSERT ON logging
      FOR EACH ROW CALL wmcsCommentOnInsert( NEW.log_comment_id, NOT ( NEW.log_deleted&2 = 0 AND NEW.log_type IN ($LOG_TYPE_LIST$) ) );
    CREATE TRIGGER logging_wmcsCommentOnUpdate AFTER UPDATE ON logging
      FOR EACH ROW CALL wmcsCommentOnUpdate( OLD.log_comment_id, NOT ( OLD.log_deleted&2 = 0 AND OLD.log_type IN ($LOG_TYPE_LIST$) ), NEW.log_comment_id, NOT ( NEW.log_deleted&2 = 0 AND NEW.log_type IN ($LOG_TYPE_LIST$) ) );
    CREATE TRIGGER logging_wmcsCommentOnDelete AFTER DELETE ON logging
      FOR EACH ROW CALL wmcsCommentOnDelete( OLD.log_comment_id, NOT ( OLD.log_deleted&2 = 0 AND OLD.log_type IN ($LOG_TYPE_LIST$) ) );

    delimiter //
    CREATE FUNCTION wmcsActorShow(id BIGINT) RETURNS INT NOT DETERMINISTIC READS SQL DATA
    BEGIN
      RETURN COALESCE(
        ( SELECT 1 FROM actor JOIN user ON(actor_user=user_id) WHERE actor_id = id LIMIT 1 ) OR
        ( SELECT 1 FROM archive WHERE ar_actor = id AND ar_deleted&4 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM ipblocks WHERE ipb_by_actor = id AND ipb_deleted = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM image WHERE img_actor = id LIMIT 1 ) OR
        ( SELECT 1 FROM oldimage WHERE oi_actor = id AND oi_deleted&4 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM filearchive WHERE fa_actor = id AND fa_deleted&4 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM recentchanges WHERE rc_actor = id AND rc_deleted&4 = 0 LIMIT 1 ) OR
        ( SELECT 1 FROM logging WHERE log_actor = id AND log_deleted&4 = 0 AND log_type IN ($LOG_TYPE_LIST$) LIMIT 1 ) OR
        ( SELECT 1 FROM revision JOIN revision_actor_temp ON(revactor_rev = rev_id) WHERE revactor_actor = id AND rev_deleted&4 = 0 LIMIT 1 )
      , 0);
    END //
    delimiter ;

    delimiter //
    CREATE PROCEDURE wmcsActorOnInsert(id BIGINT, del TINYINT) NOT DETERMINISTIC MODIFIES SQL DATA
    BEGIN
      IF NOT del THEN
        UPDATE actor SET wmcs_is_visible = 1 WHERE actor_id = id;
      END IF;
    END //
    delimiter ;

    delimiter //
    CREATE PROCEDURE wmcsActorOnDelete(id BIGINT, del TINYINT) NOT DETERMINISTIC MODIFIES SQL DATA
    BEGIN
      IF NOT del AND NOT wmcsActorShow(id) THEN
        UPDATE actor SET wmcs_is_visible = 0 WHERE actor_id = id;
      END IF;
    END //
    delimiter ;

    delimiter //
    CREATE PROCEDURE wmcsActorOnUpdate(oldId BIGINT, oldDel TINYINT, newId BIGINT, newDel TINYINT) NOT DETERMINISTIC MODIFIES SQL DATA
    BEGIN
      IF oldId != newId OR (NOT oldDel) != (NOT newDel) THEN
        CALL wmcsActorOnDelete(oldId, oldDel);
        CALL wmcsActorOnInsert(newId, newDel);
      END IF;
    END //
    delimiter ;

    CREATE TRIGGER archive_wmcsActorOnInsert AFTER INSERT ON archive
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.ar_actor, NEW.ar_deleted&4 );
    CREATE TRIGGER archive_wmcsActorOnUpdate AFTER UPDATE ON archive
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.ar_actor, OLD.ar_deleted&4, NEW.ar_actor, NEW.ar_deleted&4 );
    CREATE TRIGGER archive_wmcsActorOnDelete AFTER DELETE ON archive
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.ar_actor, OLD.ar_deleted&4 );

    CREATE TRIGGER ipblocks_wmcsActorOnInsert AFTER INSERT ON ipblocks
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.ipb_by_actor, NEW.ipb_deleted );
    CREATE TRIGGER ipblocks_wmcsActorOnUpdate AFTER UPDATE ON ipblocks
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.ipb_by_actor, OLD.ipb_deleted, NEW.ipb_by_actor, NEW.ipb_deleted );
    CREATE TRIGGER ipblocks_wmcsActorOnDelete AFTER DELETE ON ipblocks
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.ipb_by_actor, OLD.ipb_deleted );

    CREATE TRIGGER image_wmcsActorOnInsert AFTER INSERT ON image
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.img_actor, 0 );
    CREATE TRIGGER image_wmcsActorOnUpdate AFTER UPDATE ON image
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.img_actor, 0, NEW.img_actor, 0 );
    CREATE TRIGGER image_wmcsActorOnDelete AFTER DELETE ON image
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.img_actor, 0 );

    CREATE TRIGGER oldimage_wmcsActorOnInsert AFTER INSERT ON oldimage
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.oi_actor, NEW.oi_deleted&4 );
    CREATE TRIGGER oldimage_wmcsActorOnUpdate AFTER UPDATE ON oldimage
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.oi_actor, OLD.oi_deleted&4, NEW.oi_actor, NEW.oi_deleted&4 );
    CREATE TRIGGER oldimage_wmcsActorOnDelete AFTER DELETE ON oldimage
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.oi_actor, OLD.oi_deleted&4 );

    CREATE TRIGGER filearchive_wmcsActorOnInsert AFTER INSERT ON filearchive
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.fa_actor, NEW.fa_deleted&4 );
    CREATE TRIGGER filearchive_wmcsActorOnUpdate AFTER UPDATE ON filearchive
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.fa_actor, OLD.fa_deleted&4, NEW.fa_actor, NEW.fa_deleted&4 );
    CREATE TRIGGER filearchive_wmcsActorOnDelete AFTER DELETE ON filearchive
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.fa_actor, OLD.fa_deleted&4 );

    CREATE TRIGGER recentchanges_wmcsActorOnInsert AFTER INSERT ON recentchanges
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.rc_actor, NEW.rc_deleted&4 );
    CREATE TRIGGER recentchanges_wmcsActorOnUpdate AFTER UPDATE ON recentchanges
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.rc_actor, OLD.rc_deleted&4, NEW.rc_actor, NEW.rc_deleted&4 );
    CREATE TRIGGER recentchanges_wmcsActorOnDelete AFTER DELETE ON recentchanges
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.rc_actor, OLD.rc_deleted&4 );

    CREATE TRIGGER logging_wmcsActorOnInsert AFTER INSERT ON logging
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.log_actor, NOT ( NEW.log_deleted&4 = 0 AND NEW.log_type IN ($LOG_TYPE_LIST$) ) );
    CREATE TRIGGER logging_wmcsActorOnUpdate AFTER UPDATE ON logging
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.log_actor, NOT ( OLD.log_deleted&4 = 0 AND OLD.log_type IN ($LOG_TYPE_LIST$) ), NEW.log_actor, NOT ( NEW.log_deleted&4 = 0 AND NEW.log_type IN ($LOG_TYPE_LIST$) ) );
    CREATE TRIGGER logging_wmcsActorOnDelete AFTER DELETE ON logging
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.log_actor, NOT ( OLD.log_deleted&4 = 0 AND OLD.log_type IN ($LOG_TYPE_LIST$) ) );

    CREATE TRIGGER revision_actor_temp_wmcsActorOnInsert AFTER INSERT ON revision_actor_temp
      FOR EACH ROW CALL wmcsActorOnInsert( NEW.revactor_actor, (SELECT rev_deleted&4 FROM revision WHERE rev_id=NEW.revactor_rev) );
    CREATE TRIGGER revision_actor_temp_wmcsActorOnUpdate AFTER UPDATE ON revision_actor_temp
      FOR EACH ROW CALL wmcsActorOnUpdate( OLD.revactor_actor, (SELECT rev_deleted&4 FROM revision WHERE rev_id=OLD.revactor_rev), NEW.revactor_actor, (SELECT rev_deleted&4 FROM revision WHERE rev_id=NEW.revactor_rev) );
    CREATE TRIGGER revision_actor_temp_wmcsActorOnDelete AFTER DELETE ON revision_actor_temp
      FOR EACH ROW CALL wmcsActorOnDelete( OLD.revactor_actor, (SELECT rev_deleted&4 FROM revision WHERE rev_id=OLD.revactor_rev) );
    CREATE TRIGGER revision_wmcsActorOnInsert AFTER INSERT ON revision
      FOR EACH ROW CALL wmcsActorOnInsert( (SELECT revactor_actor FROM revision_actor_temp WHERE revactor_rev=NEW.rev_id), NEW.rev_deleted&4 );
    CREATE TRIGGER revision_wmcsActorOnUpdate AFTER UPDATE ON revision
      FOR EACH ROW CALL wmcsActorOnUpdate( (SELECT revactor_actor FROM revision_actor_temp WHERE revactor_rev=OLD.rev_id), OLD.rev_deleted&4, (SELECT revactor_actor FROM revision_actor_temp WHERE revactor_rev=NEW.rev_id), NEW.rev_deleted&4 );
    CREATE TRIGGER revision_wmcsActorOnDelete AFTER DELETE ON revision
      FOR EACH ROW CALL wmcsActorOnDelete( (SELECT revactor_actor FROM revision_actor_temp WHERE revactor_rev=OLD.rev_id), OLD.rev_deleted&4 );
  1. Populate wmcs_is_visible for existing rows. That might be as simple as UPDATE comment SET wmcs_is_visible = wmcsCommentShow( comment_id ) and the same for actor, or we could unroll it with a bunch of queries like UPDATE comment JOIN recentchanges ON(rc_comment_id = comment_id) SET wmcs_is_visible = 1 WHERE wmcs_is_visible = 0 AND rc_deleted&2 = 0, and/or we could run it through in batches.
  1. Update the views: instead of all the subqueries, they can just use WHERE wmcs_is_visible.
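
The "in batches" variant of the populate step could look roughly like the sketch below. It assumes comment_id is an integer primary key, so that id ranges make reasonable batches. The procedure name wmcsPopulateCommentVisible and the batch size of 10000 are made up for illustration; only wmcsCommentShow and wmcs_is_visible come from the proposal above. The same pattern would apply to actor with wmcsActorShow.

```sql
delimiter //
-- Hypothetical helper (not part of the proposal): walks the comment table
-- in id ranges so that each UPDATE touches at most batchSize ids.
-- batchSize must be positive, otherwise the loop never advances.
CREATE PROCEDURE wmcsPopulateCommentVisible(batchSize INT) NOT DETERMINISTIC MODIFIES SQL DATA
BEGIN
  DECLARE startId BIGINT DEFAULT 0;
  DECLARE maxId BIGINT;

  -- Upper bound of the walk; 0 if the table is empty.
  SELECT COALESCE(MAX(comment_id), 0) INTO maxId FROM comment;

  WHILE startId < maxId DO
    UPDATE comment
       SET wmcs_is_visible = wmcsCommentShow( comment_id )
     WHERE comment_id > startId
       AND comment_id <= startId + batchSize;
    SET startId = startId + batchSize;
  END WHILE;
END //
delimiter ;

-- Illustrative batch size only.
CALL wmcsPopulateCommentVisible(10000);
```

When called with autocommit enabled and outside an explicit transaction, each UPDATE commits on its own, which keeps individual transactions small.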

Open questions:

  • Does the replication actually work like the docs say it does?
Feb 6 2019, 9:04 PM · cloud-services-team, Data-Services
Marostegui updated subscribers of T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002.
Feb 6 2019, 7:16 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui awarded T215450: Sqoop staging.mep_word_persistence to HDFS and drop the table from dbstore1002 a Love token.
Feb 6 2019, 7:09 PM · Analytics-Kanban, User-Elukey, Analytics
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

analytics-in4 diff:

+      term mysql-dbstore {
+          from {
+              destination-address {
+                  /* dbstore1003 */
+                  10.64.0.137/32;
+                  /* dbstore1004 */
+                  10.64.16.26/32;
+                  /* dbstore1005 */
+                  10.64.32.30/32;
+              }
+              protocol tcp;
+              destination-port [ 3311 3315 3317 3312 3313 3314 3316 3318 3320 3340 ];
+          }
+          then accept;
+      }
Feb 6 2019, 2:21 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

Then also make sure to whitelist dbstore1003:3340 as that is where the staging database will live.

Feb 6 2019, 2:12 PM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui closed T214740: Provide access to testreduce* databases on scandium + revoke from ruthenium, a subtask of T201366: rack/setup/install scandium.eqiad.wmnet (parsoid test box), as Resolved.
Feb 6 2019, 10:45 AM · Patch-For-Review, serviceops, Parsoid, SRE
Marostegui closed T214740: Provide access to testreduce* databases on scandium + revoke from ruthenium as Resolved.

Grants revoked:

root@db1073.eqiad.wmnet[(none)]> show grants for 'ssastry'@'10.64.16.151';
ERROR 1141 (42000): There is no such grant defined for user 'ssastry' on host '10.64.16.151'
root@db1073.eqiad.wmnet[(none)]> show grants for 'testreduce'@'10.64.16.151';
ERROR 1141 (42000): There is no such grant defined for user 'testreduce' on host '10.64.16.151'
Feb 6 2019, 10:45 AM · Patch-For-Review, Parsing-Team--ARCHIVED, DBA
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

@elukey after merging: https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/487000/ I have done the following:

Feb 6 2019, 9:53 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 6 2019, 7:22 AM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].
Feb 6 2019, 6:43 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA

Feb 5 2019

Marostegui moved T215107: Global rename of The_Photographer → Wilfredor: supervision needed from Triage to Blocked external/Not db team on the DBA board.
Feb 5 2019, 7:43 PM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 5 2019, 2:57 PM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T210713: Drop change_tag.ct_tag column in production.

s4 eqiad progress

Feb 5 2019, 2:56 PM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 5 2019, 2:56 PM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui updated the task description for T210713: Drop change_tag.ct_tag column in production.
Feb 5 2019, 2:30 PM · Schema-change-in-production, User-Ladsgroup, MediaWiki-Change-tagging
Marostegui added a comment to T210478: Migrate dbstore1002 to a multi instance setup on dbstore100[3-5].

@elukey from what I can see, the research user is used to access wikis, not only for staging. So I guess we need to create it on all the instances and not only where staging lives.
Example: T200801#4570548

Feb 5 2019, 9:52 AM · Analytics-Radar, Patch-For-Review, User-Banyek, Analytics-Kanban, DBA
Marostegui added a comment to T215231: rack/setup/install labsdb1012.eqiad.wmnet.

current labsdb hosts

Those will be on multi-instance soon(TM).

Feb 5 2019, 9:08 AM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, SRE
Marostegui added a comment to T215231: rack/setup/install labsdb1012.eqiad.wmnet.

I don't think we should set up new hosts using multi-source.

Feb 5 2019, 9:06 AM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, SRE
Marostegui added a comment to T213670: dbstore1002 Mysql errors.

An attempt to run mydumper for T210478 on dbstore1002 made it crash.

Feb 5 2019, 8:16 AM · Patch-For-Review, SRE, Product-Analytics, Analytics-Kanban, Analytics
Marostegui added a project to T215231: rack/setup/install labsdb1012.eqiad.wmnet: ops-eqiad.

I have suggested using labsdb1012 as the hostname, as this host has the same hardware as the other labsdb1009-1011 hosts and will be set up the same way as those hosts.

Feb 5 2019, 7:54 AM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, SRE
Marostegui triaged T215231: rack/setup/install labsdb1012.eqiad.wmnet as Medium priority.
Feb 5 2019, 7:53 AM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, SRE
Marostegui updated the task description for T133333: Audit MySQL configurations.
Feb 5 2019, 7:12 AM · Patch-For-Review, DBA
Marostegui triaged T214720: db1114 crashed (HW memory issues) as Medium priority.
Feb 5 2019, 6:53 AM · Patch-For-Review, DBA, ops-eqiad, SRE
Marostegui added a comment to T215107: Global rename of The_Photographer → Wilfredor: supervision needed.

When do you want to do this?

Feb 5 2019, 6:46 AM · MW-1.33-notes (1.33.0-wmf.18; 2019-02-19), Patch-For-Review, User-MarcoAurelio, DBA, Wikimedia-Site-requests