Marostegui (Manuel Aróstegui)
User

Today

  • Clear sailing ahead.

Tomorrow

  • Clear sailing ahead.

Saturday

  • Clear sailing ahead.

User Details

User Since
Sep 1 2016, 6:48 AM (98 w, 12 h)
Availability
Available
IRC Nick
marostegui
LDAP User
Marostegui
MediaWiki User
MArostegui (WMF) [ Global Accounts ]

TZ: UTC +1/+2

Recent Activity

Today

Marostegui added a comment to T200039: db1067 /srv usage is at 82%.

Actually it is totally fine to delete it, it has not been touched for years, ie:

-rw-rw---- 1  998 prometheus-node-exporter 1001M Mar 15  2017 db1067-bin.003293
Thu, Jul 19, 6:57 PM · DBA
Marostegui added a comment to T200039: db1067 /srv usage is at 82%.

Those two files can go away, they are leftovers. However it will only free up 12GB.
The main issue is:

root@db1067:/srv# du -sh *
1.5T	sqldata
1.2T	sqldata.s2.bak
12G	tmp
Thu, Jul 19, 6:56 PM · DBA
Marostegui moved T200035: DB backup restore skip empty databases from Triage to Blocked external/Not db team on the DBA board.
Thu, Jul 19, 6:03 PM · Upstream, DBA
Marostegui added a comment to T183585: Rack/cable/configure asw2-b-eqiad switch stack.

Aiming at doing the asw-b to asw2-b migration on July 31st (3pm UTC, 11am EDT, 8am PDT), 4h.
due to people's vacations, we might have to do that move in waves. What can't be moved on that day will move in a later window.

Thu, Jul 19, 2:55 PM · cloud-services-team, Cloud-VPS, ops-eqiad, Operations
Marostegui edited projects for T141255: Separate host lookup from the sql shell script, added: MediaWiki-Maintenance-scripts; removed DBA.
Thu, Jul 19, 2:15 PM · MediaWiki-Maintenance-scripts, Operations
Marostegui updated the task description for T187089: Fix WMF schemas to not break when comment store goes WRITE_NEW.
Thu, Jul 19, 2:12 PM · Core-Platform-Team, Patch-For-Review, Schema-change, DBA
Marostegui added a comment to T144010: Drop eu_touched in production.

s2 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1122
  • db1105
  • db1103
  • db1090
  • db1076
  • db1074
  • db1066
Thu, Jul 19, 8:33 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui added a comment to T199368: Convert UNIQUE INDEX to PK in Production.

s2 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1122
  • db1105
  • db1103
  • db1090
  • db1076
  • db1074
  • db1066
Thu, Jul 19, 8:33 AM · Patch-For-Review, DBA, Schema-change
Marostegui added a comment to T51190: Truncate SHA-1 indexes.

s2 eqiad progress

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1002
  • db1125
  • db1122
  • db1105
  • db1103
  • db1090
  • db1076
  • db1074
  • db1066
Thu, Jul 19, 8:33 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui updated the task description for T144010: Drop eu_touched in production.
Thu, Jul 19, 8:29 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui updated the task description for T199368: Convert UNIQUE INDEX to PK in Production.
Thu, Jul 19, 8:29 AM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T51190: Truncate SHA-1 indexes.
Thu, Jul 19, 8:29 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui closed T199790: Special:Log/Fanghong results in fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError" as Resolved.

The partitioning has finished and the query now finishes instantly, like in db1097. The explain is now fine:

+------+--------------------+---------------------+--------+------------------------------+-----------+---------+------------------------------------+-------+-------------+
| id   | select_type        | table               | type   | possible_keys                | key       | key_len | ref                                | rows  | Extra       |
+------+--------------------+---------------------+--------+------------------------------+-----------+---------+------------------------------------+-------+-------------+
|    1 | PRIMARY            | logging             | range  | type_time,user_time,times    | user_time | 20      | NULL                               | 11832 | Using where |
|    1 | PRIMARY            | comment_log_comment | eq_ref | PRIMARY                      | PRIMARY   | 8       | commonswiki.logging.log_comment_id |     1 |             |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                      | PRIMARY   | 4       | commonswiki.logging.log_user       |     1 |             |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | ct_log_id,change_tag_log_tag | ct_log_id | 5       | commonswiki.logging.log_id         |     1 | Using index |
+------+--------------------+---------------------+--------+------------------------------+-----------+---------+------------------------------------+-------+-------------+
Thu, Jul 19, 5:11 AM · Patch-For-Review, DBA, Commons, User-Josve05a, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-log-errors

Yesterday

Marostegui edited projects for T199008: sql enwik gives a poor error message when db doesn't exist, added: Core-Platform-Team; removed DBA.
Wed, Jul 18, 6:56 PM · Core-Platform-Team, MediaWiki-Maintenance-scripts, Operations
Marostegui added a comment to T198093: Add a safe failover for analytics1003.

This should be ok since we already have some dbproxies whitelisted in the analytics vlan's firewall, so it should be a matter of adding another one.

I don't think that is ok- we need at least 2 proxies and 2 dedicated databases on the analytics network (4 each if we go multi-dc).

Sorry my understanding when I read about the db misc cluster was that we'd have checked for room on existing db hosts in the production network, not the need for new hosts from Analytics (the latter use case is not possible for us, too much overhead plus we don't have budget for such an infrastructure for this fiscal).

Wed, Jul 18, 4:30 PM · User-Elukey, Analytics
Marostegui updated the task description for T89737: Make several mediawiki table fields unsigned ints on wmf databases.
Wed, Jul 18, 12:58 PM · Patch-For-Review, Blocked-on-schema-change, DBA
Marostegui closed T195193: Schema change for ct_tag_id field to change_tag as Resolved.
Wed, Jul 18, 11:03 AM · Patch-For-Review, Blocked-on-schema-change, Wikidata-Ministry-Of-Magic, MediaWiki-Database, MediaWiki-Change-tagging
Marostegui closed T195193: Schema change for ct_tag_id field to change_tag, a subtask of T193867: Create the change_tag_def table and add the ct_tag_id field to change_tag, as Resolved.
Wed, Jul 18, 11:03 AM · MW-1.32-release-notes (WMF-deploy-2018-05-22 (1.32.0-wmf.5)), Patch-For-Review, Schema-change, Wikidata-Ministry-Of-Magic, MediaWiki-Database, MediaWiki-Change-tagging, User-Ladsgroup
Marostegui added a comment to T195193: Schema change for ct_tag_id field to change_tag.

db1052 was failed over today and now got the schema change so this is all done.

Wed, Jul 18, 11:02 AM · Patch-For-Review, Blocked-on-schema-change, Wikidata-Ministry-Of-Magic, MediaWiki-Database, MediaWiki-Change-tagging
Marostegui moved T149077: Certain ApiQueryRecentChanges::run api query is too slow, slowing down dewiki from Triage to Blocked external/Not db team on the DBA board.
Wed, Jul 18, 10:04 AM · Patch-For-Review, Performance, MediaWiki-API, DBA
Marostegui moved T197531: Data model for dbconfig from Done to Blocked external/Not db team on the DBA board.
Wed, Jul 18, 10:04 AM · Patch-For-Review, MediaWiki-Configuration, Operations, DBA
Marostegui moved T197531: Data model for dbconfig from Triage to Done on the DBA board.
Wed, Jul 18, 10:04 AM · Patch-For-Review, MediaWiki-Configuration, Operations, DBA
Marostegui moved T199790: Special:Log/Fanghong results in fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError" from Triage to In progress on the DBA board.
Wed, Jul 18, 10:04 AM · Patch-For-Review, DBA, Commons, User-Josve05a, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-log-errors
Marostegui closed T192926: Schema change to drop archive.ar_text and archive.ar_flags as Resolved.

db1052 was failed over today and now got the schema change so this is all done.

Wed, Jul 18, 9:41 AM · Patch-For-Review, DBA, Multi-Content-Revisions, Schema-change
Marostegui closed T192926: Schema change to drop archive.ar_text and archive.ar_flags, a subtask of T33223: Remove old archive.ar_text/archive.ar_flags, as Resolved.
Wed, Jul 18, 9:41 AM · Multi-Content-Revisions (MCR-SDC Storage Layer - phase 1), MW-1.31-release-notes, MW-1.32-release-notes (WMF-deploy-2018-04-24 (1.32.0-wmf.1)), MediaWiki-Platform-Team (MWPT-Q4-Apr-Jun-2018), Schema-change, MediaWiki-Database
Marostegui updated the task description for T192926: Schema change to drop archive.ar_text and archive.ar_flags.
Wed, Jul 18, 9:41 AM · Patch-For-Review, DBA, Multi-Content-Revisions, Schema-change
Marostegui added a comment to T199790: Special:Log/Fanghong results in fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError".

I will be running this on db1103 to let it with the same schema as the rest of rc slaves in production across all the sections - I will also double check if we have more rc servers with this similar issue:

ALTER TABLE logging
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (log_id, log_user)
  PARTITION BY RANGE (log_user) (
  PARTITION p1 VALUES LESS THAN (1),
  PARTITION p10000 VALUES LESS THAN (10000),
  PARTITION p20000 VALUES LESS THAN (20000),
  PARTITION p30000 VALUES LESS THAN (30000),
  PARTITION p40000 VALUES LESS THAN (40000),
  PARTITION p50000 VALUES LESS THAN (50000),
  PARTITION p100000 VALUES LESS THAN (100000),
  PARTITION p150000 VALUES LESS THAN (150000),
  PARTITION p200000 VALUES LESS THAN (200000),
  PARTITION p300000 VALUES LESS THAN (300000),
  PARTITION p400000 VALUES LESS THAN (400000),
  PARTITION p500000 VALUES LESS THAN (500000),
  PARTITION p750000 VALUES LESS THAN (750000),
  PARTITION p1000000 VALUES LESS THAN (1000000),
  PARTITION pMAXVALUE VALUES LESS THAN MAXVALUE );
Wed, Jul 18, 9:40 AM · Patch-For-Review, DBA, Commons, User-Josve05a, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-log-errors
Marostegui closed T191316: Schema change to make archive.ar_rev_id NOT NULL as Resolved.

db1052 was failed over today and now got the schema change so this is all done.

Wed, Jul 18, 8:34 AM · Core-Platform-Team, Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T191316: Schema change to make archive.ar_rev_id NOT NULL.
Wed, Jul 18, 8:34 AM · Core-Platform-Team, Patch-For-Review, DBA, Schema-change
Marostegui added a comment to T197073: switchover es1014 to es1017.

Good for me!

Wed, Jul 18, 7:18 AM · Patch-For-Review, DBA
Marostegui claimed T199861: Decommission db1052.
Wed, Jul 18, 6:57 AM · Patch-For-Review, DBA
jcrespo awarded T197069: Failover db1052 (s1) db primary master a 100 token.
Wed, Jul 18, 6:43 AM · Patch-For-Review, DBA
Marostegui closed T197069: Failover db1052 (s1) db primary master, a subtask of T183585: Rack/cable/configure asw2-b-eqiad switch stack, as Resolved.
Wed, Jul 18, 6:43 AM · cloud-services-team, Cloud-VPS, ops-eqiad, Operations
Marostegui closed T197069: Failover db1052 (s1) db primary master as Resolved.
Wed, Jul 18, 6:43 AM · Patch-For-Review, DBA
Marostegui closed T197069: Failover db1052 (s1) db primary master, a subtask of T186320: Decommission db1051-db1060 (DBA tracking), as Resolved.
Wed, Jul 18, 6:43 AM · Patch-For-Review, DBA
Marostegui added a comment to T197069: Failover db1052 (s1) db primary master.

Resolving this as it has all be done - including the clean up tasks.

Wed, Jul 18, 6:42 AM · Patch-For-Review, DBA
Marostegui updated the task description for T197069: Failover db1052 (s1) db primary master.
Wed, Jul 18, 6:41 AM · Patch-For-Review, DBA
Marostegui added a comment to T199861: Decommission db1052.

I thought a bit about how to go over this, and given the importance and history of this host, this would be one proposal, see what you think about it:

  • Wait 1 week to make sure we are not going to fail back immediately
Wed, Jul 18, 6:40 AM · Patch-For-Review, DBA
Marostegui triaged T199861: Decommission db1052 as Normal priority.
Wed, Jul 18, 6:36 AM · Patch-For-Review, DBA
Marostegui added a comment to T197134: Announce 30 minutes read-only time for enwiki 18th July 06:00AM UTC.

This was smoothly done.
Read only times:

Wed, Jul 18, 6:15 AM · CommRel-Specialists-Support (Jul-Sep-2018), User-Johan
Marostegui added a comment to T197069: Failover db1052 (s1) db primary master.

This was smoothly done.
Read only times:

Wed, Jul 18, 6:15 AM · Patch-For-Review, DBA
Marostegui moved T196547: Extension:JADE scalability concerns due to creating a page per revision from Backlog to Blocked external/Not db team on the DBA board.
Wed, Jul 18, 5:43 AM · TechCom-RFC, DBA, Scoring-platform-team (Current), User-Joe, Operations, JADE
Marostegui added a comment to T198156: Server-side deletion of User:LorenzoMilano/sandbox.

@jcrespo Could you ask in ops@lists... if there's anyone that whishes to do it? Thanks.

Wed, Jul 18, 5:15 AM · MediaWiki-Database, Wikimedia-Site-requests
Marostegui added a comment to T199790: Special:Log/Fanghong results in fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError".

db1103 and db1097 are both using 10.1.31.
I will upgrade db1103 to 10.1.34 but I don't think that will help anyways - I guess we have to partition the table and then later decide how to overcome the future once log_user is dropped.

Wed, Jul 18, 4:52 AM · Patch-For-Review, DBA, Commons, User-Josve05a, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-log-errors

Tue, Jul 17

Marostegui added a comment to T199790: Special:Log/Fanghong results in fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError".
/*!50100 PARTITION BY RANGE (log_user)

What's going to happen there when we eventually drop log_user in favor of log_actor?

Tue, Jul 17, 9:01 PM · Patch-For-Review, DBA, Commons, User-Josve05a, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-log-errors
Marostegui added a project to T199790: Special:Log/Fanghong results in fatal exception of type "Wikimedia\Rdbms\DBQueryTimeoutError": DBA.

I have narrowed this to this query:

SELECT /* IndexPager::buildQueryInfo (LogPager) */  log_id,log_type,log_action,log_timestamp,log_namespace,log_title,log_params,log_deleted,user_id,user_name,user_editcount,COALESCE( comment_log_comment.comment_text, log_comment ) AS `log_comment_text`,comment_log_comment.comment_data AS `log_comment_data`,comment_log_comment.comment_id AS `log_comment_cid`,log_user,log_user_text,NULL AS `log_actor`,(SELECT  GROUP_CONCAT(ct_tag SEPARATOR ',')  FROM `change_tag`    WHERE ct_log_id=log_id  ) AS `ts_tags`  FROM `logging` LEFT JOIN `comment` `comment_log_comment` ON ((comment_log_comment.comment_id = log_comment_id)) LEFT JOIN `user` ON ((user_id=log_user))   WHERE (log_type NOT IN ('spamblacklist','titleblacklist','abusefilterprivatedetails','suppress')) AND ((log_user = 'redacted')) AND ((log_deleted & 4) = 0) AND (log_type != 'thanks') AND (log_type != 'patrol') AND (log_type != 'tag') AND (log_timestamp<'20180625020905')  ORDER BY log_timestamp DESC LIMIT 51;
Tue, Jul 17, 8:32 PM · Patch-For-Review, DBA, Commons, User-Josve05a, MediaWiki-Database, MediaWiki-Special-pages, Wikimedia-log-errors
Marostegui closed T199759: Degraded RAID on db2061 as Resolved.

All good - thank you!

root@db2061:~# hpssacli controller all show config
Tue, Jul 17, 4:32 PM · DBA, Operations, ops-codfw
Marostegui added a comment to T195228: db2064 crashed and totally broken - decommission it.

Or servers that we already decommissioned?

Tue, Jul 17, 2:58 PM · decommission, ops-codfw, Operations, DBA
Marostegui added a comment to T198987: Gather statistics about the backups on a database.

What's the file_date vs backup_date? backup_date is when the backup started and file_date when the file was last modified on the filesystem?

Tue, Jul 17, 12:53 PM · Patch-For-Review, DBA
Marostegui updated the task description for T51190: Truncate SHA-1 indexes.
Tue, Jul 17, 8:30 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui updated the task description for T199368: Convert UNIQUE INDEX to PK in Production.
Tue, Jul 17, 8:30 AM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T144010: Drop eu_touched in production.
Tue, Jul 17, 8:30 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui added a comment to T144010: Drop eu_touched in production.

s6 eqiad progress

Tue, Jul 17, 8:29 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui added a comment to T199368: Convert UNIQUE INDEX to PK in Production.

s6 eqiad progress

Tue, Jul 17, 8:29 AM · Patch-For-Review, DBA, Schema-change
Marostegui added a comment to T51190: Truncate SHA-1 indexes.

s6 eqiad progress

Tue, Jul 17, 8:29 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui updated the task description for T144010: Drop eu_touched in production.
Tue, Jul 17, 7:12 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui updated the task description for T199368: Convert UNIQUE INDEX to PK in Production.
Tue, Jul 17, 7:11 AM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T51190: Truncate SHA-1 indexes.
Tue, Jul 17, 7:11 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui updated the task description for T144010: Drop eu_touched in production.
Tue, Jul 17, 5:53 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui updated the task description for T199368: Convert UNIQUE INDEX to PK in Production.
Tue, Jul 17, 5:53 AM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T51190: Truncate SHA-1 indexes.
Tue, Jul 17, 5:53 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui moved T199759: Degraded RAID on db2061 from Triage to In progress on the DBA board.
Tue, Jul 17, 4:39 AM · DBA, Operations, ops-codfw
Marostegui assigned T199759: Degraded RAID on db2061 to Papaul.

Can we get a replacement?
Thanks!

Tue, Jul 17, 4:39 AM · DBA, Operations, ops-codfw

Mon, Jul 16

Marostegui updated the task description for T183585: Rack/cable/configure asw2-b-eqiad switch stack.
Mon, Jul 16, 3:52 PM · cloud-services-team, Cloud-VPS, ops-eqiad, Operations
Marostegui updated subscribers of T183585: Rack/cable/configure asw2-b-eqiad switch stack.

@ayounsi regarding databases

Mon, Jul 16, 3:45 PM · cloud-services-team, Cloud-VPS, ops-eqiad, Operations
Marostegui claimed T197069: Failover db1052 (s1) db primary master.
Mon, Jul 16, 3:44 PM · Patch-For-Review, DBA
Marostegui updated the task description for T144010: Drop eu_touched in production.
Mon, Jul 16, 1:48 PM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui updated the task description for T199368: Convert UNIQUE INDEX to PK in Production.
Mon, Jul 16, 1:47 PM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T51190: Truncate SHA-1 indexes.
Mon, Jul 16, 1:47 PM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui added a comment to T197134: Announce 30 minutes read-only time for enwiki 18th July 06:00AM UTC.

It will show: Scheduled maintenance on enwiki from 06:00-06:30 UTC..

That is ambiguous. Drop the "from", or write "to" instead of the dash.

Mon, Jul 16, 9:59 AM · CommRel-Specialists-Support (Jul-Sep-2018), User-Johan
Marostegui added a comment to T197134: Announce 30 minutes read-only time for enwiki 18th July 06:00AM UTC.

Will $wgReadOnly show a message while read-only?

(See also my comments/questions about SiteNotice/CentralNotice at the en:WP:VPT topic.)

Mon, Jul 16, 9:24 AM · CommRel-Specialists-Support (Jul-Sep-2018), User-Johan
Marostegui reopened T199056: db1069 bad disk as "Open".

This has happened again, same disk, disk #0, can we get another one?
Please ping me before replacing it so I can manually put it offline

Enclosure Device ID: 32
Slot Number: 0
Drive's position: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 1
Device Id: 0
WWN: 5000C500479122EC
Sequence Number: 24
Media Error Count: 1097
Other Error Count: 0
Predictive Failure Count: 3
Last Predictive Failure Event Seq Number: 5211
PD Type: SAS
Mon, Jul 16, 7:43 AM · Operations, ops-eqiad, DBA
Marostegui added a comment to T51190: Truncate SHA-1 indexes.

This is how the tables look like after the alters (same as they were basically):

       Table: oldimage
Create Table: CREATE TABLE `oldimage` (
  `oi_name` varbinary(255) NOT NULL DEFAULT '',
  `oi_archive_name` varbinary(255) NOT NULL DEFAULT '',
  `oi_size` int(8) unsigned NOT NULL DEFAULT '0',
  `oi_width` int(5) NOT NULL DEFAULT '0',
  `oi_height` int(5) NOT NULL DEFAULT '0',
  `oi_bits` int(3) NOT NULL DEFAULT '0',
  `oi_description` varbinary(255) NOT NULL DEFAULT '',
  `oi_description_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `oi_user` int(5) unsigned NOT NULL DEFAULT '0',
  `oi_user_text` varbinary(255) NOT NULL DEFAULT '',
  `oi_actor` bigint(20) unsigned NOT NULL DEFAULT '0',
  `oi_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `oi_metadata` mediumblob NOT NULL,
  `oi_media_type` enum('UNKNOWN','BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','OFFICE','TEXT','EXECUTABLE','ARCHIVE','3D') DEFAULT NULL,
  `oi_major_mime` enum('unknown','application','audio','image','text','video','message','model','multipart') NOT NULL DEFAULT 'unknown',
  `oi_minor_mime` varbinary(100) NOT NULL DEFAULT 'unknown',
  `oi_deleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `oi_sha1` varbinary(32) NOT NULL DEFAULT '',
  KEY `oi_name_timestamp` (`oi_name`,`oi_timestamp`),
  KEY `oi_name_archive_name` (`oi_name`,`oi_archive_name`(14)),
  KEY `oi_usertext_timestamp` (`oi_user_text`,`oi_timestamp`),
  KEY `oi_actor_timestamp` (`oi_actor`,`oi_timestamp`),
  KEY `oi_sha1` (`oi_sha1`(10))
) ENGINE=InnoDB DEFAULT CHARSET=binary
*************************** 1. row ***************************
       Table: image
Create Table: CREATE TABLE `image` (
  `img_name` varbinary(255) NOT NULL DEFAULT '',
  `img_size` int(8) unsigned NOT NULL DEFAULT '0',
  `img_width` int(5) NOT NULL DEFAULT '0',
  `img_height` int(5) NOT NULL DEFAULT '0',
  `img_metadata` mediumblob NOT NULL,
  `img_bits` int(3) NOT NULL DEFAULT '0',
  `img_media_type` enum('UNKNOWN','BITMAP','DRAWING','AUDIO','VIDEO','MULTIMEDIA','OFFICE','TEXT','EXECUTABLE','ARCHIVE','3D') DEFAULT NULL,
  `img_major_mime` enum('unknown','application','audio','image','text','video','message','model','multipart') NOT NULL DEFAULT 'unknown',
  `img_minor_mime` varbinary(100) NOT NULL DEFAULT 'unknown',
  `img_description` varbinary(255) NOT NULL DEFAULT '',
  `img_description_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `img_user` int(5) unsigned NOT NULL DEFAULT '0',
  `img_user_text` varbinary(255) NOT NULL DEFAULT '',
  `img_actor` bigint(20) unsigned NOT NULL DEFAULT '0',
  `img_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `img_sha1` varbinary(32) NOT NULL DEFAULT '',
  `img_deleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`img_name`),
  KEY `img_size` (`img_size`),
  KEY `img_timestamp` (`img_timestamp`),
  KEY `img_usertext_timestamp` (`img_user_text`,`img_timestamp`),
  KEY `img_media_mime` (`img_media_type`,`img_major_mime`,`img_minor_mime`),
  KEY `img_user_timestamp` (`img_user`,`img_timestamp`),
  KEY `img_actor_timestamp` (`img_actor`,`img_timestamp`),
  KEY `img_sha1` (`img_sha1`(10))
) ENGINE=InnoDB DEFAULT CHARSET=binary
Mon, Jul 16, 7:04 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui added a comment to T199368: Convert UNIQUE INDEX to PK in Production.

These are how the tables look like after the alters:

*************************** 1. row ***************************
       Table: interwiki
Create Table: CREATE TABLE `interwiki` (
  `iw_prefix` varbinary(32) NOT NULL,
  `iw_url` varbinary(127) NOT NULL,
  `iw_local` tinyint(1) NOT NULL DEFAULT '0',
  `iw_trans` tinyint(1) NOT NULL DEFAULT '0',
  `iw_api` blob NOT NULL,
  `iw_wikiid` varbinary(64) NOT NULL,
  PRIMARY KEY (`iw_prefix`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
*************************** 1. row ***************************
       Table: page_props
Create Table: CREATE TABLE `page_props` (
  `pp_page` int(11) NOT NULL DEFAULT '0',
  `pp_propname` varbinary(60) NOT NULL DEFAULT '',
  `pp_value` blob NOT NULL,
  `pp_sortkey` float DEFAULT NULL,
  PRIMARY KEY (`pp_page`,`pp_propname`),
  UNIQUE KEY `pp_propname_page` (`pp_propname`,`pp_page`),
  UNIQUE KEY `pp_propname_sortkey_page` (`pp_propname`,`pp_sortkey`,`pp_page`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
*************************** 1. row ***************************
       Table: protected_titles
Create Table: CREATE TABLE `protected_titles` (
  `pt_namespace` int(11) NOT NULL DEFAULT '0',
  `pt_title` varbinary(255) NOT NULL DEFAULT '',
  `pt_user` int(10) unsigned NOT NULL DEFAULT '0',
  `pt_reason` varbinary(255) DEFAULT '',
  `pt_reason_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `pt_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `pt_expiry` varbinary(14) NOT NULL DEFAULT '',
  `pt_create_perm` varbinary(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`pt_namespace`,`pt_title`),
  KEY `pt_timestamp` (`pt_timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
*************************** 1. row ***************************
       Table: site_identifiers
Create Table: CREATE TABLE `site_identifiers` (
  `si_site` int(10) unsigned NOT NULL,
  `si_type` varbinary(32) NOT NULL,
  `si_key` varbinary(32) NOT NULL,
  PRIMARY KEY (`si_type`,`si_key`),
  KEY `site_ids_site` (`si_site`),
  KEY `site_ids_key` (`si_key`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
Mon, Jul 16, 7:03 AM · Patch-For-Review, DBA, Schema-change
Marostegui added a comment to T199368: Convert UNIQUE INDEX to PK in Production.

As always with the first iteration of a schema change - I will do them host by host in codfw and once we are sure all is fine, the passive DC will be done with replication.
So this is s5 progress for now:

Mon, Jul 16, 5:15 AM · Patch-For-Review, DBA, Schema-change
Marostegui added a comment to T144010: Drop eu_touched in production.

As always with the first iteration of a schema change - I will do them host by host in codfw and once we are sure all is fine, the passive DC will be done with replication.
So this is s5 progress for now:

Mon, Jul 16, 5:15 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui added a comment to T51190: Truncate SHA-1 indexes.

As always with the first iteration of a schema change - I will do them host by host in codfw and once we are sure all is fine, the passive DC will be done with replication.
So this is s5 progress for now:

Mon, Jul 16, 5:15 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui updated the task description for T199368: Convert UNIQUE INDEX to PK in Production.
Mon, Jul 16, 5:13 AM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T51190: Truncate SHA-1 indexes.
Mon, Jul 16, 5:13 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui updated the task description for T144010: Drop eu_touched in production.
Mon, Jul 16, 5:13 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata

Sun, Jul 15

Marostegui assigned T199636: Degraded RAID on db1072 to Cmjohnson.
Sun, Jul 15, 5:40 AM · DBA, ops-eqiad, Operations
Marostegui triaged T199636: Degraded RAID on db1072 as Normal priority.
Sun, Jul 15, 5:40 AM · DBA, ops-eqiad, Operations
Marostegui added a project to T199636: Degraded RAID on db1072: DBA.

Can we this disk replaced?
Thanks!

Sun, Jul 15, 5:40 AM · DBA, ops-eqiad, Operations

Sat, Jul 14

Marostegui closed T199614: dbstore1002 MySQL crashed and got restarted as Resolved.
Sat, Jul 14, 12:07 PM · Analytics
Marostegui added a comment to T199614: dbstore1002 MySQL crashed and got restarted.
nnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
2018-07-14 11:25:57 7fdf5d3fd700  InnoDB: Assertion failure in thread 140597318899456 in file srv0srv.cc line 2200
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
180714 11:25:57 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Sat, Jul 14, 12:05 PM · Analytics
Marostegui created T199614: dbstore1002 MySQL crashed and got restarted.
Sat, Jul 14, 12:00 PM · Analytics
Marostegui moved T199599: Prepare and check storage layer for zhwikiversity from Triage to Blocked external/Not db team on the DBA board.

Let us know when this is created so we can sanitize it

Sat, Jul 14, 8:08 AM · cloud-services-team (Kanban), DBA, Cloud-Services, Chinese-Sites, User-Urbanecm

Fri, Jul 13

Marostegui added a comment to T144010: Drop eu_touched in production.

Similar to T144010#2737391 I have renamed the column on db1110 in eqiad, which is a host that receives reads. We'll see if some errors arise.

root@db1110[dewiki]> show create table wbc_entity_usage;
+------------------+--------------------------------------------------------------------------------------------------------------------
| Table            | Create Table
+------------------+--------------------------------------------------------------------------------------------------------------------
| wbc_entity_usage | CREATE TABLE `wbc_entity_usage` (
  `eu_row_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `eu_entity_id` varbinary(255) NOT NULL,
  `eu_aspect` varbinary(37) NOT NULL,
  `eu_page_id` int(11) NOT NULL,
  `TO_DROP_eu_touched` binary(14) NOT NULL DEFAULT '\0\0\0\0\0\0\0\0\0\0\0\0\0\0',
  PRIMARY KEY (`eu_row_id`),
  UNIQUE KEY `eu_entity_id` (`eu_entity_id`,`eu_aspect`,`eu_page_id`),
  KEY `eu_page_id` (`eu_page_id`,`eu_entity_id`)
) ENGINE=InnoDB AUTO_INCREMENT=32592567 DEFAULT CHARSET=binary |
Fri, Jul 13, 5:08 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui added a comment to T192092: setup replacements for maintenance_server (terbium, wasat) on Stretch.

I have seen this error (one in the last 8 hours):

cli_argv	       	/srv/mediawiki/multiversion/MWScript.php maintenance/cleanupUploadStash.php --wiki=labtestwiki
t  db_name	       	labtestwiki
t  db_server	       	10.64.16.79
t  db_user	       	wikiadmin
t  error	       	Access denied for user 'wikiadmin'@'%' to database 'labtestwiki'
t  host	       	mwmaint1001
t  level	       	ERROR
t  message	       	Error connecting to 10.64.16.79: Access denied for user 'wikiadmin'@'%' to database 'labtestwiki'
Fri, Jul 13, 4:54 AM · Patch-For-Review, Operations

Thu, Jul 12

Marostegui updated the task description for T197069: Failover db1052 (s1) db primary master.
Thu, Jul 12, 7:27 AM · Patch-For-Review, DBA
Marostegui added a comment to T197069: Failover db1052 (s1) db primary master.

So, I have restarted db1083 with binlog format = STATEMENT.
This host is ready to be the candidate master once db1067 is the new master.

Thu, Jul 12, 7:27 AM · Patch-For-Review, DBA
Marostegui updated the task description for T197069: Failover db1052 (s1) db primary master.
Thu, Jul 12, 7:06 AM · Patch-For-Review, DBA
Marostegui moved T144010: Drop eu_touched in production from Backlog to In progress on the Blocked-on-schema-change board.
Thu, Jul 12, 5:16 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui claimed T144010: Drop eu_touched in production.

I will go ahead and start deploying this change. It has been renamed for more than a year without any issues (so nothing is writing to it). I will deploy this change to a host in eqiad to make sure there are also no reads touching it.

Thu, Jul 12, 5:15 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change, Wikidata
Marostegui moved T51190: Truncate SHA-1 indexes from Backlog to In progress on the Blocked-on-schema-change board.
Thu, Jul 12, 5:13 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui claimed T51190: Truncate SHA-1 indexes.
Thu, Jul 12, 5:12 AM · Patch-For-Review, Blocked-on-schema-change, DBA, Schema-change
Marostegui closed T187521: Optimize recentchanges and wbc_entity_usage table across wikis as Resolved.

The lists of pending wikis have been done

Thu, Jul 12, 5:10 AM · Wikidata, DBA
Marostegui updated the task description for T187521: Optimize recentchanges and wbc_entity_usage table across wikis.
Thu, Jul 12, 5:10 AM · Wikidata, DBA
Marostegui claimed T199368: Convert UNIQUE INDEX to PK in Production.
Thu, Jul 12, 5:09 AM · Patch-For-Review, DBA, Schema-change
Marostegui updated the task description for T59176: ApiQueryExtLinksUsage::run query has crazy limit.
Thu, Jul 12, 5:05 AM · MW-1.32-release-notes (WMF-deploy-2018-05-22 (1.32.0-wmf.5)), MW-1.29-release-notes, Patch-For-Review, Schema-change, DBA, MediaWiki-API, Performance, MediaWiki-Database