
Kormat (Stevie Shirley (she/her))
SRE

User Details

User Since
Apr 14 2020, 7:57 AM (61 w, 1 d)
Availability
Available
LDAP User
Kormat
MediaWiki User
SShirley (WMF)

Recent Activity

Today

Kormat added a comment to T284650: Read-only window needed for s3 (~900 wikis).

Hey folks, is this on track to happen? Anything else you need from our side?

Wed, Jun 16, 12:20 PM · CommRel-Specialists-Support (Apr-Jun-2021)
Kormat edited P16418 (An Untitled Masterwork).
Wed, Jun 16, 9:54 AM
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

TODO:

Wed, Jun 16, 9:25 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team
Kormat closed T284819: Deploy wmfmariadbpy 0.7.1 as Resolved.

All done.

Wed, Jun 16, 9:19 AM · DBA
Kormat updated the task description for T284819: Deploy wmfmariadbpy 0.7.1.
Wed, Jun 16, 9:19 AM · DBA
Kormat created T285034: debdeploy does not support bullseye.
Wed, Jun 16, 9:11 AM · SRE
Kormat updated the task description for T284819: Deploy wmfmariadbpy 0.7.1.
Wed, Jun 16, 9:08 AM · DBA
Kormat updated the task description for T284819: Deploy wmfmariadbpy 0.7.1.
Wed, Jun 16, 9:07 AM · DBA
Kormat updated the task description for T284819: Deploy wmfmariadbpy 0.7.1.
Wed, Jun 16, 9:05 AM · DBA
Kormat updated the task description for T284819: Deploy wmfmariadbpy 0.7.1.
Wed, Jun 16, 9:04 AM · DBA

Mon, Jun 14

Kormat edited P16418 (An Untitled Masterwork).
Mon, Jun 14, 9:07 AM

Sat, Jun 12

Kormat closed T284858: Investigate why heartbeat being stopped only causes alerts after ~24h as Invalid.

Ohh, right, of course. Never mind then :)

Sat, Jun 12, 1:52 PM · DBA
Kormat created T284858: Investigate why heartbeat being stopped only causes alerts after ~24h.
Sat, Jun 12, 1:45 PM · DBA

Fri, Jun 11

Krinkle awarded P16418 (An Untitled Masterwork) a Love token.
Fri, Jun 11, 4:10 PM
Kormat added a comment to T280605: Reduce parser cache retention temporarily for DiscussionTools.

Done from our side @ppelberg. We'll monitor over the next two weeks and report to the next task.

Fri, Jun 11, 2:06 PM · MW-1.37-notes (1.37.0-wmf.7; 2021-05-25), Editing-team (FY2020-21 Kanban Board), DBA, Patch-For-Review, Performance-Team, DiscussionTools
Kormat updated the task description for T284819: Deploy wmfmariadbpy 0.7.1.
Fri, Jun 11, 1:01 PM · DBA
Kormat created T284819: Deploy wmfmariadbpy 0.7.1.
Fri, Jun 11, 1:01 PM · DBA
Kormat committed rOSMD03f119e3da17: Prepare for 0.7.1 release. (authored by Kormat).
Prepare for 0.7.1 release.
Fri, Jun 11, 12:30 PM
Kormat edited P16418 (An Untitled Masterwork).
Fri, Jun 11, 10:53 AM
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Optimize of pc1009 (and replica) finished.

Fri, Jun 11, 10:28 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team

Thu, Jun 10

Kormat archived P16420 (An Untitled Masterwork).
Thu, Jun 10, 4:20 PM
Kormat created P16420 (An Untitled Masterwork).
Thu, Jun 10, 3:49 PM
Kormat added a comment to P16418 (An Untitled Masterwork).

Jun 8th: ~28h
Jun 9th: ~15.5h
Jun 10th: still running

Thu, Jun 10, 12:48 PM
Kormat created P16418 (An Untitled Masterwork).
Thu, Jun 10, 12:47 PM

Wed, Jun 9

Kormat added a comment to T284650: Read-only window needed for s3 (~900 wikis).

We can reuse former campaigns we created, if they are still accurate.

Can we know if any changes have been made to the s3 cluster since 2019-09-24? Any wikis added or removed from the list?

Wed, Jun 9, 6:07 PM · CommRel-Specialists-Support (Apr-Jun-2021)
Kormat added a comment to T284650: Read-only window needed for s3 (~900 wikis).

Yes please to both of those :)

Wed, Jun 9, 4:08 PM · CommRel-Specialists-Support (Apr-Jun-2021)
Kormat added a comment to T284529: Switchover s5 from db1100 to db1130.

Procedure looks good :)

Wed, Jun 9, 1:49 PM · Patch-For-Review, DBA
Kormat updated the task description for T284648: Switchover s3 from db1123 to db1157.
Wed, Jun 9, 1:18 PM · Patch-For-Review, DBA
Kormat created T284650: Read-only window needed for s3 (~900 wikis).
Wed, Jun 9, 1:17 PM · CommRel-Specialists-Support (Apr-Jun-2021)
Kormat updated the task description for T284648: Switchover s3 from db1123 to db1157.
Wed, Jun 9, 1:05 PM · Patch-For-Review, DBA
Kormat updated the task description for T284648: Switchover s3 from db1123 to db1157.
Wed, Jun 9, 1:04 PM · Patch-For-Review, DBA
Kormat updated the task description for T284648: Switchover s3 from db1123 to db1157.
Wed, Jun 9, 1:04 PM · Patch-For-Review, DBA
Kormat created T284648: Switchover s3 from db1123 to db1157.
Wed, Jun 9, 1:03 PM · Patch-For-Review, DBA
Kormat archived P16338 (An Untitled Masterwork).
Wed, Jun 9, 10:22 AM
Kormat created P16338 (An Untitled Masterwork).
Wed, Jun 9, 10:07 AM

Tue, Jun 8

Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Optimize of pc1008 (and replica) finished.

Tue, Jun 8, 12:18 PM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team
Kormat updated the task description for T283131: Upgrade s3 to Debian Buster and MariaDB 10.4.
Tue, Jun 8, 12:10 PM · Patch-For-Review, DBA
Kormat added a comment to T283131: Upgrade s3 to Debian Buster and MariaDB 10.4.

db1157 had a clean mysqlcheck run, repooling it now.

Tue, Jun 8, 10:54 AM · Patch-For-Review, DBA

Mon, Jun 7

Kormat added a comment to T283131: Upgrade s3 to Debian Buster and MariaDB 10.4.

db1157 upgraded to buster. Running mysqlcheck now.

Mon, Jun 7, 11:56 AM · Patch-For-Review, DBA
Kormat updated the task description for T283131: Upgrade s3 to Debian Buster and MariaDB 10.4.
Mon, Jun 7, 9:55 AM · Patch-For-Review, DBA
Kormat added a comment to T284128: Re-image (rename) dbstore1006 into db1125.

Re-labelling not necessary, as it wasn't re-labelled away from db1125 in the first place: T283300

Mon, Jun 7, 9:47 AM · DBA
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Run finished at 2021-06-05T14:30. Running optimize over all pc* tables now.

Mon, Jun 7, 9:14 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team

Fri, Jun 4

Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

15:39:55 <Krinkle> kormat: it's running now, tee'ed to /home/krinkle/purge_parsercache_now_pc1008.log

Fri, Jun 4, 1:40 PM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team
Kormat added a comment to T283131: Upgrade s3 to Debian Buster and MariaDB 10.4.

s6 hasn't given any issues, so maybe we can start working on this next week (after 3 weeks since we switched s6) and attempt to do the switchover the 17th?
@Kormat thoughts?

Fri, Jun 4, 10:25 AM · Patch-For-Review, DBA

Thu, Jun 3

Kormat closed T284128: Re-image (rename) dbstore1006 into db1125 as Resolved.

It's back in tendril+zarcillo, and is a replica of db1124.

Thu, Jun 3, 12:44 PM · DBA
Kormat closed T284128: Re-image (rename) dbstore1006 into db1125, a subtask of T283125: dbstore1004 85% disk space used., as Resolved.
Thu, Jun 3, 12:44 PM · Patch-For-Review, Analytics-Clusters, Analytics-Kanban, DBA
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

pc1010 is now pc2 primary, and is no longer replicating from pc1008:

Thu, Jun 3, 10:17 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team

Wed, Jun 2

Kormat added a comment to T284128: Re-image (rename) dbstore1006 into db1125.

Current status:

  • db1125 has been renamed, wiped, and reimaged
  • It still needs to be re-added to tendril/zarcillo, and have an s4 snapshot deployed on it.
Wed, Jun 2, 1:37 PM · DBA
Kormat claimed T284128: Re-image (rename) dbstore1006 into db1125.
Wed, Jun 2, 8:40 AM · DBA

Tue, Jun 1

Kormat committed rOSMD0ebce094429d: db-replication-tree: Display circular replication reasonably. (authored by Kormat).
db-replication-tree: Display circular replication reasonably.
Tue, Jun 1, 12:55 PM
Kormat added a comment to F34477128: image.png.

Example for https://gerrit.wikimedia.org/r/c/operations/software/wmfmariadbpy/+/696454

Tue, Jun 1, 12:42 PM
Kormat archived P16231 (An Untitled Masterwork).
Tue, Jun 1, 12:41 PM
Kormat added a comment to P16231 (An Untitled Masterwork).

Tue, Jun 1, 12:40 PM
Kormat closed T283793: db2094:3318 (sanitarium on codfw) needs recloning as Resolved.

Ah yes, I misunderstood you. Yes, indeed, that's why we run check_private_data after data sanitization on new wikis, so we can also get those private tables deleted.

Tue, Jun 1, 12:10 PM · DBA
Kormat added a comment to T283793: db2094:3318 (sanitarium on codfw) needs recloning.

Normally what we do is: redact_sanitarium.sh -d wikidatawiki -S socket_path | mysql -S socket_path wikidatawiki

Tue, Jun 1, 12:01 PM · DBA
Kormat added a comment to T283793: db2094:3318 (sanitarium on codfw) needs recloning.

See email - s8 reported some tables that need to be dropped

Tue, Jun 1, 11:56 AM · DBA

Thu, May 27

Kormat added a comment to T283793: db2094:3318 (sanitarium on codfw) needs recloning.

redact_sanitarium.sh completed, and a quick check showed it had been successful.

Thu, May 27, 4:17 PM · DBA
Kormat added a comment to T283793: db2094:3318 (sanitarium on codfw) needs recloning.

Status:

  • Data copy from db2082 completed.
  • mysql_upgrade ran
  • redact_sanitarium.sh currently running.
Thu, May 27, 2:30 PM · DBA
Kormat added a comment to P16231 (An Untitled Masterwork).

Example for https://gerrit.wikimedia.org/r/c/operations/software/wmfmariadbpy/+/696454

Thu, May 27, 1:48 PM
Kormat created P16231 (An Untitled Masterwork).
Thu, May 27, 1:46 PM
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Current status:

  • pc1 is repooled and back in service.
  • pc1010 is now in pc2, and replicating from pc1008. This means it will have at least _some_ relevant entries when it becomes pc2 primary next week.
Thu, May 27, 1:23 PM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team
Kormat added a comment to T283793: db2094:3318 (sanitarium on codfw) needs recloning.

Running:
sudo transfer.py --type file --no-compress --no-encrypt --no-checksum db2082.codfw.wmnet:/srv/sqldata db2094.codfw.wmnet:/srv/sqldata.s8

Thu, May 27, 10:41 AM · DBA
Kormat added a comment to T283793: db2094:3318 (sanitarium on codfw) needs recloning.

db2082 is db2094:s8's master:

root@db2082.codfw.wmnet[(none)]> stop slave;
Query OK, 0 rows affected (0.036 sec)
Thu, May 27, 10:33 AM · DBA
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Optimize of pc1007 (and replicas) finished.

Thu, May 27, 9:42 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team
Kormat placed T283093: Schema change for making cuc_id in cu_changes unsigned up for grabs.
Thu, May 27, 9:30 AM · DBA, Blocked-on-schema-change
Kormat placed T283499: Schema change for renaming page_timestamp index on revision table to rev_page_timestamp up for grabs.
Thu, May 27, 9:30 AM · DBA, Blocked-on-schema-change
Kormat claimed T283793: db2094:3318 (sanitarium on codfw) needs recloning.
Thu, May 27, 8:21 AM · DBA

Wed, May 26

Kormat closed T280751: Upgrade s6 to Debian Buster and MariaDB 10.4 as Resolved.

🎉

Wed, May 26, 8:47 AM · Patch-For-Review, DBA
Kormat closed T280751: Upgrade s6 to Debian Buster and MariaDB 10.4, a subtask of T250666: Upgrade WMF database-and-backup-related hosts to buster, as Resolved.
Wed, May 26, 8:47 AM · Data-Persistence, Patch-For-Review, Epic
Kormat added a comment to T283580: Data Persistence IRC channels updates.

We don't receive gerrit updates on the channel, do we?

Wed, May 26, 8:33 AM · Data-Persistence-Misc
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

The purge has finished as of 2021-05-26T06:00Z. I'll start the optimize process now.

Wed, May 26, 8:08 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team

Tue, May 25

Kormat added a comment to T283239: db-replication-tree doesn't support circular replication.

I wouldn't say it's very _urgent_, but it would definitely be nice to have it done before the dc switchover preparations, for sanity-checking that circular replication is set up correctly.

Tue, May 25, 11:55 AM · DBA
Kormat added a comment to T283580: Data Persistence IRC channels updates.

I'd vote for a non-'bot'-specific channel name (so -bulk or -firehose or similar). Other than that, LGTM.

Tue, May 25, 11:51 AM · Data-Persistence-Misc
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Current status:

  • pc1010 is now the primary for pc1
  • I've run stop slave on pc1010, so it no longer replicates from pc1007
  • I've created a downtime for 7 days for pc[2007,2010].codfw.wmnet,pc1007.eqiad.wmnet
Tue, May 25, 9:06 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team

Fri, May 21

Kormat updated the title for P16127 Time for another one of my patented shell scripts from Time for another one of my pattented shell scripts to Time for another one of my patented shell scripts.
Fri, May 21, 12:00 PM
Kormat added a comment to T283228: Deploy wmfmariadbpy 0.7.

For posterity, here's the script i used for the heartbeat changes:

#!/bin/bash
fqdn="${1:?}"

sudo -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -t -o ConnectTimeout=3 root@$fqdn 'set -ex;
ps -wwwf `pgrep -P 1 -f pt-heartbeat-wikimedia`;
systemctl is-active pt-heartbeat-wikimedia && exit 0
pkill -P 1 -f pt-heartbeat-wikimedia;
systemctl start pt-heartbeat-wikimedia;
sleep 2;
systemctl is-active pt-heartbeat-wikimedia'
sudo -H mysql.py -h $fqdn heartbeat -e 'select TIMEDIFF(UTC_TIMESTAMP(6), ts) from heartbeat'

Fri, May 21, 12:00 PM · DBA
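
Editorial illustration (not part of the original comment): the last line of the script above checks heartbeat freshness by subtracting the stored ts value from the current UTC time. A minimal Python sketch of the same arithmetic follows. The function name and the ISO-8601 timestamp format are assumptions made for the example, not details taken from pt-heartbeat or wmfmariadbpy.

```python
from datetime import datetime, timezone


def heartbeat_lag_seconds(ts: str, now: datetime) -> float:
    """Return the seconds elapsed between a heartbeat timestamp and `now`.

    `ts` is assumed (hypothetically) to be an ISO-8601 string in UTC,
    e.g. '2021-05-21T12:00:00.500000'; the real column format may differ.
    """
    # Treat the naive timestamp as UTC, mirroring UTC_TIMESTAMP(6) in the query.
    beat = datetime.fromisoformat(ts).replace(tzinfo=timezone.utc)
    return (now - beat).total_seconds()


if __name__ == "__main__":
    now = datetime(2021, 5, 21, 12, 0, 2, tzinfo=timezone.utc)
    print(heartbeat_lag_seconds("2021-05-21T12:00:00.500000", now))  # 1.5
```

A small value means the heartbeat row is being updated; a steadily growing value suggests the heartbeat writer has stopped or replication is lagging.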
Kormat added a comment to T252528: wmf-auto-reinstall fails on hosts that run pt-heartbeat.

This does mean that pt-heartbeat-wikimedia needs to be started manually after a boot, however.

@Kormat is this captured somewhere in documentation?

Fri, May 21, 11:56 AM · DBA, SRE
Kormat added a comment to T276589: migrate services from cumin2001 to cumin2002.

The grant for cumin2002 should now be fully deployed.

Fri, May 21, 10:17 AM · Patch-For-Review, SRE
Kormat added a comment to T282761: purgeParserCache.php should not take over 24 hours for its daily run.

Optimize of pc1010 finished.

Fri, May 21, 8:53 AM · Patch-For-Review, Parsoid (Tracking), MediaWiki-Parser, DBA, Performance-Team
Kormat closed T272954: Fix db-switchover update zarcillo part, a subtask of T271427: Switchover s4 (commonswiki) from db1081 to db1138, as Resolved.
Fri, May 21, 8:48 AM · DBA
Kormat closed T272954: Fix db-switchover update zarcillo part as Resolved.

Yes indeed 🎉

Fri, May 21, 8:48 AM · DBA

Thu, May 20

Kormat closed T252528: wmf-auto-reinstall fails on hosts that run pt-heartbeat as Resolved.

This is now fixed. Puppet will no longer start/stop heartbeat. That is managed by db-switchover when changing masters. This does mean that pt-heartbeat-wikimedia needs to be started manually after a boot, however.

Thu, May 20, 3:08 PM · DBA, SRE
Kormat closed T283228: Deploy wmfmariadbpy 0.7 as Resolved.

Deployment complete:

kormat@cumin1001:~(0:0)$ sudo debdeploy deploy -u 2021-05-20-wmfmariadbpy.yaml -Q C:wmfmariadbpy
Rolling out wmfmariadbpy:
Non-daemon update, no service restart needed
Thu, May 20, 2:32 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 2:31 PM · DBA
Kormat added a comment to T283228: Deploy wmfmariadbpy 0.7.

pt-heartbeat-wikimedia fails to start on db2093 with:

Thu, May 20, 2:09 PM · DBA
Kormat added a comment to T283228: Deploy wmfmariadbpy 0.7.

Heartbeat restarted on all primaries.

Thu, May 20, 1:48 PM · DBA
Kormat edited P16127 Time for another one of my patented shell scripts.
Thu, May 20, 1:46 PM
Kormat created P16127 Time for another one of my patented shell scripts.
Thu, May 20, 1:34 PM
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 1:02 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 1:01 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 1:00 PM · DBA
Kormat added a comment to T283228: Deploy wmfmariadbpy 0.7.

pt-heartbeat-wikimedia fails to start on db2093 with:

DBD::mysql::st execute failed: Cannot execute statement: impossible to write to binary log since BINLOG_FORMAT = STATEMENT and at least one table uses a storage engine limited to row-based logging. InnoDB is limited to row-logging when transaction isolation level is READ COMMITTED or READ UNCOMMITTED.

This is due to the unusual config of the dbinventory section.

Thu, May 20, 12:58 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 12:49 PM · DBA
Kormat triaged T283239: db-replication-tree doesn't support circular replication as Medium priority.
Thu, May 20, 12:44 PM · DBA
Kormat triaged T283228: Deploy wmfmariadbpy 0.7 as Medium priority.
Thu, May 20, 12:44 PM · DBA
Kormat created T283239: db-replication-tree doesn't support circular replication.
Thu, May 20, 12:44 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 12:39 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 12:36 PM · DBA
Kormat updated the task description for T283228: Deploy wmfmariadbpy 0.7.
Thu, May 20, 12:32 PM · DBA
Kormat moved T283228: Deploy wmfmariadbpy 0.7 from Triage to In progress on the DBA board.
Thu, May 20, 10:26 AM · DBA