Marostegui (Manuel Aróstegui)
User

User Details

User Since
Sep 1 2016, 6:48 AM (71 w, 5 d)
Availability
Available
IRC Nick
marostegui
LDAP User
Marostegui
MediaWiki User
MArostegui (WMF)

Recent Activity

Today

Marostegui added a comment to T183983: Re-institute query killer for the analytics WikiReplica.

I have not seen any more delays on the wiki replicas since this was set up, so those thresholds are looking pretty good!

Tue, Jan 16, 8:29 AM · Data-Services, DBA
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Tue, Jan 16, 8:27 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui claimed T184397: Decommission db1030.
Tue, Jan 16, 7:21 AM · Patch-For-Review, DBA
Marostegui added a comment to T174569: Schema change for refactored comment storage.

s8 codfw has been done.
Let's track here s8 eqiad progress:

Tue, Jan 16, 7:01 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Tue, Jan 16, 6:12 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA

Yesterday

Marostegui merged T184946: Degraded RAID on db2036 into T184836: Degraded RAID on db2036.
Mon, Jan 15, 7:40 PM · DBA, Operations, ops-codfw
Marostegui merged task T184946: Degraded RAID on db2036 into T184836: Degraded RAID on db2036.
Mon, Jan 15, 7:40 PM · Operations, ops-codfw
Marostegui added a comment to T184888: Failed BBU on db2033 (x1 master).

The server kept lagging.
I have forced the controller to go to WriteBack temporarily till we decide how to proceed with this host.

root@db2033:~# hpssacli controller all show detail | grep "Drive Write Cache"
   Drive Write Cache: Disabled
root@db2033:~# hpssacli ctrl slot=0 modify dwc=enable
root@db2033:~# hpssacli controller all show detail | grep "Drive Write Cache"
   Drive Write Cache: Enabled
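
The before/after check above can be wrapped in a small helper so the state is easy to compare in a script. This is only a sketch: the function name `drive_write_cache_state` is hypothetical, and it assumes the `hpssacli` output contains a line of the form `Drive Write Cache: <state>`, as in the paste above.

```shell
#!/bin/sh
# Hypothetical helper: reads `hpssacli controller all show detail` output on
# stdin and prints only the value of the "Drive Write Cache" line.
drive_write_cache_state() {
    sed -n 's/^[[:space:]]*Drive Write Cache:[[:space:]]*//p'
}

# Assumed usage on the host itself (not run here, needs hpssacli):
#   hpssacli controller all show detail | drive_write_cache_state
```

Running it before and after the `modify dwc=enable` step should print `Disabled` and then `Enabled` if the change took effect.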
Mon, Jan 15, 4:38 PM · ops-codfw, DBA, Operations
Marostegui added a comment to T184888: Failed BBU on db2033 (x1 master).

Maybe we should force this host to be WB even without the BBU to make sure it catches up: https://grafana.wikimedia.org/dashboard/db/mysql?orgId=1&var-dc=codfw%20prometheus%2Fops&var-server=db2033&var-port=9104&panelId=6&fullscreen&from=1515936574721&to=1516022974721

Mon, Jan 15, 1:29 PM · ops-codfw, DBA, Operations
Marostegui archived P6586 (An Untitled Masterwork).
Mon, Jan 15, 1:22 PM
Marostegui created P6586 (An Untitled Masterwork).
Mon, Jan 15, 1:16 PM
Marostegui added a comment to T162807: Run pt-table-checksum on s1 (enwiki).

Archive table is now fixed across all the servers.
Next: change_tag

Mon, Jan 15, 12:00 PM · Patch-For-Review, DBA
Marostegui added a comment to T142807: Migrate all users to new Wiki Replica cluster and decommission old hardware.

labsdb1003 RAID policy started to fail and it is now on WT instead of WB.
Possibly the BBU is failing.

Mon, Jan 15, 11:55 AM · Patch-For-Review, Goal, cloud-services-team (FY2017-18), Data-Services, DBA
Marostegui added a comment to T184397: Decommission db1030.

It is exactly that :)

Mon, Jan 15, 8:58 AM · Patch-For-Review, DBA
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Mon, Jan 15, 8:51 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T174569: Schema change for refactored comment storage.

s5 master is done

Mon, Jan 15, 8:51 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui updated subscribers of T184397: Decommission db1030.

I have set db1087 as vslow in s8 instead of db1063, which is the first step to move db1063 as vslow in s6, so we can get rid of db1030.

Mon, Jan 15, 7:34 AM · Patch-For-Review, DBA
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Mon, Jan 15, 7:18 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui updated subscribers of T184888: Failed BBU on db2033 (x1 master).

Thanks @Papaul - I have checked the hosts that will soon be decommissioned and none of them are HP.
@RobH any ideas on what we can do about this?

Mon, Jan 15, 6:35 AM · ops-codfw, DBA, Operations
Marostegui triaged T184836: Degraded RAID on db2036 as Normal priority.
Mon, Jan 15, 6:16 AM · DBA, Operations, ops-codfw
Marostegui triaged T184888: Failed BBU on db2033 (x1 master) as Normal priority.
Mon, Jan 15, 6:16 AM · ops-codfw, DBA, Operations
Marostegui created T184888: Failed BBU on db2033 (x1 master).
Mon, Jan 15, 6:15 AM · ops-codfw, DBA, Operations

Sat, Jan 13

Marostegui edited projects for T184832: Decommission labsdb1001 and labsdb1003, added: ops-eqiad, hardware-requests; removed DC-Ops.
Sat, Jan 13, 6:17 AM · Patch-For-Review, hardware-requests, ops-eqiad, Operations, cloud-services-team (Kanban)
Marostegui assigned T184836: Degraded RAID on db2036 to Papaul.

This host is out of warranty, but maybe @Papaul has some spare disks somewhere?

Sat, Jan 13, 6:16 AM · DBA, Operations, ops-codfw

Fri, Jan 12

Marostegui added a comment to T142807: Migrate all users to new Wiki Replica cluster and decommission old hardware.

I think we are ready to shut down labsdb1001 (which actually had another storage crash today) and labsdb1003. The _p databases there have been archived, which was the last blocker.

Fri, Jan 12, 9:29 PM · Patch-For-Review, Goal, cloud-services-team (FY2017-18), Data-Services, DBA
Marostegui added a comment to T162807: Run pt-table-checksum on s1 (enwiki).

After a whole week I have almost fixed the archive table across all the servers.
On Monday I hope I will be done with it and start with the next table: change_tag

Fri, Jan 12, 5:28 PM · Patch-For-Review, DBA
Marostegui added a comment to T179464: labsdb1001 crashed - storage issue.

labsdb1001 is no longer available, not even in read_only.
The storage has definitely given up.

Fri, Jan 12, 12:53 PM · Operations, cloud-services-team (Kanban)
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Fri, Jan 12, 7:56 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T174569: Schema change for refactored comment storage.

s5 is almost done, only pending the master, which I will do on Monday

Fri, Jan 12, 7:55 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA

Thu, Jan 11

Marostegui moved T184697: Failover existing eqiad database backup system to the new codfw database logical backup system from Backlog to Next on the DBA board.
Thu, Jan 11, 11:39 AM · DBA
Marostegui moved T184696: Finish the database backups generation script to create consistent logical backups in CODFW from Backlog to Next on the DBA board.
Thu, Jan 11, 11:39 AM · DBA
Marostegui moved T59617: Make watchlist table available as curated foo_p.watchlist_count on labsdb from Next to Backlog on the DBA board.
Thu, Jan 11, 11:38 AM · Patch-For-Review, DBA, Cloud-Services
Marostegui moved T184704: Setup tendril database monitoring on 2 new hosts, one on eqiad and one on codfw from Backlog to Next on the DBA board.
Thu, Jan 11, 11:38 AM · DBA
Marostegui moved T184703: Decommission db1011 from Backlog to Next on the DBA board.
Thu, Jan 11, 11:38 AM · DBA
Marostegui moved T157359: labsdb1006/1007 (postgresql) maintenance from Next to Backlog on the DBA board.
Thu, Jan 11, 11:38 AM · Patch-For-Review, DBA, Cloud-VPS, Cloud-Services, Operations
Marostegui triaged T184704: Setup tendril database monitoring on 2 new hosts, one on eqiad and one on codfw as Normal priority.
Thu, Jan 11, 11:33 AM · DBA
Marostegui moved T184696: Finish the database backups generation script to create consistent logical backups in CODFW from Triage to Backlog on the DBA board.
Thu, Jan 11, 11:33 AM · DBA
Marostegui moved T184697: Failover existing eqiad database backup system to the new codfw database logical backup system from Triage to Backlog on the DBA board.
Thu, Jan 11, 11:33 AM · DBA
Marostegui moved T184699: Generate consistent logical database backups in CODFW from Triage to Meta/Epic on the DBA board.
Thu, Jan 11, 11:33 AM · Operations, Goal, DBA
Marostegui moved T184704: Setup tendril database monitoring on 2 new hosts, one on eqiad and one on codfw from Triage to Backlog on the DBA board.
Thu, Jan 11, 11:33 AM · DBA
Marostegui moved T184703: Decommission db1011 from Triage to Backlog on the DBA board.
Thu, Jan 11, 11:32 AM · DBA
Marostegui triaged T184703: Decommission db1011 as Normal priority.
Thu, Jan 11, 11:32 AM · DBA
Marostegui updated the task description for T183735: Check data consistency across production shards.
Thu, Jan 11, 11:21 AM · DBA
Marostegui removed a project from T183735: Check data consistency across production shards: Goal.
Thu, Jan 11, 11:21 AM · DBA
Marostegui triaged T184699: Generate consistent logical database backups in CODFW as Normal priority.
Thu, Jan 11, 11:18 AM · Operations, Goal, DBA
Marostegui triaged T184697: Failover existing eqiad database backup system to the new codfw database logical backup system as Normal priority.
Thu, Jan 11, 11:14 AM · DBA
Marostegui triaged T184696: Finish the database backups generation script to create consistent logical backups in CODFW as Normal priority.
Thu, Jan 11, 11:09 AM · DBA
Marostegui closed T184247: Drop `external_user` from all databases as Resolved.

All done

Thu, Jan 11, 7:39 AM · DBA
Marostegui closed T184247: Drop `external_user` from all databases, a subtask of T54921: Database tables to be dropped on Wikimedia wikis and other WMF databases (tracking), as Resolved.
Thu, Jan 11, 7:39 AM · Epic, DBA, Tracking
Marostegui updated the task description for T184247: Drop `external_user` from all databases.
Thu, Jan 11, 7:38 AM · DBA

Wed, Jan 10

Marostegui added a comment to T184464: Degraded RAID on db2060.

Thanks - will close once it has finished:

logicaldrive 1 (3.3 TB, RAID 1+0, Recovering, 3% complete)
physicaldrive 1I:1:3 (port 1I:box 1:bay 3, SAS, 600 GB, Rebuilding)
Wed, Jan 10, 5:00 PM · DBA, Operations, ops-codfw
Marostegui added a comment to T174569: Schema change for refactored comment storage.

Mentioned in SAL (#wikimedia-operations) [2018-01-10T16:03:09Z] <marostegui> Deploy schema change on db1095.s5 - https://phabricator.wikimedia.org/T174569

Wed, Jan 10, 4:04 PM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T184160: db1059 BBU issues.

As per our chat, this will be done tomorrow

Wed, Jan 10, 3:54 PM · ops-eqiad, DBA, Operations
Marostegui added a comment to T184160: db1059 BBU issues.

@Cmjohnson you want me to power off the server and we can do it now?

Wed, Jan 10, 3:52 PM · ops-eqiad, DBA, Operations
Marostegui moved T184599: s5 wikidatawiki database cleanup from Triage to In progress on the DBA board.
Wed, Jan 10, 3:41 PM · DBA
Marostegui updated the task description for T184599: s5 wikidatawiki database cleanup.
Wed, Jan 10, 3:38 PM · DBA
Marostegui added a comment to T181731: Run maintenance/cleanupUsersWithNoId.php on all wikis.

I am currently running the comment refactoring schema change on s5. Once done, I will go for s8.

Wed, Jan 10, 10:07 AM · Wikimedia-Site-requests, MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), User-notice, Wikimedia-maintenance-script-run
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Wed, Jan 10, 8:16 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T174569: Schema change for refactored comment storage.

Tracking s5 eqiad here:

Wed, Jan 10, 8:13 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Wed, Jan 10, 8:10 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T174569: Schema change for refactored comment storage.

s5 codfw is done

Wed, Jan 10, 8:10 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui updated the task description for T184247: Drop `external_user` from all databases.
Wed, Jan 10, 7:38 AM · DBA
Marostegui added a comment to T184247: Drop `external_user` from all databases.
root@db1071[wikidatawiki]> select count(*) from external_user;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)
Wed, Jan 10, 7:37 AM · DBA
Marostegui added a comment to T177208: Provide dedicated database resources for wikidata.

I believe we are good to close this task once Bryan has finished with the Cloud Team's pending tasks?

Wed, Jan 10, 6:39 AM · Patch-For-Review, Operations, Goal
Marostegui added a comment to T177223: Determine schema differences between labsdb1001 and labsdb1009.

The index ep_articles_course_id was deleted per: T180166
The index wb_ips_site_page was deleted per: T179793

Wed, Jan 10, 6:25 AM · cloud-services-team (Kanban), User-bd808, DBA, Data-Services

Tue, Jan 9

Marostegui moved T183758: Create backups of user tables from decommissioned database servers from In progress to Done on the DBA board.

Thanks for double checking.
I have left it at: labstore1003:/root/labsdb_backup.tar
And its md5sum looks good: d4bde7692b1c18e55079e57fa4aa8316

Tue, Jan 9, 6:18 PM · cloud-services-team (Kanban), Data-Services, DBA
Marostegui moved T183758: Create backups of user tables from decommissioned database servers from Triage to In progress on the DBA board.
Tue, Jan 9, 3:41 PM · cloud-services-team (Kanban), Data-Services, DBA
Marostegui added a comment to T181643: Announce wikidata move to s8 to cloud-announce & update wiki docs.

This was successfully done earlier today, so I believe this task can be closed.

Tue, Jan 9, 1:36 PM · cloud-services-team (Kanban), Data-Services
Marostegui added a comment to T177208: Provide dedicated database resources for wikidata.

From the checksums I ran over the past weeks, it was indeed pretty inconsistent across all the shards, so I don't think adding a bit more drift is a big deal.
It needs a full reimport anyways.

Tue, Jan 9, 11:37 AM · Patch-For-Review, Operations, Goal
Marostegui updated the task description for T177208: Provide dedicated database resources for wikidata.
Tue, Jan 9, 9:23 AM · Patch-For-Review, Operations, Goal
Marostegui added a comment to T177208: Provide dedicated database resources for wikidata.

dbstore1002 is fixed

Tue, Jan 9, 9:22 AM · Patch-For-Review, Operations, Goal
Marostegui added a comment to T183758: Create backups of user tables from decommissioned database servers.

Thanks a lot @bd808 for double checking.
I have removed those files and this is how it looks now:

tar -tvf labsdb_backup.tar
drwxr-xr-x root/root         0 2018-01-09 07:02 labsdb1001/
-rw-r--r-- root/root   1540413 2018-01-08 09:22 labsdb1001/p50380g50491__rlrl_enwiki_p.sql.gz
-rw-r--r-- root/root  67686263 2018-01-08 09:22 labsdb1001/p50380g50491__rlrl_ptwiki_p.sql.gz
-rw-r--r-- root/root 134297986 2018-01-08 09:22 labsdb1001/p50380g50491__unlikely_enwiki_p.sql.gz
-rw-r--r-- root/root  14872310 2018-01-08 09:23 labsdb1001/p50380g50491__unlikely_ptwiki_p.sql.gz
-rw-r--r-- root/root   1893487 2018-01-08 09:23 labsdb1001/p50380g50592__interwikis_p.sql.gz
-rw-r--r-- root/root 630229340 2018-01-08 09:27 labsdb1001/p50380g50692__DPL_p.sql.gz
-rw-r--r-- root/root  14820225 2018-01-08 09:27 labsdb1001/p50380g50728__hostbot_p.sql.gz
-rw-r--r-- root/root 507634491 2018-01-08 09:32 labsdb1001/p50380g50921__ghel_p.sql.gz
-rw-r--r-- root/root 1168737652 2018-01-08 09:34 labsdb1001/p50380g50921__wma_p.sql.gz
-rw-r--r-- root/root   46506188 2018-01-08 09:35 labsdb1001/s51111__common_p.sql.gz
-rw-r--r-- root/root     286297 2018-01-08 09:35 labsdb1001/s51111__inconsistent_redirects_p.sql.gz
-rw-r--r-- root/root      23366 2018-01-08 09:35 labsdb1001/s51111__oddlinks_p.sql.gz
-rw-r--r-- root/root  236275347 2018-01-08 09:35 labsdb1001/s51111__rlrl_enwiki_p.sql.gz
-rw-r--r-- root/root   62494301 2018-01-08 09:35 labsdb1001/s51111__rlrl_ptwiki_p.sql.gz
-rw-r--r-- root/root   47970227 2018-01-08 09:36 labsdb1001/s51111__unlikely_enwiki_p.sql.gz
-rw-r--r-- root/root   14899115 2018-01-08 09:36 labsdb1001/s51111__unlikely_ptwiki_p.sql.gz
-rw-r--r-- root/root    2963980 2018-01-08 09:36 labsdb1001/s51127__dewiki_Bilderwunsch_p.sql.gz
-rw-r--r-- root/root       1338 2018-01-08 09:36 labsdb1001/s51127__stats_p.sql.gz
-rw-r--r-- root/root       1044 2018-01-08 09:36 labsdb1001/s51206__ptwikis_p.sql.gz
-rw-r--r-- root/root    9412636 2018-01-08 09:36 labsdb1001/s51306__copyright_p.sql.gz
-rw-r--r-- root/root        489 2018-01-08 09:36 labsdb1001/s51306__plagiabot.sql.gz
-rw-r--r-- root/root     189543 2018-01-08 09:36 labsdb1001/s51892_toolserverdb_p.sql.gz
-rw-r--r-- root/root   26446474 2018-01-08 09:36 labsdb1001/s52490__hashtags_p.sql.gz
-rw-r--r-- root/root  264654985 2018-01-08 09:37 labsdb1001/s52690__p.sql.gz
-rw-r--r-- root/root      13015 2018-01-08 09:37 labsdb1001/s53012__fikarummet_p.sql.gz
-rw-r--r-- root/root      19360 2018-01-08 09:37 labsdb1001/s53024__sandbox_p.sql.gz
-rw-r--r-- root/root  420537419 2018-01-08 09:37 labsdb1001/s53311__wikidata_usage_and_views_p.sql.gz
-rw-r--r-- root/root  441124557 2018-01-08 09:38 labsdb1001/u12219__wikied_p.sql.gz
-rw-r--r-- root/root        482 2018-01-08 09:38 labsdb1001/u2029__p.sql.gz
-rw-r--r-- root/root   39321067 2018-01-08 09:38 labsdb1001/u2041__botvbot_p.sql.gz
-rw-r--r-- root/root      87298 2018-01-08 09:45 labsdb1001/u2041__thr_p.sql.gz
-rw-r--r-- root/root   12393968 2018-01-08 09:45 labsdb1001/u2402__p.sql.gz
-rw-r--r-- root/root  850514180 2018-01-08 09:46 labsdb1001/u2815__p.sql.gz
-rw-r--r-- root/root  131022113 2018-01-08 09:46 labsdb1001/u3182__wp10_p.sql.gz
-rw-r--r-- root/root    5366707 2018-01-08 09:46 labsdb1001/u4974__ores_tmp_p.sql.gz
-rw-r--r-- root/root 3128833245 2018-01-08 09:55 labsdb1001/u2041__ores_p.sql.gz
drwxr-xr-x root/root          0 2018-01-09 07:11 labsdb1003/
-rw-r--r-- root/root    6141643 2018-01-08 08:48 labsdb1003/p50380g50491__rlrl_cawiki_p.sql.gz
-rw-r--r-- root/root   44656606 2018-01-08 08:48 labsdb1003/p50380g50491__rlrl_frwiki_p.sql.gz
-rw-r--r-- root/root      31975 2018-01-08 08:49 labsdb1003/p50380g50491__rlrl_lvwiki_p.sql.gz
-rw-r--r-- root/root    1736028 2018-01-08 08:49 labsdb1003/p50380g50491__unlikely_lvwiki_p.sql.gz
-rw-r--r-- root/root  634801813 2018-01-08 08:52 labsdb1003/p50380g50692__DPL_p.sql.gz
-rw-r--r-- root/root  547379090 2018-01-08 08:53 labsdb1003/p50380g50921__ghel_p.sql.gz
-rw-r--r-- root/root 1063647794 2018-01-08 08:57 labsdb1003/p50380g50921__wma_p.sql.gz
-rw-r--r-- root/root 1048637664 2018-01-08 09:00 labsdb1003/s51072__dwl_p.sql.gz
-rw-r--r-- root/root      98620 2018-01-08 09:00 labsdb1003/s51111__common_p.sql.gz
-rw-r--r-- root/root    6141652 2018-01-08 09:01 labsdb1003/s51111__rlrl_cawiki_p.sql.gz
-rw-r--r-- root/root  126973377 2018-01-08 09:01 labsdb1003/s51111__rlrl_frwiki_p.sql.gz
-rw-r--r-- root/root      31957 2018-01-08 09:01 labsdb1003/s51111__rlrl_lvwiki_p.sql.gz
-rw-r--r-- root/root    1736027 2018-01-08 09:01 labsdb1003/s51111__unlikely_lvwiki_p.sql.gz
-rw-r--r-- root/root     915399 2018-01-08 09:02 labsdb1003/s51430__bswiki_first_page_revisions_p.sql.gz
-rw-r--r-- root/root     603811 2018-01-08 09:02 labsdb1003/s51892_toolserverdb_p.sql.gz
-rw-r--r-- root/root        483 2018-01-08 09:02 labsdb1003/s52690__p.sql.gz
-rw-r--r-- root/root    3692505 2018-01-08 09:02 labsdb1003/s52861__bwAPI_p.sql.gz
-rw-r--r-- root/root       1356 2018-01-08 09:02 labsdb1003/s53012__fikarummet_p.sql.gz
-rw-r--r-- root/root      16070 2018-01-08 09:02 labsdb1003/u2170__meta_p.sql.gz
-rw-r--r-- root/root  760700588 2018-01-08 09:03 labsdb1003/u2815__delete_p.sql.gz
-rw-r--r-- root/root  487761449 2018-01-08 09:05 labsdb1003/u2815__p.sql.gz
-rw-r--r-- root/root     813792 2018-01-08 09:18 labsdb1003/u2718__bswiki_first_page_revisions_p.sql.gz
Tue, Jan 9, 7:46 AM · cloud-services-team (Kanban), Data-Services, DBA
Johan awarded T181645: Help communicate read-only time for dewiki and wikidata for database split a Like token.
Tue, Jan 9, 6:47 AM · User-notice, Patch-For-Review, Wikidata, Community-Liaisons (Jan-Mar-2018)
Marostegui added a comment to T174569: Schema change for refactored comment storage.

Once s7 is done, we only have s5 and s8 pending, which is blocked on changing the master to STATEMENT based replication - which will happen on Tuesday with the failover (T181645).

Tue, Jan 9, 6:36 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Tue, Jan 9, 6:35 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T177274: Provide a new s8 master for sanitarium.

This can now proceed as the master has been failed over

Tue, Jan 9, 6:34 AM · Data-Services, DBA
Marostegui added a comment to T177208: Provide dedicated database resources for wikidata.
Tue, Jan 9, 6:32 AM · Patch-For-Review, Operations, Goal
Marostegui added a comment to T177208: Provide dedicated database resources for wikidata.
Tue, Jan 9, 6:32 AM · Patch-For-Review, Operations, Goal
Marostegui updated subscribers of T177208: Provide dedicated database resources for wikidata.

Failover is done
Read only started: 06:01
Read only finished: 06:14

Tue, Jan 9, 6:32 AM · Patch-For-Review, Operations, Goal
Marostegui closed T181645: Help communicate read-only time for dewiki and wikidata for database split as Resolved.

This has been done.
Read only started: 06:01
Read only finished: 06:14

Tue, Jan 9, 6:30 AM · User-notice, Patch-For-Review, Wikidata, Community-Liaisons (Jan-Mar-2018)
Marostegui closed T181645: Help communicate read-only time for dewiki and wikidata for database split, a subtask of T177208: Provide dedicated database resources for wikidata, as Resolved.
Tue, Jan 9, 6:30 AM · Patch-For-Review, Operations, Goal
Marostegui closed T184285: Degraded RAID on db2055 as Resolved.

Thanks!

root@db2055:~# hpssacli controller all show config
Tue, Jan 9, 5:18 AM · DBA, Operations, ops-codfw
RandomDSdevel awarded T181645: Help communicate read-only time for dewiki and wikidata for database split a Grey Medal token.
Tue, Jan 9, 1:42 AM · User-notice, Patch-For-Review, Wikidata, Community-Liaisons (Jan-Mar-2018)

Mon, Jan 8

Marostegui moved T184464: Degraded RAID on db2060 from Triage to In progress on the DBA board.
Mon, Jan 8, 5:44 PM · DBA, Operations, ops-codfw
Marostegui added a project to T184464: Degraded RAID on db2060: DBA.
Mon, Jan 8, 5:44 PM · DBA, Operations, ops-codfw
Marostegui assigned T184464: Degraded RAID on db2060 to Papaul.

I am raising this to High Priority because the warranty expires 14th Jan 2018

Mon, Jan 8, 5:43 PM · DBA, Operations, ops-codfw
Marostegui moved T183486: MCR schema migration stage 0: create tables from Triage to Blocked external/Not db team on the DBA board.
Mon, Jan 8, 4:22 PM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Multi-Content-Revisions, DBA, Schema-change, Structured-Data-Commons, Wikidata
Marostegui moved T184446: Configure Toolforge replica views and dumps for the new MCR tables from Triage to Blocked external/Not db team on the DBA board.
Mon, Jan 8, 4:21 PM · Dumps-Generation, Data-Services, DBA, MediaWiki-Platform-Team
Marostegui added a comment to T183486: MCR schema migration stage 0: create tables.

DBA: Since creating tables isn't classified as a schema change actually needing DBA intervention, do you have any objection to me creating the four tables shown in the diff at https://gerrit.wikimedia.org/r/#/c/378724/35/maintenance/tables.sql on all wikis? Or, if you'd rather do it yourselves, feel free.

Mon, Jan 8, 4:13 PM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Multi-Content-Revisions, DBA, Schema-change, Structured-Data-Commons, Wikidata
Marostegui updated the task description for T174569: Schema change for refactored comment storage.
Mon, Jan 8, 10:46 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T174569: Schema change for refactored comment storage.

s7 master is done

Mon, Jan 8, 10:46 AM · MediaWiki-Platform-Team (MWPT-Q3-Jan-Mar-2018), Patch-For-Review, Dumps-Generation, Data-Services, Blocked-on-schema-change, DBA
Marostegui added a comment to T162807: Run pt-table-checksum on s1 (enwiki).

First iteration reveals drifts on:

archive
change_tag
oldimage
ores_classification
tag_summary
text
user_newtalk
Mon, Jan 8, 10:45 AM · Patch-For-Review, DBA
Marostegui added a comment to T183758: Create backups of user tables from decommissioned database servers.

I have backed up all the _p databases listed on T183758#3879024

ls -lh labsdb1003/
total 4.8G
-rw-r--r-- 1 root root  5.9M Jan  8 08:48 p50380g50491__rlrl_cawiki_p.sql.gz
-rw-r--r-- 1 root root   43M Jan  8 08:48 p50380g50491__rlrl_frwiki_p.sql.gz
-rw-r--r-- 1 root root   32K Jan  8 08:49 p50380g50491__rlrl_lvwiki_p.sql.gz
-rw-r--r-- 1 root root  1.7M Jan  8 08:49 p50380g50491__unlikely_lvwiki_p.sql.gz
-rw-r--r-- 1 root root  606M Jan  8 08:52 p50380g50692__DPL_p.sql.gz
-rw-r--r-- 1 root root   491 Jan  8 08:52 p50380g50718__u_p50380g50718.sql.gz
-rw-r--r-- 1 root root   40M Jan  8 08:54 p50380g50921__ghel_p_s3.sql.gz
-rw-r--r-- 1 root root   48M Jan  8 08:54 p50380g50921__ghel_p_s7.sql.gz
-rw-r--r-- 1 root root  523M Jan  8 08:53 p50380g50921__ghel_p.sql.gz
-rw-r--r-- 1 root root  148M Jan  8 08:58 p50380g50921__wma_p_s3.sql.gz
-rw-r--r-- 1 root root  123M Jan  8 08:58 p50380g50921__wma_p_s7.sql.gz
-rw-r--r-- 1 root root 1015M Jan  8 08:57 p50380g50921__wma_p.sql.gz
-rw-r--r-- 1 root root 1001M Jan  8 09:00 s51072__dwl_p.sql.gz
-rw-r--r-- 1 root root   97K Jan  8 09:00 s51111__common_p.sql.gz
-rw-r--r-- 1 root root  5.9M Jan  8 09:01 s51111__rlrl_cawiki_p.sql.gz
-rw-r--r-- 1 root root  122M Jan  8 09:01 s51111__rlrl_frwiki_p.sql.gz
-rw-r--r-- 1 root root   32K Jan  8 09:01 s51111__rlrl_lvwiki_p.sql.gz
-rw-r--r-- 1 root root  1.7M Jan  8 09:01 s51111__unlikely_lvwiki_p.sql.gz
-rw-r--r-- 1 root root   24M Jan  8 09:02 s51206__pt.sql.gz
-rw-r--r-- 1 root root  894K Jan  8 09:02 s51430__bswiki_first_page_revisions_p.sql.gz
-rw-r--r-- 1 root root  590K Jan  8 09:02 s51892_toolserverdb_p_s3.sql.gz
-rw-r--r-- 1 root root  590K Jan  8 09:02 s51892_toolserverdb_p_s7.sql.gz
-rw-r--r-- 1 root root  590K Jan  8 09:02 s51892_toolserverdb_p.sql.gz
-rw-r--r-- 1 root root   494 Jan  8 09:02 s52481__citationhunt_pl.sql.gz
-rw-r--r-- 1 root root   494 Jan  8 09:02 s52481__citationhunt_pt.sql.gz
-rw-r--r-- 1 root root   483 Jan  8 09:02 s52690__p.sql.gz
-rw-r--r-- 1 root root  3.6M Jan  8 09:02 s52861__bwAPI_p.sql.gz
-rw-r--r-- 1 root root  1.4K Jan  8 09:02 s53012__fikarummet_p.sql.gz
-rw-r--r-- 1 root root   16K Jan  8 09:02 u2170__meta_p.sql.gz
-rw-r--r-- 1 root root  2.6M Jan  8 09:02 u2718__bswiki_first_page_revisions_p.sql
-rw-r--r-- 1 root root  795K Jan  8 09:18 u2718__bswiki_first_page_revisions_p.sql.gz
-rw-r--r-- 1 root root  726M Jan  8 09:03 u2815__delete_p.sql.gz
-rw-r--r-- 1 root root  466M Jan  8 09:05 u2815__p.sql.gz
Mon, Jan 8, 10:13 AM · cloud-services-team (Kanban), Data-Services, DBA
Marostegui added a comment to T184160: db1059 BBU issues.

˜/icinga-wm 10:13> PROBLEM - MegaRAID on db1059 is CRITICAL: CRITICAL: 1 LD(s) must have write cache policy WriteBack, currently using: WriteThrough

Mon, Jan 8, 9:13 AM · ops-eqiad, DBA, Operations
Marostegui triaged T184401: db1011 possibly faulty BBU as Normal priority.
Mon, Jan 8, 8:32 AM · DBA
Marostegui created T184401: db1011 possibly faulty BBU.
Mon, Jan 8, 8:32 AM · DBA
Marostegui added a project to T184262: Decommission db1039: hardware-requests.
Mon, Jan 8, 8:10 AM · hardware-requests, ops-eqiad, Operations, Patch-For-Review, DBA
Marostegui added a subtask for T134476: Decommission old coredb machines (<=db1050): T184397: Decommission db1030.
Mon, Jan 8, 8:05 AM · Patch-For-Review, Goal, Operations, DBA
Marostegui added a parent task for T184397: Decommission db1030: T134476: Decommission old coredb machines (<=db1050).
Mon, Jan 8, 8:05 AM · Patch-For-Review, DBA
Marostegui triaged T184397: Decommission db1030 as Normal priority.
Mon, Jan 8, 8:05 AM · Patch-For-Review, DBA