Marostegui (Manuel Aróstegui)
Staff Database Administrator

User Details

User Since
Sep 1 2016, 6:48 AM (402 w, 5 d)
Availability
Available
IRC Nick
marostegui
LDAP User
Marostegui
MediaWiki User
MArostegui (WMF) [ Global Accounts ]

TZ: UTC +1/+2

Recent Activity

Yesterday

Marostegui added a project to T365217: Degraded RAID on backup2010: Data-Persistence-Backup.
Tue, May 21, 7:32 PM · Data-Persistence-Backup, DC-Ops, Data-Persistence, SRE, ops-codfw
Marostegui added a comment to T362786: Enable dbctl for parsercache.

Excellent work. This is going to simplify our pc operations a lot. Thanks everyone for making this a reality!

Tue, May 21, 4:31 PM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool
Marostegui closed T365465: translate_reviews unsigned and revision big int as Resolved.

Completed

Tue, May 21, 2:22 PM · Schema-change-in-production, DBA
Marostegui closed T365465: translate_reviews unsigned and revision big int, a subtask of T365445: Investigate potential signed int references to rev_id, as Resolved.
Tue, May 21, 2:22 PM · BlueSpice, Schema-change, Patch-For-Review, MathSearch, MediaWiki-extensions-Translate, Wikidata
Marostegui updated the task description for T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 2:22 PM · Schema-change-in-production, DBA
Marostegui added a comment to P62779 Masterwork From Distant Lands.
cumin2024@db1154.eqiad.wmnet[(none)]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: db1196.eqiad.wmnet
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: db1196-bin.003828
           Read_Master_Log_Pos: 103754372
                Relay_Log_File: db1154-relay-bin.000403
                 Relay_Log_Pos: 103754211
         Relay_Master_Log_File: db1196-bin.003828
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table: mysql.%,oai.%,advisorswiki.%,arbcom_cswiki.%,arbcom_dewiki.%,arbcom_enwiki.%,arbcom_fiwiki.%,arbcom_nlwiki.%,arbcom_ruwiki.%,auditcomwiki.%,boardgovcomwiki.%,boardwiki.%,chairwiki.%,chapcomwiki.%,checkuserwiki.%,collabwiki.%,ecwikimedia.%,electcomwiki.%,execwiki.%,fdcwiki.%,grantswiki.%,id_internalwikimedia.%,iegcomwiki.%,ilwikimedia.%,internalwiki.%,legalteamwiki.%,movementroleswiki.%,noboard_chapterswikimedia.%,officewiki.%,ombudsmenwiki.%,otrs_wikiwiki.%,projectcomwiki.%,searchcomwiki.%,spcomwiki.%,stewardwiki.%,sysop_itwiki.%,sysop_plwiki.%,techconductwiki.%,transitionteamwiki.%,wg_enwiki.%,wikimaniateamwiki.%,zerowiki.%,%.__wmf_checksums,%.accountaudit_login,%.arbcom1_vote,%.archive_old,%.blob_orphans,%.blob_tracking,%.bot_passwords,%.bv2009_edits,%.categorylinks_old,%.click_tracking,%.cu_useragent,%.cu_changes,%.cu_log,%.cu_log_event,%.cu_private_event,%.cu_useragent_clienthints,%.cu_useragent_clienthints_map,%.cur,%.discussiontools_subscription,%.echo_email_batch,%.echo_event,%.echo_target_page,%.echo_unread_wikis,%.echo_notification,%.echo_push_subscription,%.edit_page_tracking,%.email_capture,%.exarchive,%.exrevision,%.globalnames,%.growthexperiments_link_recommendations,%.growthexperiments_link_submissions,%.growthexperiments_mentor_mentee,%.growthexperiments_mentee_data,%.growthexperiments_user_impact,%.hidden,%.image_old,%.ipinfo_ip_changes,%.job,%.ldap_domains,%.linkscc,%.localnames,%.log_search,%.logging_old,%.long_run_profiling,%.mediamoderation_scan,%.migrateuser_medium,%.moodbar_feedback,%.moodbar_feedback_response,%.msg_resource,%.oauth_accepted_consumer,%.oathauth_devices,%.oauth_ratelimit_client_tier,%.oauth_registered_consumer,%.oathauth_types,%.oauth2_access_tokens,%.objectcache,%.old_growth,%.oldimage_old,%.optin_survey,%.prefstats,%.prefswitch_survey,%.profiling,%.querycache,%.querycache_info,%.querycache_old,%.querycachetwo,%.reading_list,%.reading_list_entry,%.securepoll_cookie_match,%.securepoll_elections,%.securepoll_entity,%.securepoll_lists,%.securepoll_msgs,%.securepoll_options,%.securepoll_properties,%.securepoll_questions,%.securepoll_strike,%.securepoll_voters,%.securepoll_votes,%.spoofuser,%.text,%.titlekey,%.transcache,%.translate_cache,%.uploadstash,%.urlshortcodes,%.user_newtalk,%.vote_log,%.watchlist,%.watchlist_expiry,%.wikimedia_editor_tasks_counts,%.wikimedia_editor_tasks_keys,%.wikimedia_editor_tasks_targets_passed
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 103753911
               Relay_Log_Space: 103755030
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: Yes
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 172000011
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 180355171-180355171-148310907,180359172-180359172-49702203,171970637-171970637-2116621969,171978826-171978826-931149646,180363268-180363268-3447080256,171978768-171978768-202416,171970745-171970745-3651346146,171978774-171978774-5,180359179-180359179-96523837,171970572-171970572-3935877275,171970661-171970661-3655324752,171978777-171978777-514400352,171970704-171970704-351094624,171974720-171974720-2572451842,180355190-180355190-1378262411,0-171970637-5484646134,172000011-172000011-21,171974884-171974884-1473084269
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: closing tables
              Slave_DDL_Groups: 3
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 171686624
1 row in set (0.001 sec)
Tue, May 21, 1:58 PM
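
For readers who want to check output like the show slave status paste above programmatically, here is a minimal Python sketch. The function names (parse_slave_status, replication_healthy) and the max_lag threshold are illustrative assumptions, not part of any WMF tooling; the field names are the ones visible in the paste.

```python
def parse_slave_status(text: str) -> dict[str, str]:
    """Parse the vertical output of SHOW SLAVE STATUS into a dict."""
    status = {}
    for line in text.splitlines():
        # Row separators, prompts and wrapped continuation lines carry no "key: value" pair.
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def replication_healthy(status: dict[str, str], max_lag: int = 0) -> bool:
    """Both replication threads are running and lag is within max_lag seconds."""
    lag = status.get("Seconds_Behind_Master", "NULL")
    return (
        status.get("Slave_IO_Running") == "Yes"
        and status.get("Slave_SQL_Running") == "Yes"
        and lag.isdigit()  # "NULL" means the lag is unknown, treated as unhealthy
        and int(lag) <= max_lag
    )


# A few fields copied from the paste above.
sample = """\
               Slave_IO_Running: Yes
              Slave_SQL_Running: Yes
          Seconds_Behind_Master: 0
"""

print(replication_healthy(parse_slave_status(sample)))
```

Treating a NULL lag as unhealthy is a design choice of this sketch; a stopped SQL thread reports NULL rather than a number.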
Marostegui added a comment to P62779 Masterwork From Distant Lands.

use enwiki; show triggers;

Tue, May 21, 1:57 PM
Marostegui edited P62770 (An Untitled Masterwork).
Tue, May 21, 1:10 PM
Marostegui updated the task description for T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 1:10 PM · Schema-change-in-production, DBA
Marostegui updated the task description for T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 1:09 PM · Schema-change-in-production, DBA
Marostegui updated the task description for T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 1:07 PM · Schema-change-in-production, DBA
Marostegui updated the task description for T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 1:06 PM · Schema-change-in-production, DBA
Marostegui updated the task description for T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 12:02 PM · Schema-change-in-production, DBA
Marostegui created P62770 (An Untitled Masterwork).
Tue, May 21, 12:02 PM
Marostegui triaged T365465: translate_reviews unsigned and revision big int as Medium priority.

Once reviewed, I will probably deploy this directly on the masters with replication enabled, as the table is small.

Tue, May 21, 11:59 AM · Schema-change-in-production, DBA
Marostegui moved T365465: translate_reviews unsigned and revision big int from Triage to In progress on the DBA board.
Tue, May 21, 11:56 AM · Schema-change-in-production, DBA
Marostegui claimed T365465: translate_reviews unsigned and revision big int.
Tue, May 21, 11:54 AM · Schema-change-in-production, DBA
Marostegui added a comment to T365424: Upgrade clouddb* hosts to Bookworm.

I can do the reimages for the WMCS hosts.

A few questions:

Stop mariadb (instance by instance, DO NOT DO systemctl stop mariadb@s*)

What do you mean by "instance by instance"? Avoiding s* and instead running systemctl stop on one unit at a time (e.g. @s1, then @s2)?

Tue, May 21, 11:47 AM · cloud-services-team (FY2023/2024-Q3-Q4), Data-Persistence, Data-Services
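
Assuming the reading proposed in the question above is the intended one (stop one unit at a time rather than using the s* glob), the procedure could be sketched in Python as below. The section list is hypothetical, and the sketch only builds the commands; it does not run them.

```python
def stop_commands(sections: list[str]) -> list[list[str]]:
    """Return one `systemctl stop` command per mariadb@<section> unit."""
    return [["systemctl", "stop", f"mariadb@{section}"] for section in sections]


# Hypothetical sections hosted on one multi-instance host.
for command in stop_commands(["s1", "s2"]):
    # Printed for review; each command would be run and verified before the next one.
    print(" ".join(command))
```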
Marostegui added a comment to T365450: Upgrade clouddb1021 to bookworm.

By default /srv is preserved

Tue, May 21, 9:48 AM · Data-Platform-SRE, Data-Persistence, Data-Services
Marostegui updated the task description for T362746: Upgrade s4 to MariaDB 10.6.
Tue, May 21, 8:10 AM · DBA
Marostegui updated subscribers of T365424: Upgrade clouddb* hosts to Bookworm.

@BTullis clouddb1021 belongs to your team, so could you take care of that one?

Tue, May 21, 6:50 AM · cloud-services-team (FY2023/2024-Q3-Q4), Data-Persistence, Data-Services
Marostegui created T365426: Upgrade db1208 to bookworm.
Tue, May 21, 6:49 AM · Data-Platform-SRE
Marostegui added a comment to T365424: Upgrade clouddb* hosts to Bookworm.

According to the wiki replicas responsibilities documents, OS upgrades are not performed by Data-Persistence but we would be happy to help if something needs our attention.

Tue, May 21, 6:42 AM · cloud-services-team (FY2023/2024-Q3-Q4), Data-Persistence, Data-Services
Marostegui created T365424: Upgrade clouddb* hosts to Bookworm.
Tue, May 21, 6:39 AM · cloud-services-team (FY2023/2024-Q3-Q4), Data-Persistence, Data-Services
Marostegui updated the task description for T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.
Tue, May 21, 6:22 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui added a comment to T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.

One more test in db1191:

Tue, May 21, 6:22 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui committed rSCHCHeff50b61554c: Update change_gb_timestamp_T307501.py.
Update change_gb_timestamp_T307501.py
Tue, May 21, 6:19 AM
Marostegui committed rSCHCH3ae000306920: change_gb_timestamp_T307501.py: Schema change.
change_gb_timestamp_T307501.py: Schema change
Tue, May 21, 6:17 AM
Marostegui closed T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki as Resolved.

This is all done

Tue, May 21, 5:53 AM · Schema-change-in-production, DBA
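
The overflow behind this task comes down to integer column ranges. A short Python sketch of the arithmetic follows; the example rev_id is a hypothetical value chosen to sit just above the signed 32-bit limit, not a figure taken from the task.

```python
# Ranges of the MariaDB integer types involved.
INT_SIGNED_MAX = 2**31 - 1     # 2147483647
INT_UNSIGNED_MAX = 2**32 - 1   # 4294967295
BIGINT_SIGNED_MAX = 2**63 - 1  # 9223372036854775807


def fits(value: int, column_max: int) -> bool:
    """True if a non-negative id can be stored in a column with this maximum."""
    return 0 <= value <= column_max


hypothetical_rev_id = 2_200_000_000  # assumption: an id just past the signed INT limit

print(fits(hypothetical_rev_id, INT_SIGNED_MAX))     # signed INT
print(fits(hypothetical_rev_id, INT_UNSIGNED_MAX))   # unsigned INT
print(fits(hypothetical_rev_id, BIGINT_SIGNED_MAX))  # signed BIGINT
```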
Marostegui closed T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki, a subtask of T364681: Issues with translatable pages on Wikidata due to revision id overflow, as Resolved.
Tue, May 21, 5:52 AM · Wikimedia-production-error, MW-1.43-notes (1.43.0-wmf.6; 2024-05-21), Unplanned-Sprint-Work, Wikidata, Schema-change, Localization Infrastructure FY2023-24, Language-Team (Language-2024-April-June), Regression, MediaWiki-extensions-Translate
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Tue, May 21, 5:52 AM · Schema-change-in-production, DBA
Marostegui committed rSCHCH83e91814c874: Merge branch 'T365352' into 'main'.
Merge branch 'T365352' into 'main'
Tue, May 21, 5:45 AM

Mon, May 20

Marostegui updated the task description for T364299: Make rc_id a bigint.
Mon, May 20, 4:29 PM · Schema-change-in-production, DBA
Marostegui committed rSCHCH21ddf406be7a: Update change_rt_page_T365352.py.
Update change_rt_page_T365352.py
Mon, May 20, 2:11 PM
Marostegui committed rSCHCHaee6f8c07c12: Update change_rt_page_T365352.py.
Update change_rt_page_T365352.py
Mon, May 20, 2:11 PM
Marostegui updated the task description for T364299: Make rc_id a bigint.
Mon, May 20, 1:26 PM · Schema-change-in-production, DBA
Marostegui committed rSCHCH4286f3de9ee8: change_rt_page_T365352.py: New schema change.
change_rt_page_T365352.py: New schema change
Mon, May 20, 1:20 PM
Marostegui added a subtask for T362786: Enable dbctl for parsercache: T365356: Document new parsercache failover process.
Mon, May 20, 1:12 PM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool
Marostegui added a parent task for T365356: Document new parsercache failover process: T362786: Enable dbctl for parsercache.
Mon, May 20, 1:12 PM · DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 12:53 PM · Schema-change-in-production, DBA
Marostegui added a comment to T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.

s7 frwiktionary is done directly on the master with replication enabled. metawiki is a bit bigger and will require a replica-by-replica schema change, as the schema change there takes around 1 min.

Mon, May 20, 12:50 PM · Schema-change-in-production, DBA
abi_ awarded T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki a Like token.
Mon, May 20, 12:28 PM · Schema-change-in-production, DBA
Marostegui triaged T365356: Document new parsercache failover process as Medium priority.
Mon, May 20, 12:15 PM · DBA
Marostegui created T365356: Document new parsercache failover process.
Mon, May 20, 12:15 PM · DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 12:02 PM · Schema-change-in-production, DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:57 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:54 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:52 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:50 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:48 AM · Schema-change-in-production, DBA
Marostegui lowered the priority of T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki from Unbreak Now! to Medium.
Mon, May 20, 11:48 AM · Schema-change-in-production, DBA
Marostegui added a comment to T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.

s8 wikidata is done with replication directly on the master

Mon, May 20, 11:48 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:46 AM · Schema-change-in-production, DBA
Marostegui moved T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki from Triage to In progress on the DBA board.
Mon, May 20, 11:46 AM · Schema-change-in-production, DBA
Marostegui claimed T365352: Stop referencing rev_id as signed int in revtag table to counter revision id overflow in wikidatawiki.
Mon, May 20, 11:43 AM · Schema-change-in-production, DBA
Marostegui closed T365351: Reset db2181 idrac as Resolved.

It was a password that had got out of sync; I reset it with:

racadm set iDRAC.Users.2.Password XXXX
Mon, May 20, 11:37 AM · SRE, ops-codfw, DBA
Marostegui created P62686 (An Untitled Masterwork).
Mon, May 20, 11:31 AM
Marostegui triaged T365351: Reset db2181 idrac as Medium priority.
Mon, May 20, 11:21 AM · SRE, ops-codfw, DBA
Marostegui updated subscribers of T365351: Reset db2181 idrac.
Mon, May 20, 11:20 AM · SRE, ops-codfw, DBA
Marostegui created T365351: Reset db2181 idrac.
Mon, May 20, 11:20 AM · SRE, ops-codfw, DBA
Marostegui triaged T365346: Degraded RAID on db1172 as Medium priority.
Mon, May 20, 9:21 AM · DBA, SRE, ops-eqiad
Marostegui added a project to T365346: Degraded RAID on db1172: DBA.

It looks bad indeed

[960563.875753] megaraid_sas 0000:18:00.0: 2151 (769510480s/0x0004/CRIT) - Enclosure PD 20(c None/p1) phy bad for slot 4
Mon, May 20, 9:20 AM · DBA, SRE, ops-eqiad
Marostegui added a comment to T365338: MariaDB 10.6.18 released.

Compiled 10.6.18 and testing the package installation etc. on db1125:

root@db1125:~# /opt/wmf-mariadb106/bin/mysqld --version
/opt/wmf-mariadb106/bin/mysqld  Ver 10.6.18-MariaDB-log for Linux on x86_64 (MariaDB Server)
Mon, May 20, 8:57 AM · DBA
Marostegui added a comment to T362786: Enable dbctl for parsercache.

Thanks @Scott_French - what I have done is:

Mon, May 20, 7:16 AM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool
Marostegui added a comment to T364299: Make rc_id a bigint.

Running this on the old s8 master (db2161)

Mon, May 20, 6:34 AM · Schema-change-in-production, DBA
Marostegui added a comment to T365339: Switchover s8 master (db2161 -> db2165).

The pending schema change on the old master will be tracked in T364299.

Mon, May 20, 6:03 AM · DBA
Marostegui closed T365339: Switchover s8 master (db2161 -> db2165) as Resolved.
Mon, May 20, 6:03 AM · DBA
Marostegui closed T365339: Switchover s8 master (db2161 -> db2165), a subtask of T364299: Make rc_id a bigint, as Resolved.
Mon, May 20, 6:03 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T365339: Switchover s8 master (db2161 -> db2165).
Mon, May 20, 6:02 AM · DBA
Marostegui updated the task description for T365339: Switchover s8 master (db2161 -> db2165).
Mon, May 20, 6:00 AM · DBA
Marostegui updated the task description for T365339: Switchover s8 master (db2161 -> db2165).
Mon, May 20, 5:57 AM · DBA
Marostegui added a comment to T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.

db2150 looking okay too after the change:

cumin2024@db2150.codfw.wmnet[centralauth]> FLUSH STATUS; pager cat > /dev/null; SELECT /* MediaWiki\Extension\GlobalBlocking\GlobalBlocking::getGlobalBlockingBlock  */  gb_id,gb_address,gb_by,gb_by_wiki,gb_reason,gb_timestamp,gb_anon_only,gb_expiry,gb_range_start,gb_range_end  FROM `globalblocks`    WHERE (gb_range_start  LIKE 'v6-2%' ESCAPE '`' ) AND (gb_range_start <= 'v6-2A02C207204105070000000000000099') AND (gb_range_end >= 'v6-2A02C207204105070000000000000099') AND (gb_expiry > '20220505113815')  ; nopager; SHOW STATUS like 'Hand%';
Query OK, 0 rows affected (0.032 sec)
Mon, May 20, 5:53 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui updated the task description for T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.
Mon, May 20, 5:50 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui added a parent task for T365339: Switchover s8 master (db2161 -> db2165): T364299: Make rc_id a bigint.
Mon, May 20, 5:46 AM · DBA
Marostegui added a subtask for T364299: Make rc_id a bigint: T365339: Switchover s8 master (db2161 -> db2165).
Mon, May 20, 5:46 AM · Schema-change-in-production, DBA
Marostegui reopened T365229: Prepare and check storage layer for dtpwiki as "Open".
Mon, May 20, 5:46 AM · Data-Services, DBA
Marostegui reopened T365229: Prepare and check storage layer for dtpwiki, a subtask of T365220: Create Wikipedia Central Dusun, as Open.
Mon, May 20, 5:45 AM · MW-1.43-notes (1.43.0-wmf.6; 2024-05-21), Wiki-Setup (Create)
Marostegui closed T365229: Prepare and check storage layer for dtpwiki, a subtask of T365220: Create Wikipedia Central Dusun, as Resolved.
Mon, May 20, 5:45 AM · MW-1.43-notes (1.43.0-wmf.6; 2024-05-21), Wiki-Setup (Create)
Marostegui closed T365229: Prepare and check storage layer for dtpwiki as Resolved.

Let us know when the wiki is created so we can sanitize it.

Mon, May 20, 5:45 AM · Data-Services, DBA
Marostegui claimed T365339: Switchover s8 master (db2161 -> db2165).
Mon, May 20, 5:37 AM · DBA
Marostegui updated the task description for T365339: Switchover s8 master (db2161 -> db2165).
Mon, May 20, 5:36 AM · DBA
Marostegui updated the task description for T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.
Mon, May 20, 5:31 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui added a comment to T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.

I have not seen any slow queries logged on the globalblocks table so far. I am going to alter this on a few more hosts and wait a little longer again.

Mon, May 20, 5:22 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui moved T364290: Upgrade s1 to MariaDB 10.6 from Ready to In progress on the DBA board.
Mon, May 20, 5:20 AM · DBA
Marostegui triaged T365338: MariaDB 10.6.18 released as Medium priority.

A new 10.4 version has also been released, but given that we only have a few hosts running 10.4 (mostly s1 - T364290), I won't be compiling it.

Mon, May 20, 5:14 AM · DBA
Marostegui created T365338: MariaDB 10.6.18 released.
Mon, May 20, 5:12 AM · DBA
Marostegui updated the task description for T364299: Make rc_id a bigint.
Mon, May 20, 5:11 AM · Schema-change-in-production, DBA

Fri, May 17

Marostegui added a project to T365213: Degraded RAID on es2022: DBA.
Fri, May 17, 7:30 AM · DC-Ops, DBA, SRE, ops-codfw
Marostegui added a comment to T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis.

Enabled slow query log in db2122 to capture queries that take longer than 10 seconds.

Fri, May 17, 6:44 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui changed the status of T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis from Stalled to Open.

So s7 is fully on 10.6, and I've tried again on db2122:

cumin2024@db2122.codfw.wmnet[centralauth]> ALTER TABLE   globalblocks CHANGE  gb_timestamp gb_timestamp varchar(14) NOT NULL, CHANGE  gb_expiry gb_expiry VARBINARY(14) DEFAULT '' NOT NULL;
Query OK, 0 rows affected (0.036 sec)
Records: 0  Duplicates: 0  Warnings: 0
Fri, May 17, 6:22 AM · SRE-Sprint-Week-Sustainability-March2023, Sustainability (Incident Followup), SRE-OnFire, Schema-change-in-production, DBA
Marostegui changed the status of T307501: Adjust the field type of globalblocks timestamp columns to fixed binary on wmf wikis, a subtask of T307647: 2022-05-05 Wikimedia full site outage, from Stalled to Open.
Fri, May 17, 6:21 AM · SRE-OnFire (FY2021/2022-Q4), GlobalBlocking, DBA, SRE, Wikimedia-Incident
Marostegui closed T364289: Reimage external store hosts with Bookworm as Resolved.

All done

Fri, May 17, 6:12 AM · DBA
Marostegui added a comment to T364299: Make rc_id a bigint.

Still running in s8 codfw. The schema change takes around 11h per host there.

Fri, May 17, 5:25 AM · Schema-change-in-production, DBA
Marostegui updated the task description for T364289: Reimage external store hosts with Bookworm.
Fri, May 17, 5:18 AM · DBA
Marostegui added a comment to T362786: Enable dbctl for parsercache.

Looking more closely at what's in etcd, it looks like we've gone the route of configuring a given spare instance only with the section for which it is a warm standby (e.g., pc1014 from pc1 and pc1015 from pc4 - though the latter is new to me, as I thought all spares replicate from pc1).

Fri, May 17, 5:06 AM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool

Thu, May 16

Marostegui created P62504 (An Untitled Masterwork).
Thu, May 16, 1:43 PM
Marostegui created T365123: Make dbctl check for depooled future masters.
Thu, May 16, 10:45 AM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool
Marostegui added a comment to T362786: Enable dbctl for parsercache.

Ok I believe all the steps are done.

Thu, May 16, 10:23 AM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool
Marostegui added a comment to T362786: Enable dbctl for parsercache.

While trying to add pc1011 and pc2011 to s1, we got dbctl errors:

Object dbconfig failed to validate:
'pc1' does not match any of the regexes: 'DEFAULT', '^s1[01]$', '^s[124-8]$'
The actual value was: {'externalLoads': defaultdict(<function DbConfig.compute_config.<locals>.<lambda> at 0x7f486cd72550>,

We are investigating

Thu, May 16, 10:05 AM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool
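
The validation error above can be reproduced outside dbctl with a few lines of Python. The three patterns are copied from the error message; applying them with re.search (unanchored search semantics) is an assumption about how the schema validator evaluates them, and matches_any is a hypothetical helper name.

```python
import re

# Patterns copied from the dbctl validation error.
SECTION_PATTERNS = ["DEFAULT", "^s1[01]$", "^s[124-8]$"]


def matches_any(name: str) -> bool:
    """True if the section name matches at least one allowed pattern."""
    return any(re.search(pattern, name) for pattern in SECTION_PATTERNS)


for name in ["s1", "s10", "DEFAULT", "pc1"]:
    print(name, matches_any(name))
```

Under that assumption, s1, s10 and DEFAULT are accepted while pc1 is rejected, which is consistent with the error: none of the listed patterns covers parsercache section names.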
Marostegui added a comment to T362786: Enable dbctl for parsercache.

While trying to add pc1011 and pc2011 to s1, we got dbctl errors:

Object dbconfig failed to validate:
'pc1' does not match any of the regexes: 'DEFAULT', '^s1[01]$', '^s[124-8]$'
The actual value was: {'externalLoads': defaultdict(<function DbConfig.compute_config.<locals>.<lambda> at 0x7f486cd72550>,
Thu, May 16, 9:56 AM · Patch-For-Review, Infrastructure-Foundations, Data-Persistence, conftool