TZ: UTC +1/+2
User Details
- User Since
- Sep 1 2016, 6:48 AM (402 w, 5 d)
- Availability
- Available
- IRC Nick
- marostegui
- LDAP User
- Marostegui
- MediaWiki User
- MArostegui (WMF) [ Global Accounts ]
Yesterday
Excellent work. This is going to simplify our pc operations a lot. Thanks everyone for making this a reality!
Completed
cumin2024@db1154.eqiad.wmnet[(none)]> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: db1196.eqiad.wmnet
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: db1196-bin.003828
Read_Master_Log_Pos: 103754372
Relay_Log_File: db1154-relay-bin.000403
Relay_Log_Pos: 103754211
Relay_Master_Log_File: db1196-bin.003828
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: mysql.%,oai.%,advisorswiki.%,arbcom_cswiki.%,arbcom_dewiki.%,arbcom_enwiki.%,arbcom_fiwiki.%,arbcom_nlwiki.%,arbcom_ruwiki.%,auditcomwiki.%,boardgovcomwiki.%,boardwiki.%,chairwiki.%,chapcomwiki.%,checkuserwiki.%,collabwiki.%,ecwikimedia.%,electcomwiki.%,execwiki.%,fdcwiki.%,grantswiki.%,id_internalwikimedia.%,iegcomwiki.%,ilwikimedia.%,internalwiki.%,legalteamwiki.%,movementroleswiki.%,noboard_chapterswikimedia.%,officewiki.%,ombudsmenwiki.%,otrs_wikiwiki.%,projectcomwiki.%,searchcomwiki.%,spcomwiki.%,stewardwiki.%,sysop_itwiki.%,sysop_plwiki.%,techconductwiki.%,transitionteamwiki.%,wg_enwiki.%,wikimaniateamwiki.%,zerowiki.%,%.__wmf_checksums,%.accountaudit_login,%.arbcom1_vote,%.archive_old,%.blob_orphans,%.blob_tracking,%.bot_passwords,%.bv2009_edits,%.categorylinks_old,%.click_tracking,%.cu_useragent,%.cu_changes,%.cu_log,%.cu_log_event,%.cu_private_event,%.cu_useragent_clienthints,%.cu_useragent_clienthints_map,%.cur,%.discussiontools_subscription,%.echo_email_batch,%.echo_event,%.echo_target_page,%.echo_unread_wikis,%.echo_notification,%.echo_push_subscription,%.edit_page_tracking,%.email_capture,%.exarchive,%.exrevision,%.globalnames,%.growthexperiments_link_recommendations,%.growthexperiments_link_submissions,%.growthexperiments_mentor_mentee,%.growthexperiments_mentee_data,%.growthexperiments_user_impact,%.hidden,%.image_old,%.ipinfo_ip_changes,%.job,%.ldap_domains,%.linkscc,%.localnames,%.log_search,%.logging_old,%.long_run_profiling,%.mediamoderation_scan,%.migrateuser_medium,%.moodbar_feedback,%.moodbar_feedback_response,%.msg_resource,%.oauth_accepted_consumer,%.oathauth_devices,%.oauth_ratelimit_client_tier,%.oauth_registered_consumer,%.oathauth_types,%.oauth2_access_tokens,%.objectcache,%.old_growth,%.oldimage_old,%.optin_survey,%.prefstats,%.prefswitch_survey,%.profiling,%.querycache,%.querycache_info,%.querycache_old,%.querycachetwo,%.reading_list,%.reading_list_entry,%.securepoll_cookie_match,%.securepoll_elections,%.securepoll_entity,%.securepoll_lists,%.securepoll_msgs,%.securepoll_options,%.securepoll_properties,%.securepoll_questions,%.securepoll_strike,%.securepoll_voters,%.securepoll_votes,%.spoofuser,%.text,%.titlekey,%.transcache,%.translate_cache,%.uploadstash,%.urlshortcodes,%.user_newtalk,%.vote_log,%.watchlist,%.watchlist_expiry,%.wikimedia_editor_tasks_counts,%.wikimedia_editor_tasks_keys,%.wikimedia_editor_tasks_targets_passed
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 103753911
Relay_Log_Space: 103755030
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: Yes
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 172000011
Master_SSL_Crl:
Master_SSL_Crlpath:
Using_Gtid: Slave_Pos
Gtid_IO_Pos: 180355171-180355171-148310907,180359172-180359172-49702203,171970637-171970637-2116621969,171978826-171978826-931149646,180363268-180363268-3447080256,171978768-171978768-202416,171970745-171970745-3651346146,171978774-171978774-5,180359179-180359179-96523837,171970572-171970572-3935877275,171970661-171970661-3655324752,171978777-171978777-514400352,171970704-171970704-351094624,171974720-171974720-2572451842,180355190-180355190-1378262411,0-171970637-5484646134,172000011-172000011-21,171974884-171974884-1473084269
Replicate_Do_Domain_Ids:
Replicate_Ignore_Domain_Ids:
Parallel_Mode: optimistic
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: closing tables
Slave_DDL_Groups: 3
Slave_Non_Transactional_Groups: 0
Slave_Transactional_Groups: 171686624
1 row in set (0.001 sec)
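For readers skimming the status output above, the fields that usually matter for a quick health check are Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master. The following is a minimal sketch, not the team's tooling: it assumes the output has been captured as text with one "Key: value" pair per line, and the names parse_status, replica_healthy and max_lag_seconds are hypothetical.

```python
import re

# Shortened sample copied from the status output above.
SAMPLE = """\
*************************** 1. row ***************************
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_SQL_Error:
Seconds_Behind_Master: 0
Using_Gtid: Slave_Pos
"""


def parse_status(text):
    """Turn 'Key: value' lines of vertical-format status output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip the row separator and anything that is not a plain field name.
        if sep and re.fullmatch(r"\w+", key):
            fields[key] = value.strip()
    return fields


def replica_healthy(fields, max_lag_seconds=1):
    """Both threads must run and the lag must be a small number.

    max_lag_seconds is an assumed threshold, not a value from the source.
    """
    lag = fields.get("Seconds_Behind_Master", "NULL")
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and lag.isdigit()  # the server reports NULL when lag is unknown
        and int(lag) <= max_lag_seconds
    )


if __name__ == "__main__":
    print(replica_healthy(parse_status(SAMPLE)))
```

Empty values such as Last_SQL_Error are kept as empty strings, so a caller can still tell "no error" apart from "field missing".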
use enwiki; show triggers;
Once reviewed, I will get this deployed, probably directly on the masters with replication, as the table is small.
By default /srv is preserved
@BTullis clouddb1021 belongs to your team, so could you take care of that one?
According to the wiki replicas responsibilities documents, OS upgrades are not performed by Data-Persistence but we would be happy to help if something needs our attention.
One more test in db1191:
This is all done
Mon, May 20
s7: frwiktionary is done directly on the master with replication enabled. metawiki is a bit bigger and will require a replica-by-replica schema change, as the schema change there takes around 1 min.
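The choice described here can be read as a simple rule: a replicated ALTER is generally re-executed on each replica and tends to hold up replication for about as long as it runs, so only fast changes go through the master. A minimal sketch of that rule follows. The cutoff value and the names MAX_REPLICATED_ALTER_SECONDS and choose_rollout are hypothetical; the comment above only states that an ALTER of around 1 minute was considered too slow.

```python
# Hypothetical cutoff: the comment above only says that a ~1 minute ALTER
# was rolled out replica by replica; 30 seconds is an assumed value.
MAX_REPLICATED_ALTER_SECONDS = 30


def choose_rollout(alter_seconds, max_replicated=MAX_REPLICATED_ALTER_SECONDS):
    """Pick a rollout strategy from the ALTER duration measured on one host."""
    if alter_seconds <= max_replicated:
        # Fast enough to run once on the master and let it replicate.
        return "master-with-replication"
    # Slow enough to cause visible lag, so apply it host by host instead.
    return "replica-by-replica"


if __name__ == "__main__":
    print(choose_rollout(2))
    print(choose_rollout(60))
```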
s8 wikidata is done with replication directly on the master
It was a password getting out of sync; I restarted it with:
racadm set iDRAC.Users.2.Password XXXX
It looks bad indeed
[960563.875753] megaraid_sas 0000:18:00.0: 2151 (769510480s/0x0004/CRIT) - Enclosure PD 20(c None/p1) phy bad for slot 4
Compiled 10.6.18 and testing the package installation etc on db1125
root@db1125:~# /opt/wmf-mariadb106/bin/mysqld --version
/opt/wmf-mariadb106/bin/mysqld Ver 10.6.18-MariaDB-log for Linux on x86_64 (MariaDB Server)
Thanks @Scott_French - what I have done is:
Running this on the old s8 master (db2161)
The pending schema change on the old master will be tracked in this task T364299
db2150 looking okay too after the change:
cumin2024@db2150.codfw.wmnet[centralauth]> FLUSH STATUS; pager cat > /dev/null;
SELECT /* MediaWiki\Extension\GlobalBlocking\GlobalBlocking::getGlobalBlockingBlock */ gb_id,gb_address,gb_by,gb_by_wiki,gb_reason,gb_timestamp,gb_anon_only,gb_expiry,gb_range_start,gb_range_end
FROM `globalblocks`
WHERE (gb_range_start LIKE 'v6-2%' ESCAPE '`' )
AND (gb_range_start <= 'v6-2A02C207204105070000000000000099')
AND (gb_range_end >= 'v6-2A02C207204105070000000000000099')
AND (gb_expiry > '20220505113815') ;
nopager; SHOW STATUS like 'Hand%';
Query OK, 0 rows affected (0.032 sec)
Let us know when the wiki is created so we can sanitize it
I have not seen any slow queries logged on the globalblocks table so far. I am going to alter this on a few more hosts and wait a little longer again.
A new 10.4 version has also been released, but given that we only have a few hosts running 10.4 (mostly s1 - T364290), I won't be compiling it.
Fri, May 17
Enabled slow query log in db2122 to capture queries that take longer than 10 seconds.
So s7 is fully on 10.6, and I've tried again on db2122:
cumin2024@db2122.codfw.wmnet[centralauth]> ALTER TABLE globalblocks CHANGE gb_timestamp gb_timestamp varchar(14) NOT NULL, CHANGE gb_expiry gb_expiry VARBINARY(14) DEFAULT '' NOT NULL;
Query OK, 0 rows affected (0.036 sec)
Records: 0 Duplicates: 0 Warnings: 0
All done
Still running in s8 codfw. The schema change takes around 11h per host there.
Thu, May 16
Ok I believe all the steps are done.
While trying to add pc1011 and pc2011 to s1, we got dbctl errors:
Object dbconfig failed to validate: 'pc1' does not match any of the regexes: 'DEFAULT', '^s1[01]$', '^s[124-8]$' The actual value was: {'externalLoads': defaultdict(<function DbConfig.compute_config.<locals>.<lambda> at 0x7f486cd72550>,