Thank you!
Wed, Jun 19
Let's make this high priority, as the error is confusing enough to make one think this is broken :)
Thanks for all this work!!
Tue, Jun 18
In T367833#9903360, @Ladsgroup wrote: This is done I think but then maybe we should drop the grant on lists1001 then?
This is completed
I have renamed this table on db1169 (enwiki) and will leave it like that for a few days:
cumin2024@db1169.eqiad.wmnet[enwiki]> show tables like 'T%';
+-----------------------+
| Tables_in_enwiki (T%) |
+-----------------------+
| T367632_ipblocks      |
+-----------------------+
1 row in set (0.001 sec)
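Renaming a table and waiting a few days before dropping it is a cheap way to find out whether anything still reads from it, since the rename is trivially reversible. A minimal sketch of generating the two statements, assuming the original table name was ipblocks (suggested by the ipblocks.ibd file mentioned elsewhere in this thread); the function names are hypothetical and not part of any existing tooling:

```shell
# Hypothetical helpers: print the SQL for a "rename before drop" step.
# $1 = table name (assumed: ipblocks), $2 = task prefix (e.g. T367632).
rename_stmt() {
    printf 'RENAME TABLE `%s` TO `%s_%s`;\n' "$1" "$2" "$1"
}

# Print the statement that undoes the rename if something breaks.
rollback_stmt() {
    printf 'RENAME TABLE `%s_%s` TO `%s`;\n' "$2" "$1" "$1"
}
```

The output can be reviewed first and then piped into a client session on the target host, which keeps the destructive step separate from the generation step.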
codfw is now fixed.
Yes, we have RW and RO users like that in other services.
@Ladsgroup is this table on all wikis?
For what it's worth, this table has not been written to on enwiki since Apr 8:
root@db1163:/srv/sqldata/enwiki# ls -lh ipblocks.ibd
-rw-rw---- 1 mysql mysql 260M Apr 8 23:33 ipblocks.ibd
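The check above uses the modification time of the InnoDB tablespace file as a rough proxy for the last write. A minimal sketch of the same check as a shell function, assuming GNU coreutils (`stat -c %Y`); the function name `days_since_write` is hypothetical. Note that the mtime reflects when pages were last flushed to the file, so this is only an approximation of the last write:

```shell
# Hypothetical helper: print the number of whole days since a file
# was last modified. Assumes GNU stat, where -c %Y prints the mtime
# as seconds since the epoch.
days_since_write() {
    local file="$1"
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 1
    echo $(( (now - mtime) / 86400 ))
}
```

For example, `days_since_write /srv/sqldata/enwiki/ipblocks.ibd` on the host above would print the age of the file in days.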
We also need to include this host in zarcillo (I will do that)
Thanks @BTullis. If this host is going to replace clouddb1021, we need to update the documentation accordingly, as there are operations happening on clouddb1021 that should keep happening on this host (e.g. view creation).
In production we run 10.6. We have packages for Bullseye, but we encourage going to Bookworm directly if possible.
This is done
There is a problem before we can even check the grants: there's no connection between those two hosts and the proxies. I guess a firewall rule needs to be added somewhere:
Old 400
Mon, Jun 17
Sat, Jun 15
Fri, Jun 14
Sometimes it means the host is stuck at a memory check - should be visible onsite.
Fixed db1170 too. Leaving it like this till Monday.
This worked well on db2220 too:
cumin2024@db2220.codfw.wmnet[metawiki]> SELECT rc_id,rc_timestamp,rc_namespace,rc_title,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,rc_actor,recentchanges_actor.actor_user AS `rc_user`,recentchanges_actor.actor_name AS `rc_user_text`,comment_rc_comment.comment_text AS `rc_comment_text`,comment_rc_comment.comment_data AS `rc_comment_data`,comment_rc_comment.comment_id AS `rc_comment_cid`,rc_title,rc_namespace,wl_user,wl_notificationtimestamp,we_expiry,page_latest,(SELECT GROUP_CONCAT(ctd_name SEPARATOR ',') FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id)) WHERE (ct_rc_id=rc_id) ) AS `ts_tags` FROM `recentchanges` JOIN `actor` `recentchanges_actor` ON ((actor_id=rc_actor)) STRAIGHT_JOIN `comment` `comment_rc_comment` ON ((comment_rc_comment.comment_id = rc_comment_id)) LEFT JOIN `watchlist` ON (wl_user = 2134281 AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `watchlist_expiry` ON ((wl_id = we_item)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) WHERE (((actor_user IS NOT NULL))) AND rc_bot = 0 AND (rc_type != 6) AND (rc_source != 'wb') AND (rc_namespace NOT IN (1198,1199,866,867)) AND (rc_timestamp >= '20240601170235') AND rc_new IN (0,1) ORDER BY rc_timestamp DESC LIMIT 50 ;
^CCtrl-C -- query killed. Continuing normally.
ERROR 1317 (70100): Query execution was interrupted
I have fixed dbstore1008:3318 by importing the index stats from db1227:
cumin2024@dbstore1008.eqiad.wmnet[metawiki]> explain SELECT rc_id,rc_timestamp,rc_namespace,rc_title,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_source,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,rc_actor,recentchanges_actor.actor_user AS `rc_user`,recentchanges_actor.actor_name AS `rc_user_text`,comment_rc_comment.comment_text AS `rc_comment_text`,comment_rc_comment.comment_data AS `rc_comment_data`,comment_rc_comment.comment_id AS `rc_comment_cid`,rc_title,rc_namespace,wl_user,wl_notificationtimestamp,we_expiry,page_latest,(SELECT GROUP_CONCAT(ctd_name SEPARATOR ',') FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id)) WHERE (ct_rc_id=rc_id) ) AS `ts_tags` FROM `recentchanges` JOIN `actor` `recentchanges_actor` ON ((actor_id=rc_actor)) STRAIGHT_JOIN `comment` `comment_rc_comment` ON ((comment_rc_comment.comment_id = rc_comment_id)) LEFT JOIN `watchlist` ON (wl_user = 2134281 AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `watchlist_expiry` ON ((wl_id = we_item)) LEFT JOIN `page` ON ((rc_cur_id=page_id)) WHERE (((actor_user IS NOT NULL))) AND rc_bot = 0 AND (rc_type != 6) AND (rc_source != 'wb') AND (rc_namespace NOT IN (1198,1199,866,867)) AND (rc_timestamp >= '20240601170235') AND rc_new IN (0,1) ORDER BY rc_timestamp DESC LIMIT 50 ;
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------------------------------------------------------+--------------+---------+---------------------------------------------------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------------------------------------------------------+--------------+---------+---------------------------------------------------------------------------+--------+-------------+
| 1 | PRIMARY | recentchanges | range | rc_timestamp,rc_name_type_patrolled_timestamp,rc_ns_actor,rc_actor,rc_namespace_title_timestamp,rc_new_name_timestamp | rc_timestamp | 14 | NULL | 141164 | Using where |
| 1 | PRIMARY | watchlist | eq_ref | wl_user,wl_user_notificationtimestamp,wl_namespace_title | wl_user | 265 | const,metawiki.recentchanges.rc_namespace,metawiki.recentchanges.rc_title | 1 | |
| 1 | PRIMARY | watchlist_expiry | eq_ref | PRIMARY | PRIMARY | 4 | metawiki.watchlist.wl_id | 1 | Using where |
| 1 | PRIMARY | recentchanges_actor | eq_ref | PRIMARY,actor_user | PRIMARY | 8 | metawiki.recentchanges.rc_actor | 1 | Using where |
| 1 | PRIMARY | comment_rc_comment | eq_ref | PRIMARY | PRIMARY | 8 | metawiki.recentchanges.rc_comment_id | 1 | |
| 1 | PRIMARY | page | eq_ref | PRIMARY | PRIMARY | 4 | metawiki.recentchanges.rc_cur_id | 1 | |
| 2 | DEPENDENT SUBQUERY | change_tag | ref | ct_rc_tag_id,ct_tag_id_id | ct_rc_tag_id | 9 | metawiki.recentchanges.rc_id | 1 | Using index |
| 2 | DEPENDENT SUBQUERY | change_tag_def | eq_ref | PRIMARY | PRIMARY | 4 | metawiki.change_tag.ct_tag_id | 1 | |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------------------------------------------------------+--------------+---------+---------------------------------------------------------------------------+--------+-------------+
8 rows in set (0.004 sec)
There's nothing else to be done from our side. I've double checked that the database didn't replicate to sanitarium hosts and a check_private_data run is clean
Thu, Jun 13
Yeah, I know changing that code is probably not ideal. I'll try to see what I can do with the optimizer stats and reproduce the same plan db1227 uses.
We need to make sure that the data checks (for PII leaks) are also executed on this new host. Does it have the same puppet role as clouddb1021?
In T367261#9887635, @ABran-WMF wrote: this error popped today:
10:05:14 <+icinga-wm_> PROBLEM - MariaDB Replica SQL: s2 on db2125 is CRITICAL: CRITICAL slave_sql_state Slave_SQL_Running: No, Errno: 1034, Errmsg: Error Index for table recentchanges is corrupt: try to repair it on query. Default database: cswiki. [Query snipped] https://wikitech.wikimedia.org/wiki/MariaDB/troubleshooting%23Depooling_a_replica
@Jhancock.wm reminder, we do not need AAAA records on these hosts.
After a couple of days, there is no noticeable difference in replication performance in x2 between a host with 10.6 and a host with 10.11. The number of inserts isn't huge in this section (around 150 writes per second).
These are the config differences for now:
root@cumin1002:~# sudo pt-config-diff --defaults-file /root/.my.cnf h=db1153.eqiad.wmnet h=db1151.eqiad.wmnet
24 config differences
Variable                  db1153                    db1151
========================= ========================= =========================
basedir                   /opt/wmf-mariadb1011      /opt/wmf-mariadb106
character_sets_dir        /opt/wmf-mariadb1011/s... /opt/wmf-mariadb106/sh...
explicit_defaults_for_... ON                        OFF
general_log_file          db1153.log                db1151.log
gtid_binlog_pos           171966470-171966470-14... 171966470-171966470-14...
gtid_binlog_state         171966470-171966470-14... 171966470-171966470-14...
gtid_current_pos          0-171970580-683331037,... 0-171970580-683331037,...
gtid_domain_id            171978800                 171966470
gtid_slave_pos            0-171970580-683331037,... 0-171970580-683331037,...
hostname                  db1153                    db1151
innodb_buffer_pool_chu... 6325010432                134217728
innodb_prefix_index_cl... ON                        OFF
log_bin_basename          /srv/sqldata/db1153-bin   /srv/sqldata/db1151-bin
log_bin_index             /srv/sqldata/db1153-bi... /srv/sqldata/db1151-bi...
optimizer_prune_level     2                         1
pid_file                  /srv/sqldata/db1153.pid   /srv/sqldata/db1151.pid
plugin_dir                /opt/wmf-mariadb1011/l... /opt/wmf-mariadb106/li...
report_host               db1153.eqiad.wmnet        db1151.eqiad.wmnet
server_id                 171978800                 171966470
slave_transaction_retr... 1158,1159,1160,1161,12... 1158,1159,1160,1161,12...
slow_query_log_file       db1153-slow.log           db1151-slow.log
version                   10.11.8-MariaDB-log       10.6.16-MariaDB-log
version_source_revision   3a069644682e336e445039... b83c379420a8846ae4b287...
wsrep_node_name           db1153                    db1151
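Most of the differences above are expected between any two hosts (hostname, server_id, file paths, GTID state, version strings). A rough sketch for narrowing the report down to the variables that may actually matter, assuming the pt-config-diff report arrives on stdin with one variable per line; the function name `filter_config_diff` and the ignore list are illustrative choices, not part of pt-config-diff itself:

```shell
# Hypothetical filter: drop the report header lines and any variables
# that naturally differ per host or per installed version.
# The ignore list is an assumption; adjust it to the environment.
filter_config_diff() {
    grep -Ev '^([0-9]+ config difference|Variable[[:space:]]|=+|(basedir|character_sets_dir|general_log_file|gtid_[a-z_]+|hostname|log_bin_(basename|index)|pid_file|plugin_dir|report_host|server_id|slow_query_log_file|version(_source_revision)?|wsrep_node_name)[[:space:]])'
}
```

Applied to the report above, this should leave only the explicit_defaults_for_..., innodb_buffer_pool_chu..., innodb_prefix_index_cl..., optimizer_prune_level and slave_transaction_retr... rows.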
I am running this on the s4 candidate master, so it is ready for T367378: Switchover s4 master (db1160 -> db1238)