Page MenuHomePhabricator

Ladsgroup (Amir Sarabadani)
Shah of Bugs, Emir of database architecture, World-renowned rubber duckAdministrator

Today

  • Clear sailing ahead.

Tomorrow

  • Clear sailing ahead.

Sunday

  • Clear sailing ahead.

User Details

User Since
Oct 6 2014, 9:53 PM (397 w, 3 d)
Roles
Administrator
Availability
Available
IRC Nick
Amir1
LDAP User
Ladsgroup
MediaWiki User
Ladsgroup [ Global Accounts ]

Staff Database Architect in SRE data persistence team in WMF. Used to be Wikidata software engineer in WMDE

I'm also open source enthusiast, mediawiki volunteer developer, and long-term Wikipedian.

All edits on tickets about databases are in my work capacity and anything else is in my volunteer capacity unless mentioned otherwise.

Babel: fa-N, en-4, de-2, tr-1, hu-1

Recent Activity

Today

Ladsgroup closed T139052: Portal Page Update: wiktionary.org as Resolved.

Let's call it done.

Fri, May 20, 2:25 PM · Patch-Needs-Improvement, Discovery-Portal-Backlog, Wikimedia-Portals, Discovery
Ladsgroup closed T139052: Portal Page Update: wiktionary.org , a subtask of T230593: Unify the display of all Wikimedia portals, as Resolved.
Fri, May 20, 2:25 PM · Epic, Discovery-Portal-Sprint, Design, Wikimedia-Portals
Ladsgroup closed T305283: Add Wikidata support for kcgwiki, a subtask of T305281: Post-creation work for kcgwiki, as Resolved.
Fri, May 20, 1:15 PM · Wiki-Setup
Ladsgroup closed T305283: Add Wikidata support for kcgwiki as Resolved.
Fri, May 20, 1:15 PM · Wikidata, Wikidata-Campsite
Ladsgroup claimed T305283: Add Wikidata support for kcgwiki.

I'm doing it

Fri, May 20, 12:41 PM · Wikidata, Wikidata-Campsite
Ladsgroup added a comment to T308380: 2022-05-14 Databases.

db1131` was not depooled immediately (later for T298555 which should be unrelated?).

Fri, May 20, 11:23 AM · Wikimedia-production-error, DBA, Wikimedia-Incident
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Fri, May 20, 11:15 AM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis.
Fri, May 20, 11:12 AM · Blocked-on-schema-change, DBA
Ladsgroup added a comment to T304629: Deploy updated www.wiktionary.org page.

3 questions/notes:

Also remember to inform the Meta administrators on this talk page so they can decommission the system of modules that notifies them when the Wiktionary portal template needs to be updated.

Reminder of this action-item.

Fri, May 20, 8:41 AM · Patch-For-Review, User-notice, Wikimedia-Portals
Ladsgroup closed T307738: Actor name can not be empty for 0 and 3215898 as Resolved.

Fixed.

Fri, May 20, 8:38 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Wikimedia-production-error, MediaWiki-User-management, MediaWiki-REST-API
Ladsgroup added a comment to T307738: Actor name can not be empty for 0 and 3215898.

Found it:

ladsgroup@mwmaint1002:~$ mwscript maintenance/findMissingActors.php --wiki=enwikibooks --field rev_actor --type broken
Finding invalid actor IDs in revision.rev_actor...
		ID	ACTOR
		32811	3215898
		33371	3215898
		1960367	3215898
		1960368	3215898
		1960370	3215898
		1960371	3215898
		1960372	3215898
		1960374	3215898
		1960375	3215898
		1960376	3215898
		1960377	3215898
		1960378	3215898
		1960379	3215898
		1890748	3215898
		1890967	3215898
		1890759	3215898
		1765627	3215898
	Found 17 invalid actor IDs.
Done.
Fri, May 20, 8:37 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Wikimedia-production-error, MediaWiki-User-management, MediaWiki-REST-API

Yesterday

Ladsgroup updated the task description for T306560: Fix nullability of img_major_mime and oi_major_mime.
Thu, May 19, 11:34 AM · DBA, Blocked-on-schema-change, User-Ladsgroup
Ladsgroup closed T301312: Switchover s1 master (db1118 -> db1163) as Resolved.
Thu, May 19, 6:19 AM · DBA
Ladsgroup closed T301312: Switchover s1 master (db1118 -> db1163), a subtask of T300402: Add namespace column to Linter table, as Resolved.
Thu, May 19, 6:19 AM · MediaWiki-extensions-Linter, DBA, Blocked-on-schema-change
Ladsgroup closed T301312: Switchover s1 master (db1118 -> db1163), a subtask of T300774: Drop fr_img_* columns, as Resolved.
Thu, May 19, 6:19 AM · DBA, Blocked-on-schema-change
Ladsgroup closed T301312: Switchover s1 master (db1118 -> db1163), a subtask of T303171: Upgrade s1 to Bullseye, as Resolved.
Thu, May 19, 6:19 AM · DBA
Ladsgroup updated the task description for T300774: Drop fr_img_* columns.
Thu, May 19, 6:18 AM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T300402: Add namespace column to Linter table.
Thu, May 19, 6:18 AM · MediaWiki-extensions-Linter, DBA, Blocked-on-schema-change
Ladsgroup closed T300402: Add namespace column to Linter table, a subtask of T299612: Add namespace column and index to table, as Resolved.
Thu, May 19, 6:18 AM · MW-1.38-notes (1.38.0-wmf.20; 2022-01-31), Patch-For-Review, Documentation
Ladsgroup closed T300402: Add namespace column to Linter table as Resolved.

I did s1's old master db1118 now. Sorry for the delay.

Thu, May 19, 6:18 AM · MediaWiki-extensions-Linter, DBA, Blocked-on-schema-change
Ladsgroup closed T300992: Add linter_template and linter_tag columns to the Linter table, a subtask of T175177: Linter UI: Provide option to filter linter errors by second column (ex: obsolete tag, missing end tag, stripped tag, etc), as Resolved.
Thu, May 19, 6:15 AM · MW-1.38-notes (1.38.0-wmf.21; 2022-02-07), Patch-For-Review, MediaWiki-extensions-Linter
Ladsgroup closed T300992: Add linter_template and linter_tag columns to the Linter table as Resolved.
Thu, May 19, 6:15 AM · User-Ladsgroup, DBA, Blocked-on-schema-change, MediaWiki-extensions-Linter
Ladsgroup added a comment to T300992: Add linter_template and linter_tag columns to the Linter table.

This is done now. I sincerely apologize for the delay and will make sure this won't happen again.

Thu, May 19, 6:15 AM · User-Ladsgroup, DBA, Blocked-on-schema-change, MediaWiki-extensions-Linter
Ladsgroup updated the task description for T300992: Add linter_template and linter_tag columns to the Linter table.
Thu, May 19, 6:14 AM · User-Ladsgroup, DBA, Blocked-on-schema-change, MediaWiki-extensions-Linter
Ladsgroup updated the task description for T301312: Switchover s1 master (db1118 -> db1163).
Thu, May 19, 6:05 AM · DBA
Ladsgroup updated the task description for T301312: Switchover s1 master (db1118 -> db1163).
Thu, May 19, 5:58 AM · DBA
Ladsgroup added a comment to T301312: Switchover s1 master (db1118 -> db1163).

Noted

Thu, May 19, 5:53 AM · DBA
Ladsgroup updated the task description for T301312: Switchover s1 master (db1118 -> db1163).
Thu, May 19, 5:51 AM · DBA
Ladsgroup updated the task description for T301312: Switchover s1 master (db1118 -> db1163).
Thu, May 19, 5:34 AM · DBA
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Thu, May 19, 3:00 AM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Thu, May 19, 3:00 AM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Thu, May 19, 2:53 AM · DBA, Blocked-on-schema-change

Wed, May 18

Ladsgroup added a comment to T305218: 1.39.0-wmf.12 deployment blockers.

The json truncated thing is fixed now:

image.png (380×979 px, 42 KB)

Wed, May 18, 10:32 PM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
Ladsgroup updated subscribers of T305218: 1.39.0-wmf.12 deployment blockers.

This one was probably caused by yours truly: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/791702

Wed, May 18, 9:25 PM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
Ladsgroup updated the task description for T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis.
Wed, May 18, 7:46 PM · Blocked-on-schema-change, DBA
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Wed, May 18, 7:45 PM · DBA, Blocked-on-schema-change
Ladsgroup added a comment to T308084: Reduce DB space used by Echo notifications.

So far cleaned 50M rows:

mysql:research@x1-analytics-replica.eqiad.wmnet [wikidatawiki]> select count(*) from echo_event left join echo_notification on event_id = notification_event where notification_user is null limit 50;
+-----------+
| count(*)  |
+-----------+
| 323307873 |
+-----------+
1 row in set (19 min 20.872 sec)
Wed, May 18, 7:42 PM · Data-Persistence (Consultation), Growth-Team, Wikimedia-database-issue, Notifications
Ladsgroup triaged T308691: Fatal exception of type "CannotCreateActorException" when trying to export file from zhwikibooks to commons as Unbreak Now! priority.

Train blocker = UBN!

Wed, May 18, 7:39 PM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, WMDE-TechWish-Maintenance, Unplanned-Sprint-Work, WMDE-TechWish-Sprint-2022-05-11, Platform Engineering, Move-Files-To-Commons, Wikimedia-production-error
Ladsgroup added a comment to T308663: LogicException: This ParserOutput contains no text!.

Did we identify the root cause? (ie, what was the change being deployed that caused this to trigger?)

Wed, May 18, 4:05 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Platform Engineering, CommonsMetadata, MediaWiki-Parser, Wikimedia-production-error
Ladsgroup closed T308663: LogicException: This ParserOutput contains no text! as Resolved.

Testing it in test commons it seems it's fixed now: https://test-commons.wikimedia.org/w/index.php?title=File:Watch-foop.svg&oldid=4408

Wed, May 18, 3:37 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Platform Engineering, CommonsMetadata, MediaWiki-Parser, Wikimedia-production-error
Ladsgroup closed T308663: LogicException: This ParserOutput contains no text!, a subtask of T305218: 1.39.0-wmf.12 deployment blockers, as Resolved.
Wed, May 18, 3:37 PM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
Ladsgroup added a project to T308663: LogicException: This ParserOutput contains no text!: Platform Engineering.

This seems like an issue for the platform team.

Wed, May 18, 2:09 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Platform Engineering, CommonsMetadata, MediaWiki-Parser, Wikimedia-production-error
Ladsgroup added a project to T308663: LogicException: This ParserOutput contains no text!: CommonsMetadata.
Wed, May 18, 1:49 PM · MW-1.39-notes (1.39.0-wmf.13; 2022-05-23), Platform Engineering, CommonsMetadata, MediaWiki-Parser, Wikimedia-production-error
Ladsgroup added a comment to T307328: Scalability issues of recentchanges table.

Some notes:

  • The total number of rows in rc table is not the full story, for example it's being used heavily, any increase in pressure from data in other tables can drastically increase the pressure on rc table. i.e. everything is connected.
  • Just looking at the number wouldn't tell how is the distribution of the data. For example, an increase in non-bot edits can have a different impact than increase in bot edits given the different indexes and access patterns.
  • Being large a couple of years ago also doesn't mean it wasn't a problem back then. We used to have basically an outage a month caused by the wikidata's db and we still have issues with that. Just last Sunday all of Wikipedia went down for a couple of minutes because of the pressure on s8.
  • RC table is designed to be small and nimble, for example it has ten indexes. Breaking that assumption causes all sorts of unexpected issues. I want to keep that assumption true not just for current usecases but also for future ones.
Wed, May 18, 11:41 AM · Performance-Team (Radar), Growth-Team, Data-Persistence (Consultation), MediaWiki-Recent-changes
Ladsgroup added a comment to T299421: Turn on write both in production for templatelinks normalization.

FWIW subquery didn't fix it:

SELECT  lt_title,tl_from_namespace,tl_from,lt_namespace AS `bl_namespace`,lt_title AS `bl_title`,page_namespace,page_title,page_id,page_content_model,page_is_redirect,page_is_new,page_latest,page_touched,page_len,page_restrictions  FROM `page`,`templatelinks` FORCE INDEX (tl_backlinks_namespace_target_id) JOIN `linktarget` ON ((tl_target_id=lt_id))   WHERE tl_from_namespace IN (0,1,2,3,4,5,12,13)  AND (tl_from = page_id) AND tl_target_id in (select lt_id from linktarget where (lt_namespace = 10 AND lt_title IN ('SPE','Semi-protection_étendue') )) AND page_is_redirect = 0  ORDER BY lt_title,tl_from_namespace,tl_from LIMIT 501
Wed, May 18, 11:14 AM · Patch-For-Review, DBA, User-Ladsgroup, Platform Engineering
Ladsgroup added a project to T306963: Integrate new section mapping database: Data-Persistence (Consultation).

Hi!
Yeah, putting it in gerrit wouldn't work. it's going to bloat the whole repo and its history and cloning it in the future can cause all sorts issues for the server (from apache to gerrit's db) and client. If it's data, it belongs to a database not a vcs.

Wed, May 18, 11:11 AM · Data-Persistence (Consultation), Patch-For-Review, Language-Team (Language-2022-April-June), SectionTranslation
Ladsgroup closed T308640: Error: Call to undefined method CirrusSearch\Connection::getPageType(), a subtask of T305218: 1.39.0-wmf.12 deployment blockers, as Resolved.
Wed, May 18, 9:09 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Patch-For-Review, Release-Engineering-Team (Priority Backlog 📥), Release, Train Deployments
Ladsgroup closed T308640: Error: Call to undefined method CirrusSearch\Connection::getPageType() as Resolved.

I fixed the immediate problem by backporting the patch:

image.png (228×652 px, 18 KB)

Wed, May 18, 9:09 AM · Discovery-Search, GeoData, Wikimedia-production-error
Ladsgroup updated the task description for T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis.
Wed, May 18, 8:05 AM · Blocked-on-schema-change, DBA
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Wed, May 18, 7:59 AM · DBA, Blocked-on-schema-change

Tue, May 17

Ladsgroup updated the task description for T300774: Drop fr_img_* columns.
Tue, May 17, 9:33 PM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T300774: Drop fr_img_* columns.
Tue, May 17, 5:25 PM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Tue, May 17, 5:22 PM · DBA, Blocked-on-schema-change
Ladsgroup added a comment to T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis.

I stopped it before the schema change. Since we have cumin reboot tomorrow. I don't want the schema change hanging atm.

Tue, May 17, 4:33 PM · Blocked-on-schema-change, DBA
Ladsgroup added a comment to T308443: Phan broken due to ResourceLoader namespace move.

FWIW the current blocker of php7.4 migration work seems to be T295578

Tue, May 17, 4:24 PM · Patch-For-Review, phan, Wikidata, ci-test-error (WMF-deployed Build Failure)
Ladsgroup updated the task description for T300774: Drop fr_img_* columns.
Tue, May 17, 1:53 PM · DBA, Blocked-on-schema-change
Ladsgroup moved T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis from Backlog to In progress on the Blocked-on-schema-change board.
Tue, May 17, 1:08 PM · Blocked-on-schema-change, DBA
Ladsgroup moved T298560: Fix mismatching field type of revision.rev_timestamp on wmf wikis from Backlog to In progress on the Blocked-on-schema-change board.
Tue, May 17, 1:08 PM · Blocked-on-schema-change, DBA
Ladsgroup moved T303603: Add actor and comment columns to cu_changes from Backlog to In progress on the Blocked-on-schema-change board.
Tue, May 17, 1:08 PM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T300774: Drop fr_img_* columns.
Tue, May 17, 1:04 PM · DBA, Blocked-on-schema-change
Ladsgroup claimed T300774: Drop fr_img_* columns.

Kormat is out sick. I take over.

Tue, May 17, 1:04 PM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Tue, May 17, 12:43 PM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T298560: Fix mismatching field type of revision.rev_timestamp on wmf wikis.
Tue, May 17, 12:42 PM · Blocked-on-schema-change, DBA
Ladsgroup updated the task description for T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis.
Tue, May 17, 12:40 PM · Blocked-on-schema-change, DBA
Ladsgroup updated the task description for T303603: Add actor and comment columns to cu_changes.
Tue, May 17, 12:21 PM · DBA, Blocked-on-schema-change
Ladsgroup claimed T298555: Fix mismatching field type of logging.log_timestamp on wmf wikis.

Kormat is out sick. I take over.

Tue, May 17, 12:14 PM · Blocked-on-schema-change, DBA
Ladsgroup claimed T298560: Fix mismatching field type of revision.rev_timestamp on wmf wikis.

Kormat is out sick. I take over.

Tue, May 17, 12:14 PM · Blocked-on-schema-change, DBA
Ladsgroup claimed T303603: Add actor and comment columns to cu_changes.

Kormat is out sick. I take over.

Tue, May 17, 12:13 PM · DBA, Blocked-on-schema-change
Ladsgroup updated the task description for T301312: Switchover s1 master (db1118 -> db1163).
Tue, May 17, 11:26 AM · DBA
Ladsgroup claimed T301312: Switchover s1 master (db1118 -> db1163).

Stevie Beth is out sick. So I'm taking over. Will do it this Thursday.

Tue, May 17, 11:21 AM · DBA
Ladsgroup added a comment to T299421: Turn on write both in production for templatelinks normalization.

Join decomposition makes it instant:

MariaDB [frwiki]> SELECT  lt_title,tl_from_namespace,tl_from,lt_namespace AS `bl_namespace`,lt_title AS `bl_title`,page_namespace,page_title,page_id,page_content_model,page_is_redirect,page_is_new,page_latest,page_touched,page_len,page_restrictions  FROM `page`,`templatelinks` FORCE INDEX (tl_backlinks_namespace_target_id) JOIN `linktarget` ON ((tl_target_id=lt_id))   WHERE tl_from_namespace IN (0,1,2,3,4,5,12,13)  AND (tl_from = page_id) AND tl_target_id IN (626975, 5802) AND page_is_redirect = 0  ORDER BY lt_title,tl_from_namespace,tl_from LIMIT 501;
...
298 rows in set (0.006 sec)
Tue, May 17, 11:11 AM · Patch-For-Review, DBA, User-Ladsgroup, Platform Engineering
Ladsgroup added a comment to T299421: Turn on write both in production for templatelinks normalization.

Explain:

MariaDB [frwiki]> explain SELECT  lt_title,tl_from_namespace,tl_from,lt_namespace AS `bl_namespace`,lt_title AS `bl_title`,page_namespace,page_title,page_id,page_content_model,page_is_redirect,page_is_new,page_latest,page_touched,page_len,page_restrictions  FROM `page`,`templatelinks` FORCE INDEX (tl_backlinks_namespace_target_id) JOIN `linktarget` ON ((tl_target_id=lt_id))   WHERE tl_from_namespace IN
(0,1,2,3,4,5,12,13)  AND (tl_from = page_id) AND ((lt_namespace = 10 AND lt_title IN ('SPE','Semi-protection_étendue') )) AND page_is_redirect = 0  ORDER BY lt_title,tl_from_namespace,tl_from LIMIT 5  ;
+------+-------------+---------------+------------+-------------------------------------+--------------------------------------------+---------+------------------------------+-----------+---------------------------------------------------------------+
| id   | select_type | table         | type       | possible_keys                       | key                                        | key_len | ref                          | rows      | Extra                                                         |
+------+-------------+---------------+------------+-------------------------------------+--------------------------------------------+---------+------------------------------+-----------+---------------------------------------------------------------+
|    1 | SIMPLE      | linktarget    | range      | PRIMARY,lt_namespace_title          | lt_namespace_title                         | 261     | NULL                         | 2         | Using where; Using index; Using temporary; Using filesort     |
|    1 | SIMPLE      | templatelinks | hash_range | tl_backlinks_namespace_target_id    | #hash#$hj:tl_backlinks_namespace_target_id | 9:4     | frwiki.linktarget.lt_id      | 191341573 | Using where; Using index; Using join buffer (flat, BNLH join) |
|    1 | SIMPLE      | page          | eq_ref     | PRIMARY,page_redirect_namespace_len | PRIMARY                                    | 4       | frwiki.templatelinks.tl_from | 1         | Using where                                                   |
+------+-------------+---------------+------------+-------------------------------------+--------------------------------------------+---------+------------------------------+-----------+---------------------------------------------------------------+
3 rows in set (0.057 sec)
Tue, May 17, 11:08 AM · Patch-For-Review, DBA, User-Ladsgroup, Platform Engineering
Ladsgroup added a comment to T248418: Roll out videojs as the only video/audio player on all Wikimedia wikis.

Hi, sure thing! I will collect the open design questions and bring them to the meeting.

Tue, May 17, 11:06 AM · User-Ladsgroup, Patch-For-Review, Readers-Web-Backlog, User-notice, Wikimedia-Site-requests, VideoJS player, Performance-Team (Radar), Multimedia, Wikimedia-Video
Ladsgroup added a comment to T299421: Turn on write both in production for templatelinks normalization.

Found another slow query in frwiki:

SELECT  lt_title,tl_from_namespace,tl_from,lt_namespace AS `bl_namespace`,lt_title AS `bl_title`,page_namespace,page_title,page_id,page_content_model,page_is_redirect,page_is_new,page_latest,page_touched,page_len,page_restrictions  FROM `page`,`templatelinks` FORCE INDEX (tl_backlinks_namespace_target_id) JOIN `linktarget` ON ((tl_target_id=lt_id))   WHERE tl_from_namespace IN (0,1,2,3,4,5,12,13)  AND (tl_from = page_id) AND ((lt_namespace = 10 AND lt_title IN ('SPE','Semi-protection_étendue') )) AND page_is_redirect = 0  ORDER BY lt_title,tl_from_namespace,tl_from LIMIT 501
Tue, May 17, 11:04 AM · Patch-For-Review, DBA, User-Ladsgroup, Platform Engineering
Ladsgroup closed T307295: Bot contributions page in Catalan wikipedia not displayed as Resolved.
Tue, May 17, 8:59 AM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), DBA, Wikimedia-production-error, Slow-DB-Query, MediaWiki-Logevents
Ladsgroup updated the task description for T299424: Run maintenance script backfilling tl_title_id.
Tue, May 17, 8:10 AM · MW-1.39-notes (1.39.0-wmf.8; 2022-04-18), DBA, User-Ladsgroup, Platform Engineering

Mon, May 16

Ladsgroup added a project to T308454: db2083 network issue: ops-codfw.

It might be a loose cable. Can you check please?

Mon, May 16, 4:31 PM · SRE, ops-codfw, DBA
Ladsgroup created T308454: db2083 network issue.
Mon, May 16, 4:27 PM · SRE, ops-codfw, DBA
Ladsgroup added a comment to T305901: Reset geo_tags auto_increment value to 1 in dewiki.

*sigh* InnoDB doesn't allow reset to 1. I tried it on testwiki.

Mon, May 16, 3:05 PM · DBA
Ladsgroup added a comment to T308084: Reduce DB space used by Echo notifications.

Started doing another round of clean up. It'll take a couple of days.

Mon, May 16, 11:45 AM · Data-Persistence (Consultation), Growth-Team, Wikimedia-database-issue, Notifications
Ladsgroup closed T308207: ApiQueryInfo::getProtectionInfo is slow on normalized templatelinks as Resolved.
Mon, May 16, 11:12 AM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), Patch-For-Review, DBA
Ladsgroup closed T308207: ApiQueryInfo::getProtectionInfo is slow on normalized templatelinks, a subtask of T299417: Normalize templatelinks table, as Resolved.
Mon, May 16, 11:12 AM · DBA, Platform Engineering, User-Ladsgroup
Ladsgroup added a comment to T307328: Scalability issues of recentchanges table.

Thanks for the investigation!!!

Mon, May 16, 10:49 AM · Performance-Team (Radar), Growth-Team, Data-Persistence (Consultation), MediaWiki-Recent-changes
Ladsgroup added a comment to T298485: MW scripts should reload the database config.

One way to tackle this and T305016 is to make maint scripts have one replica which would be a proxy db (possibly proxysql) and make all of the read queries to that db. Maint scripts already use a different db user, making them have a different replica set should be pretty easy and it would solve a lot of problems including db config reload, etc. etc.

Mon, May 16, 10:24 AM · Performance-Team (Radar), MediaWiki-Maintenance-system, User-Ladsgroup, DBA
Ladsgroup added a comment to T303596: DiscussionTools is triggering false duplicate parse logging in each of its API calls.

SGTM

Mon, May 16, 10:14 AM · MW-1.39-notes (1.39.0-wmf.12; 2022-05-16), Editing-team (FY2021-22 Kanban Board), DiscussionTools, Data-Persistence (Consultation), Regression

Sun, May 15

Ladsgroup added a comment to T307295: Bot contributions page in Catalan wikipedia not displayed.

Yeah, I'll patch it tomorrow. It wouldn't work in wikis that read from the temp table but we stopped reading from that table (I think that's why this bug started in the first place).

Sun, May 15, 9:47 AM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), DBA, Wikimedia-production-error, Slow-DB-Query, MediaWiki-Logevents
Ladsgroup added a comment to T307295: Bot contributions page in Catalan wikipedia not displayed.

FWIW, I tried optimize table on cawiki in a codfw db and it didn't fix the problem

Sun, May 15, 9:36 AM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), DBA, Wikimedia-production-error, Slow-DB-Query, MediaWiki-Logevents
Ladsgroup added a comment to T307295: Bot contributions page in Catalan wikipedia not displayed.

Forcing the right index makes it respond with 0.04 seconds:

MariaDB [cawiki]> SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,actor_rev_user.actor_user AS `rev_user`,actor_rev_user.actor_name AS `rev_user_text`,rev_actor,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`,ores_damaging_cls.oresc_probability AS `ores_damaging_score`,0.728 AS `ores_damaging_threshold`  FROM `revision` USE INDEX (rev_actor_timestamp) JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = rev_actor)) JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((actor_rev_user.actor_user != 0) AND (user_id = actor_rev_user.actor_user)) LEFT JOIN `ores_classification` `ores_damaging_cls` ON (ores_damaging_cls.oresc_model = 7 AND (ores_damaging_cls.oresc_rev=rev_id) AND ores_damaging_cls.oresc_class = 1)   WHERE ((rev_actor = 5684)) AND ((rev_deleted & 4) = 0)  ORDER BY rev_timestamp DESC,rev_id DESC LIMIT 51 ;
Sun, May 15, 9:15 AM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), DBA, Wikimedia-production-error, Slow-DB-Query, MediaWiki-Logevents
Ladsgroup added a comment to T307295: Bot contributions page in Catalan wikipedia not displayed.

So in general Special:Contributions with a mix of complex filters can lead to slow queries and timeouts but the weird case here is that Special:Contributions without any filters must be supported and it even has a dedicated index (INDEX rev_actor_timestamp (rev_actor, rev_timestamp, rev_id)), picking up page table first and then querying with revision index of rev_page_actor_timestamp is the wrong action from the query planner. It can be that due to populating rev_actor, it doesn't have the correct statistics and an optimize table revision would hopefully fix it. If not, I check what could have caused it. In that case, probably making the join on page LEFT JOIN and adding page_id IS NOT NULL to conditions might hint to the planner it shouldn't pick page table first. Or straight join. Will see.

Sun, May 15, 8:51 AM · MW-1.39-notes (1.39.0-wmf.10; 2022-05-02), DBA, Wikimedia-production-error, Slow-DB-Query, MediaWiki-Logevents

Sat, May 14

Ladsgroup added a comment to T308380: 2022-05-14 Databases.

s8 has so many replicas that I think its weights should rebalance automatically using some metrics like connection contention, latency, etc. I will see what I can do about it next week.

Sat, May 14, 10:55 AM · Wikimedia-production-error, DBA, Wikimedia-Incident

Fri, May 13

Ladsgroup updated the task description for T308013: Assign SPDX headers to puppet.git.
Fri, May 13, 1:03 PM · Patch-For-Review, Infrastructure-Foundations, SRE
Ladsgroup added a comment to T308084: Reduce DB space used by Echo notifications.

sigh. So the original query is this:

SELECT  event_id  FROM `echo_event` LEFT JOIN `echo_notification` ON ((notification_event=event_id)) LEFT JOIN `echo_email_batch` ON ((eeb_event_id=event_id))   WHERE notification_user IS NULL AND eeb_user_id IS NULL  ORDER BY event_id ASC LIMIT 500;

And it's extremely slow even when you force PRIMARY index on echo_event. The only way to make it work is to add an upper limit condition on event_id. Up to event_id being 360M there is not much left, but after that, it'll get really slow, so I had to manually edit the script file, add event_id below < 362M, let it run and delete 800K rows, then change the condition to event_id below 363M and again. I need to do this for around 400 times (It' be less if I make it jump even two or three million event_id) but yeah, fun stuff. I haven't even looked at other wikis.

Fri, May 13, 9:20 AM · Data-Persistence (Consultation), Growth-Team, Wikimedia-database-issue, Notifications
Ladsgroup added a comment to T308084: Reduce DB space used by Echo notifications.

It is basically impossible to run this on wikidata now :/ It is so slow that I'm worried it might bring down everything. The query itself doesn't seem too bad. Let me investigate.

Fri, May 13, 7:36 AM · Data-Persistence (Consultation), Growth-Team, Wikimedia-database-issue, Notifications
Ladsgroup added a comment to T273375: Raise minimum supported MySQL version to MySQL 5.6 (or later).

From production point of view:

  • We don't have MySQL so I really can't tell you which version would work and which one wouldn't
  • MariaDB: We fully got rid of 10.1, most of the infra is on 10.4 (10.4.22-10.4.24), We are migrating to 10.6 with several replicas now fully serving 10.6 in s1, s7 and s8.
    • The plan is to fully move to 10.6 later in the year.
  • OS: We got rid of stretch, most of the infra is in bullseye and hopefully will be 100% later this quarter.
Fri, May 13, 7:07 AM · Data-Persistence (Consultation), Proposal, MediaWiki-Installer
Ladsgroup added a comment to T256190: Remove or remodel small site ownership icons which used in every MediaWiki skin.

I'm sorta on it. Started the discussions about this a week ago on resourcing and timeline. No promises. It doesn't have to be done at the same time of DIP specially since it means more work for reading web team.

Fri, May 13, 6:58 AM · Design, MediaWiki-General, Accessibility

Thu, May 12

Ladsgroup added a comment to T307328: Scalability issues of recentchanges table.

I think with that context, it might make sense to exclude maintenance categories and other categories marked as "hidden". To my knowledge as a user and administrator of Wikimedia Commons, these are primarily monitored (if at all) through navigating to that category to work off the list in real-time, e.g. categories with missing information or requests for speedy deletion etc., not through one's watchlist where it would remain long after the request for information or admin action is dealt with.

Thu, May 12, 4:38 PM · Performance-Team (Radar), Growth-Team, Data-Persistence (Consultation), MediaWiki-Recent-changes
Ladsgroup added a comment to T307328: Scalability issues of recentchanges table.

In case of commons, 71% of 19M rows of the rc table is from mw.categorize source (T9148: Watch edits which add or remove pages from a category) Maybe that can be disabled in commons?

Thu, May 12, 2:12 PM · Performance-Team (Radar), Growth-Team, Data-Persistence (Consultation), MediaWiki-Recent-changes
Ladsgroup created P27810 (An Untitled Masterwork).
Thu, May 12, 1:52 PM
Ladsgroup added a subtask for T303171: Upgrade s1 to Bullseye: T301312: Switchover s1 master (db1118 -> db1163).
Thu, May 12, 1:38 PM · DBA
Ladsgroup added a parent task for T301312: Switchover s1 master (db1118 -> db1163): T303171: Upgrade s1 to Bullseye.
Thu, May 12, 1:38 PM · DBA