Page MenuHomePhabricator

Wikimedia\Rdbms\Database::selectSQLText: aggregation used with a locking SELECT (NewsletterDb::addNewsletterIssue)
Open, LowestPublic

Description

Error message
[deprecated]

 Wikimedia\Rdbms\Database::selectSQLText: aggregation used with a locking SELECT (NewsletterDb::addNewsletterIssue)
Stack Trace
N/A
Impact

Given it is a deprecation warning, it most likely means things are working as intended today. But this may stop working in the future.

Notes

Querying Logstash over the last 30 days, shows it's been triggered every week, so likely this is not a recent regression. The detection for this issue was introduced in 2018 as part of https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/425731/.

Details

Request ID
574d4853-2114-44e7-8bb3-26c67bcb70e7
Request URL
https://www.mediawiki.org/wiki/Special:Newsletter/3/announce

Event Timeline

Krinkle created this task.May 28 2020, 8:59 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMay 28 2020, 8:59 PM
Krinkle updated the task description. (Show Details)May 28 2020, 9:20 PM

This is low priority as the syntax is still supported in RDBMS today, however, it's unsure whether this is intentionally using that. Part of the reason to deprecate this in rdbms was that it is often used by accident, so this would be good to check out and make sure that's not to case.

Migration of this blocks removing of that trap from wikimedia-rdbms.

eprodromou triaged this task as Lowest priority.Jun 18 2020, 7:43 PM
eprodromou added a subscriber: eprodromou.

This seems to be a potential problem, but not one we can put time into right now. Iceboxing until it becomes more urgent.

eprodromou raised the priority of this task from Lowest to Needs Triage.Jun 18 2020, 7:44 PM
eprodromou triaged this task as Lowest priority.
eprodromou moved this task from Inbox to Icebox on the Platform Engineering board.
		$lastIssueId = (int)$dbw->selectField(
			'nl_issues',
			'MAX(nli_issue_id)',
			[ 'nli_newsletter_id' => $newsletter->getId() ],
			__METHOD__,
			[ 'FOR UPDATE' ]
		);

auto increment could be used for nli_issue_id, but than the id is unique for all newsletter issues and not per newsletter (nli_newsletter_id).
I have no idea how unique the number should/must be.

But the table also needs some investigation for T17441: Some tables lack unique or primary keys, may allow confusing duplicate data

One solution could be: No locking on the max value, trying to insert and when it fails, reselect and try again.
I have no knowledge how locking works here in mysql. If only the last row is locked, that useless as that is not updated.
If the gap after the max is locked, than it is useful to avoid another thread to insert something.

Tgr added a subscriber: Tgr.Fri, Jul 24, 10:22 AM

The comment for that warning says

// Some DB types (e.g. postgres) disallow FOR UPDATE with aggregate
// functions. Discourage use of such queries to encourage compatibility.

So this is a bug but not really a production bug. The fix is simply to use two separate queries.

I have no knowledge how locking works here in mysql. If only the last row is locked, that useless as that is not updated.
If the gap after the max is locked, than it is useful to avoid another thread to insert something.

The relevant index is (nli_newsletter_id, nli_issue_id) and the query condition is nli_newsletter_id = <some_newsletter_id> so the gap after (<some_newsletter_id>, <max_issue_id>) will be locked to ensure that nothing else can add/change data in the index range defined by that condition. (Or maybe MySQL can set index row locks on prefix subsets of composite indexes? Not sure, but the end result would be the same.) The locking logic is fine here, it's just that getting the max ID in the same query is a micro-optimization that only works in MySQL.