Page MenuHomePhabricator

SelectQueryBuilder::fetchRowCount() always returns 1 for DISTINCT queries
Closed, ResolvedPublicBUG REPORT

Description

Steps to reproduce

  1. Take a wiki that has pages in more than one namespace.
  2. Open a PHP shell (e.g. maintenance/shell.php).
  3. Issue the following command: MediaWiki\MediaWikiServices::getInstance()->getDBLoadBalancer()->getConnection( DB_REPLICA )->newSelectQueryBuilder()->from( 'page' )->select( 'page_namespace' )->distinct()->fetchFieldValues();
  4. Notice that it returns an array with multiple values.
  5. Now issue the following command: MediaWiki\MediaWikiServices::getInstance()->getDBLoadBalancer()->getConnection( DB_REPLICA )->newSelectQueryBuilder()->from( 'page' )->select( 'page_namespace' )->distinct()->fetchRowCount();

Actual result

  1. Notice that ->fetchRowCount() always returns 1, however many namespaces you have pages in.

Expected result

  1. Notice that ->fetchRowCount() returns the length of the array returned by ->fetchFieldValues().

Software version

MediaWiki 1.41.0-alpha (47f5a23)

Debugging results

->fetchFieldValues() runs the following SQL command:

SELECT DISTINCT page_namespace AS `value` FROM `page`;

All good.

->fetchRowCount() runs the following SQL command:

SELECT COUNT(*) AS `rowcount` FROM (SELECT DISTINCT 1 FROM `page` WHERE (page_namespace IS NOT NULL)) `tmp_count`;

Of course, there’s only one distinct value of 1

Why doesn’t it do the following?

SELECT COUNT(*) AS `rowcount` FROM (SELECT DISTINCT page_namespace FROM `page`) `tmp_count`;

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Ladsgroup subscribed.

That's not really DBA stuff, I remove the tag but keep myself subscribed.

@tstarling Could you take a look at this to see if it's something related to the changes you have worked on recently?

I don't think it has ever worked. It can probably be made to work.

@Tacsipacsi Can you confirm that you really want to use fetchRowCount()? It is not supposed to replace all COUNT() expressions, it's mostly just for counting with a limit. You can use ->select( 'COUNT(DISTINCT page_namespace)' )->fetchField() if you want a count without a limit. I will add documentation along these lines.

The suggested query would defeat a limit because it will scan arbitrarily many rows with identical values.

Change 903359 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] SelectQueryBuilder: clarify fetchRowCount() use cases

https://gerrit.wikimedia.org/r/903359

Change 903360 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] Support for selectRowCount() with DISTINCT

https://gerrit.wikimedia.org/r/903360

Change 903359 merged by jenkins-bot:

[mediawiki/core@master] SelectQueryBuilder: clarify fetchRowCount() use cases

https://gerrit.wikimedia.org/r/903359

Change 903360 merged by jenkins-bot:

[mediawiki/core@master] Support for selectRowCount() with DISTINCT

https://gerrit.wikimedia.org/r/903360