CVE-2023-37303: Wikimedia\Rdbms\DBQueryDisconnectedError when blocking user
Closed, Resolved | Public | Security

Description

Steps to replicate the issue (include links if applicable):

What happens?:
Web browser spins for 10-15 seconds, then gives a DB error. Example:
[4f8a0ed5-a91b-4e6e-9888-255aee4b8684] 2023-06-06 20:14:03: Fatal exception of type "Wikimedia\Rdbms\DBQueryDisconnectedError"

What should have happened instead?:
User should have been blocked.

Traceback

[4f8a0ed5-a91b-4e6e-9888-255aee4b8684] /wiki/Special:Block/MalnadachBot   Wikimedia\Rdbms\DBQueryDisconnectedError: A connection error occurred during a query. 
Query: SELECT  cuc_ip  FROM `cu_changes` JOIN `actor` ON ((actor_id=cuc_actor))   WHERE actor_user = 41830889  ORDER BY cuc_timestamp DESC LIMIT 1  
Function: MediaWiki\CheckUser\Hooks::onPerformRetroactiveAutoblock
Error: 2006 MySQL server has gone away

from /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/Database.php(1296)
#0 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/Database.php(1282): Wikimedia\Rdbms\Database->getQueryException(string, integer, string, string)
#1 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/Database.php(1256): Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer, string, string)
#2 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/Database.php(743): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#3 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/Database.php(1425): Wikimedia\Rdbms\Database->query(Wikimedia\Rdbms\Query, string)
#4 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/DBConnRef.php(119): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
#5 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/database/DBConnRef.php(341): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#6 /srv/mediawiki/php-1.41.0-wmf.11/includes/libs/rdbms/querybuilder/SelectQueryBuilder.php(656): Wikimedia\Rdbms\DBConnRef->select(array, array, array, string, array, array)
#7 /srv/mediawiki/php-1.41.0-wmf.11/extensions/CheckUser/src/Hooks.php(1190): Wikimedia\Rdbms\SelectQueryBuilder->fetchResultSet()
#8 /srv/mediawiki/php-1.41.0-wmf.11/includes/HookContainer/HookContainer.php(160): MediaWiki\CheckUser\Hooks->onPerformRetroactiveAutoblock(MediaWiki\Block\DatabaseBlock, array)
#9 /srv/mediawiki/php-1.41.0-wmf.11/includes/HookContainer/HookRunner.php(3036): MediaWiki\HookContainer\HookContainer->run(string, array)
#10 /srv/mediawiki/php-1.41.0-wmf.11/includes/block/DatabaseBlockStore.php(510): MediaWiki\HookContainer\HookRunner->onPerformRetroactiveAutoblock(MediaWiki\Block\DatabaseBlock, array)
#11 /srv/mediawiki/php-1.41.0-wmf.11/includes/block/DatabaseBlockStore.php(264): MediaWiki\Block\DatabaseBlockStore->doRetroactiveAutoblock(MediaWiki\Block\DatabaseBlock)
#12 /srv/mediawiki/php-1.41.0-wmf.11/includes/block/BlockUser.php(590): MediaWiki\Block\DatabaseBlockStore->insertBlock(MediaWiki\Block\DatabaseBlock)
#13 /srv/mediawiki/php-1.41.0-wmf.11/includes/block/BlockUser.php(532): MediaWiki\Block\BlockUser->placeBlockInternal(boolean)
#14 /srv/mediawiki/php-1.41.0-wmf.11/includes/block/BlockUser.php(466): MediaWiki\Block\BlockUser->placeBlockUnsafe(boolean)
#15 /srv/mediawiki/php-1.41.0-wmf.11/includes/specials/SpecialBlock.php(934): MediaWiki\Block\BlockUser->placeBlock(boolean)
#16 /srv/mediawiki/php-1.41.0-wmf.11/includes/specials/SpecialBlock.php(1048): MediaWiki\Specials\SpecialBlock::processFormInternal(array, User, MediaWiki\Block\UserBlockCommandFactory, MediaWiki\Block\BlockUtils)
#17 /srv/mediawiki/php-1.41.0-wmf.11/includes/htmlform/HTMLForm.php(744): MediaWiki\Specials\SpecialBlock->onSubmit(array, OOUIHTMLForm)
#18 /srv/mediawiki/php-1.41.0-wmf.11/includes/htmlform/HTMLForm.php(624): HTMLForm->trySubmit()
#19 /srv/mediawiki/php-1.41.0-wmf.11/includes/htmlform/HTMLForm.php(640): HTMLForm->tryAuthorizedSubmit()
#20 /srv/mediawiki/php-1.41.0-wmf.11/includes/specialpage/FormSpecialPage.php(224): HTMLForm->show()
#21 /srv/mediawiki/php-1.41.0-wmf.11/includes/specialpage/SpecialPage.php(701): FormSpecialPage->execute(string)
#22 /srv/mediawiki/php-1.41.0-wmf.11/includes/specialpage/SpecialPageFactory.php(1554): SpecialPage->run(string)
#23 /srv/mediawiki/php-1.41.0-wmf.11/includes/MediaWiki.php(328): MediaWiki\SpecialPage\SpecialPageFactory->executePath(string, RequestContext)
#24 /srv/mediawiki/php-1.41.0-wmf.11/includes/MediaWiki.php(925): MediaWiki->performRequest()
#25 /srv/mediawiki/php-1.41.0-wmf.11/includes/MediaWiki.php(579): MediaWiki->main()
#26 /srv/mediawiki/php-1.41.0-wmf.11/index.php(50): MediaWiki->run()
#27 /srv/mediawiki/php-1.41.0-wmf.11/index.php(46): wfIndexMain()
#28 /srv/mediawiki/w/index.php(3): require(string)
#29 {main}

Event Timeline

I hypothesize that this may be related to autoblock - I can explain further, but the information is protected by ANPDP, so I'll have to share it privately somehow. Didn't want to turn this into a security-restricted ticket and share it here right off the bat in case I'm wrong, since it really is just a guess.

I've created an NDA-protected paste with the explanation (h/t @AntiCompositeNumber for the suggestion) at P48918.

Urbanecm set Security to Software security bug. (Edited; Jun 6 2023, 8:55 PM)
Urbanecm added projects: Security, Security-Team.
Urbanecm changed the visibility from "Public (No Login Required)" to "Custom Policy".
Urbanecm changed the subtype of this task from "Bug Report" to "Security Issue".
Urbanecm subscribed.

This is a serious DoS vector that affects blocking. Protecting.

First of all: I confirmed with @GeneralNotability over IRC that:

  1. The issue was reproducible (several attempts to block the user were made)
  2. The current solution (user blocked without autoblock enabled) is acceptable (this issue is not blocking an admin from blocking someone at this point).

According to Logstash, MediaWiki ran a SQL query that failed to complete within a reasonable timeframe. Running the query manually took 1 min 10 sec. The query was:

SELECT  cuc_ip  FROM `cu_changes` JOIN `actor` ON ((actor_id=cuc_actor))   WHERE actor_user = 41830889  ORDER BY cuc_timestamp DESC LIMIT 1;

As @GeneralNotability said in P48918, this happens because the user has a very large number of edits, so the query has to scan the cu_changes table. The problematic code is in CheckUser (specifically the onPerformRetroactiveAutoblock hook; see the traceback).

This query is significantly faster if the ordering is based on cuc_id (4 seconds) rather than cuc_timestamp (over a minute). Since cuc_id is assigned consecutively, the two orderings should be roughly equivalent, AFAICS.
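
To make the "roughly equivalent" caveat concrete, here is a minimal Python sketch. The `Change` tuple and both helper functions are hypothetical stand-ins, not CheckUser code, and the out-of-order row is an assumed scenario rather than something observed in this task. It shows that the two orderings agree when rows are written in chronological order and can disagree when a row with an older timestamp is written later.

```python
from typing import NamedTuple, Optional, Sequence


class Change(NamedTuple):
    """Simplified, hypothetical stand-in for a cu_changes row."""
    cuc_id: int
    cuc_timestamp: str  # MediaWiki-style YYYYMMDDHHMMSS string
    cuc_ip: str


def latest_ip_by_timestamp(rows: Sequence[Change]) -> Optional[str]:
    """Rough equivalent of ORDER BY cuc_timestamp DESC LIMIT 1."""
    if not rows:
        return None
    return max(rows, key=lambda r: r.cuc_timestamp).cuc_ip


def latest_ip_by_id(rows: Sequence[Change]) -> Optional[str]:
    """Rough equivalent of ORDER BY cuc_id DESC LIMIT 1."""
    if not rows:
        return None
    return max(rows, key=lambda r: r.cuc_id).cuc_ip


# Rows written in chronological order: both orderings pick the same row.
in_order = [
    Change(1, "20230601000000", "192.0.2.1"),
    Change(2, "20230606201403", "192.0.2.9"),
]

# Assumed scenario: row 3 is written last but carries an older timestamp.
out_of_order = in_order + [Change(3, "20230605000000", "192.0.2.5")]
```

With `in_order`, both helpers return the same IP. With `out_of_order`, ordering by ID picks the late-written row, while ordering by timestamp still picks the most recent activity.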

I'm wondering whether this should be fixed by avoiding the ordering in the query (which is the slow part). Do we really need the last used IP? Maybe we really should just autoblock all the IPs?

FYI @Zabe @Dreamy_Jazz

Urbanecm edited projects, added CheckUser; removed MediaWiki-Blocks.
Urbanecm added a project: MediaWiki-Blocks.

I would not be surprised if the CheckUser extension were susceptible to a similar DoS attack: if not in the PHP backend itself, then likely in the JavaScript that builds the summary table.

Those results are limited to 5,000 IPs / UAs at most, but I haven't tested how that would affect things.

I wonder if this query would run faster if it was forced to use the correct index? If I understand correctly, a similar SQL query with a force index statement is used by Special:CheckUser to generate the 'get edits' results. Has anyone tried running a CheckUser on the bot account and seeing if the query fails?

Using cuc_id as the ordering should be okay, but I'm unsure it would work well.

Autoblocking all IPs won't be a good idea. That could mean IPs used several months ago are autoblocked, which could have been reassigned. For highly dynamic IPs this could mean 50-100 IPs being autoblocked.

Okay. I've run Special:CheckUser on MalnadachBot:

  • Get IPs:
    • Loads in a reasonable time.
  • Get edits:
    • First page of results loads slowly, but does load without major issue (summary table lags the page a bit)
    • Asking for a second page by clicking "next 5,000" gives the following:

image.png (316×1 px, 51 KB)

  • Get users on a /64
    • First page loads very quickly.
    • Attempting to move to the second page by clicking "next 5,000" gives a DB error, but this DB error appears very quickly:

image.png (357×1 px, 52 KB)

Trying with Special:Investigate the following occurs:

  • Checking the user first loads the "IPs & User agents" tab, which shows a banner indicating the results are limited but loads in a reasonable time

image.png (184×1 px, 34 KB)

  • Clicking "Timeline" gives the following error:

image.png (348×1 px, 51 KB)

The Investigate and blocking issues are almost entirely explained by the fact that this bot account has made hundreds of thousands of edits in the last 3 months.

I wonder if this query would run faster if it was forced to use the correct index? If I understand correctly, a similar SQL query with a force index statement is used by Special:CheckUser to generate the 'get edits' results. Has anyone tried running a CheckUser on the bot account and seeing if the query fails?

Of course. For some reason I thought the query uses the correct index already. SELECT cuc_ip FROM cu_changes FORCE INDEX (cuc_actor_ip_time) JOIN actor ON ((actor_id=cuc_actor)) WHERE actor_user = 41830889 ORDER BY cuc_timestamp DESC LIMIT 1 works like a charm. Thanks!
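
As a rough illustration of what an index hint does, the sketch below rebuilds the query against an in-memory SQLite database. Several things here are assumptions: the table definitions are heavily simplified, the column list of `cuc_actor_ip_time` is a guess for demonstration purposes, and SQLite spells its hint `INDEXED BY` where MySQL/MariaDB use `FORCE INDEX (...)`. It is not the deployed patch, only a way to see the hinted index show up in a query plan.

```python
import sqlite3

# SQLite analogue of the hinted query from the thread.
# MySQL/MariaDB: FROM cu_changes FORCE INDEX (cuc_actor_ip_time)
# SQLite:        FROM cu_changes INDEXED BY cuc_actor_ip_time
LATEST_IP_SQL = """
    SELECT cuc_ip
    FROM cu_changes INDEXED BY cuc_actor_ip_time
    JOIN actor ON actor_id = cuc_actor
    WHERE actor_user = ?
    ORDER BY cuc_timestamp DESC
    LIMIT 1
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny, simplified stand-in for the actor and cu_changes tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_user INTEGER);
        CREATE TABLE cu_changes (
            cuc_id INTEGER PRIMARY KEY,
            cuc_actor INTEGER,
            cuc_ip TEXT,
            cuc_timestamp TEXT
        );
        -- Assumed column list; only the index name comes from the thread.
        CREATE INDEX cuc_actor_ip_time
            ON cu_changes (cuc_actor, cuc_ip, cuc_timestamp);
        """
    )
    conn.executemany(
        "INSERT INTO actor (actor_id, actor_user) VALUES (?, ?)",
        [(1, 41830889), (2, 7)],
    )
    conn.executemany(
        "INSERT INTO cu_changes (cuc_actor, cuc_ip, cuc_timestamp) VALUES (?, ?, ?)",
        [
            (1, "192.0.2.1", "20230601000000"),
            (1, "192.0.2.9", "20230606201403"),
            (1, "192.0.2.5", "20230603000000"),
            (2, "198.51.100.7", "20230607000000"),
        ],
    )
    return conn


def latest_ip(conn: sqlite3.Connection, user_id: int):
    """Return the most recently used IP for a user, or None if there is none."""
    row = conn.execute(LATEST_IP_SQL, (user_id,)).fetchone()
    return row[0] if row else None


def query_plan(conn: sqlite3.Connection, user_id: int):
    """Return the human-readable steps of SQLite's plan for the hinted query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + LATEST_IP_SQL, (user_id,)).fetchall()
    return [row[3] for row in rows]  # column 3 holds the plan detail text
```

Calling `query_plan(build_demo_db(), 41830889)` lists the plan steps, one of which names `cuc_actor_ip_time`. On a table this small the hint makes no measurable difference; the timings quoted in this task come from production data.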

Patch below:

Autoblocking all IPs won't be a good idea. That could mean IPs used several months ago are autoblocked, which could have been reassigned. For highly dynamic IPs this could mean 50-100 IPs being autoblocked.

Makes sense. A way to resolve this concern would be to use all IPs from the last day/week/similar, but given what you suggested above, I think that doesn't need to be changed here.
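
For reference, the "all IPs from the last day/week" idea could be sketched as follows. This is a hypothetical helper, not something that was implemented for this task. It assumes rows arrive as `(ip, timestamp)` pairs with MediaWiki-style `YYYYMMDDHHMMSS` timestamps.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

MW_TS_FORMAT = "%Y%m%d%H%M%S"  # MediaWiki-style timestamp, e.g. 20230606201403


def recent_ips(
    rows: Iterable[Tuple[str, str]], now: datetime, window: timedelta
) -> List[str]:
    """Return distinct IPs used within `window` of `now`, most recent first."""
    cutoff = now - window
    recent = [
        (ts, ip)
        for ip, ts in rows
        if datetime.strptime(ts, MW_TS_FORMAT) >= cutoff
    ]
    result: List[str] = []
    # Fixed-width timestamps sort chronologically as strings.
    for _ts, ip in sorted(recent, reverse=True):
        if ip not in result:
            result.append(ip)
    return result


# Hypothetical sample data using documentation-range IP addresses.
sample_rows = [
    ("192.0.2.1", "20230301000000"),
    ("192.0.2.9", "20230606201403"),
    ("192.0.2.5", "20230603000000"),
    ("192.0.2.9", "20230604000000"),
]
sample_now = datetime(2023, 6, 6, 21, 0, 0)
```

A one-week window over `sample_rows` keeps two addresses and drops the one last used in March, which is the reassignment concern raised above.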

Testing this patch now.

Adding Anti-Harassment as this affects Special:Investigate's Timeline mode.

+2

Thanks. Deploying:

02:01 <urbanecm> !log Deploying security patch for T338276
02:01 <+stashbot> Logged the message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log
sbassett changed the task status from Open to In Progress. (Jun 7 2023, 6:18 PM)
sbassett triaged this task as Medium priority.
sbassett moved this task from Security Patch To Deploy to Watching on the Security-Team board.
sbassett changed Author Affiliation from N/A to Wikimedia Communities.
sbassett changed Risk Rating from N/A to Medium.

Okay. I've run Special:CheckUser on MalnadachBot:

  • Get IPs:
    • Loads in a reasonable time.
  • Get edits:
    • First page of results loads slowly, but does load without major issue (summary table lags the page a bit)
    • Asking for a second page by clicking "next 5,000" gives the following:

image.png (316×1 px, 51 KB)

This has been fixed by T338287.

  • Get users on a /64
    • First page loads very quickly.
    • Attempting to move to the second page by clicking "next 5,000" gives a DB error, but this DB error appears very quickly:

image.png (357×1 px, 52 KB)

This has also been fixed by T338287.

Trying with Special:Investigate the following occurs:

  • Checking the user first loads the "IPs & User agents" tab, which shows a banner indicating the results are limited but loads in a reasonable time

image.png (184×1 px, 34 KB)

  • Clicking "Timeline" gives the following error:

image.png (348×1 px, 51 KB)

The Investigate and blocking issues are almost entirely explained by the fact that this bot account has made hundreds of thousands of edits in the last 3 months.

The Investigate issue has not been addressed, though a separate task for that issue would probably make sense. I will file another task for this.

All other issues raised when running checks have therefore been addressed. While the results do load slowly, they do not hit any limits. The speed of the CheckUser helper table could probably be improved by giving the script pre-compiled data, so that it is displayed rather than generated client-side.

I've created T338419 for the remaining issue with Special:Investigate.

For this task, fixes should be backported to all currently supported release versions. The fixes for 1.35, 1.38 and 1.39 will need a modified patch, as those versions do not have SelectQueryBuilder for building queries and/or need to use a differently named index.

Change 932823 had a related patch set uploaded (by Mstyles; author: Urbanecm):

[mediawiki/extensions/CheckUser@master] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/932823

Change 933629 had a related patch set uploaded (by Dreamy Jazz; author: Urbanecm):

[mediawiki/extensions/CheckUser@REL1_40] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933629

Change 933687 had a related patch set uploaded (by Dreamy Jazz; author: Urbanecm):

[mediawiki/extensions/CheckUser@REL1_39] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933687

Change 933688 had a related patch set uploaded (by Dreamy Jazz; author: Urbanecm):

[mediawiki/extensions/CheckUser@REL1_38] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933688

Change 933689 had a related patch set uploaded (by Dreamy Jazz; author: Urbanecm):

[mediawiki/extensions/CheckUser@REL1_35] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933689

Change 933629 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@REL1_40] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933629

Change 933688 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@REL1_38] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933688

Change 933687 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@REL1_39] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933687

Change 932823 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/932823

Change 933689 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@REL1_35] SECURITY: Close a DoS vector by an index hint

https://gerrit.wikimedia.org/r/933689

Fix merged into the main branch and also backported to supported release versions. Therefore, this can be marked as resolved.

Any reason to keep this task private? I'm currently not seeing one.

I see no need to keep this private either.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)". (Jun 29 2023, 6:24 PM)
Mstyles renamed this task from Wikimedia\Rdbms\DBQueryDisconnectedError when blocking user to CVE-2023-37303: Wikimedia\Rdbms\DBQueryDisconnectedError when blocking user. (Jun 30 2023, 5:48 PM)