Page MenuHomePhabricator

CVE-2024-40613: Evil regex used to process gadget definitions
Closed, ResolvedPublicSecurity

Description

Upon reporting an evil regex in an unrelated extension, I was informed that it came from the Gadgets extension. Looking at its code, I have been able to identify two more cases of such regex being used by the extension's gadget definition repo. These regex are:

(\r\n|\r|\n)+
^==+ *([^*:\s|]+?)\s*==+\s*$
^\*+ *([a-zA-Z](?:[-_:.\w ]*[a-zA-Z0-9])?)(\s*\[.*?\])?\s*((\|[^|]*)+)\s*$

These could potentially be exploited to cause ReDOS via maliciously crafted input that does not match the regex.

Event Timeline

Is this really a security issue? Only people with interface-admin rights can edit this page. I think someone trusted enough to decide what javascript to run for all users is also trusted enough not to try a ReDOS attack. Not to mention the mitigating factor of the PCRE backtrackling limit.

[To be clear, still makes sense to change the regexes since it is easy to do so. At least the first & third one. Maybe I'm missing it, but i don't see why the second one is vulnerable and I'm a bit doubtful about the first one]

For (\r\n|\r|\n)+ how about [\r\n]+?

Actually, looking closer, I do not think any of these regexes are actually evil. It doesn't look like to me that any of them would have exponential backtrack behaviour (I'm not an expert on ReDOS and don't have super good intuition on this class of vulnerability, so i may be mistaken).

First one has nothing after the repeated subexpression that can possibly not match to force backtracking

Second one, perhaps it would involve polynomial backtracking, but i fail to see how this could induce exponential backtracking.

third one - it doesn't seem there is any way for \s*$ to not match in order to trigger backtracking.

[Alas I cannot use my 1984 joke about making evil regexes double-plus-good with possesive quantifiers]

For (\r\n|\r|\n)+ how about [\r\n]+?

I don't believe the original version is vulnerable, but [\r\n]+ would probably be simpler to understand anyways, and definitely not be vulnerable.

For the last one, which i also don't believe to be vulnerable, we could potentially change it to be ^\*+ *([a-zA-Z](?:[-_:.\w ]*[a-zA-Z0-9])?)(\s*\[.*?\])?\s*((\|[^|]*)++)\s*$ (note the extra + sign towards the end) which might make it easier to see at a glance it is safe. [Assuming I am not missing the reason why it is vulnerable]

Just a note, recheck (which checks for more than problematic star heights, backtracking, etc.) came up with the following findings:

  • (\r\n|\r|\n)+ - safe
  • ^==+ *([^*:\s|]+?)\s*==+\s*$ - vulnerable, 3rd order polynomial
  • ^\*+ *([a-zA-Z](?:[-_:.\w ]*[a-zA-Z0-9])?)(\s*\[.*?\])?\s*((\|[^|]*)+)\s*$ - safe

First one has nothing after the repeated subexpression that can possibly not match to force backtracking

It indeed looks like a false positive, apologies.

Second one, perhaps it would involve polynomial backtracking, but i fail to see how this could induce exponential backtracking.

I am not going to claim to be an expert on the difference between these two (as in how these look) but either way, it is still evil. You should be able to reproduce the issue with something like a fair number of =s followed by something that does not match.

third one - it doesn't seem there is any way for \s*$ to not match in order to trigger backtracking.

Yeah, I am not sure how to trigger it either but improving it to be efficient can't hurt.

Is this really a security issue? Only people with interface-admin rights can edit this page. I think someone trusted enough to decide what javascript to run for all users is also trusted enough not to try a ReDOS attack. Not to mention the mitigating factor of the PCRE backtrackling limit.

I agree but you have made a pretty clear case for fixing this yourself. :)

For (\r\n|\r|\n)+ how about [\r\n]+?

+1

Just a note, recheck (which checks for more than problematic star heights, backtracking, etc.) came up with the following findings:

That's useful, thanks.

Just a note, recheck (which checks for more than problematic star heights, backtracking, etc.) came up with the following findings:

  • (\r\n|\r|\n)+ - safe
  • ^==+ *([^*:\s|]+?)\s*==+\s*$ - vulnerable, 3rd order polynomial
  • ^\*+ *([a-zA-Z](?:[-_:.\w ]*[a-zA-Z0-9])?)(\s*\[.*?\])?\s*((\|[^|]*)+)\s*$ - safe

I imagine you would need an input quite a bit bigger than $wgMaxArticleSize before third order polynomial became problematic

I imagine you would need an input quite a bit bigger than $wgMaxArticleSize before third order polynomial became problematic

Actually, I just tested this and a rather small input was required to trigger catastrophic backtracking.

I imagine you would need an input quite a bit bigger than $wgMaxArticleSize before third order polynomial became problematic

Perhaps... unless an attacker is also aware of T363020.

In T363773#9759397, @R4356th wrote:

Actually, I just tested this and a rather small input was required to trigger catastrophic backtracking.

That's interesting.

Ah, i guess i was wrong here.

It seems like the cross over point where this becomes problematic is around 200KB

$str = "== " . str_repeat( '=', 200000 ) . ' ==';
$time1 = microtime( true );
preg_match( '/^==+ *([^*:\s|]+?)\s*==+\s*$/', $str ) or die( "backtrack limit hit" );
$time2 = microtime( true );

echo "Time is " . ( $time2-$time1 );

Time is 12.122364044189

[Surprisingly, pcre.backtrack_limit didn't seem to kick in as fast as you would hope, so may not mitigate this one as much as I naively expected]


A better version that i think is equivalent is:

/^==+ *([^*:\s|]+)\s*(?<!=)==+\s*$/

This is still vulnerable according to the redos checker (2nd order poly). However when i tried to actually test it, I wasn't able to really trigger a dos even when giving it 100's of mb of data.

A better version that i think is equivalent is:

/^==+ *([^*:\s|]+)\s*(?<!=)==+\s*$/

This is still vulnerable according to the redos checker (2nd order poly). However when i tried to actually test it, I wasn't able to really trigger a dos even when giving it 100's of mb of data.

In the absence of a better alternative, +1.

I do wonder what the WMF's PCRE backtrack limit is, however.

If anyone wants to write a patch with @Bawolff enhanced regex to address these issues, we would be pleased to review it and deploy it.

Apologies for the late response.

I do wonder what the WMF's PCRE backtrack limit is, however.

I believe it is 5000000 https://github.com/wikimedia/operations-deployment-charts/blob/master/charts/mediawiki/values.yaml#L195

Thank you. I think the regex should hold up fine.

If anyone wants to write a patch with @Bawolff enhanced regex to address these issues, we would be pleased to review it and deploy it.

I believe we could move ahead with this, @Bawolff.

In T363773#9788169, @R4356th wrote:

If anyone wants to write a patch with @Bawolff enhanced regex to address these issues, we would be pleased to review it and deploy it.

I believe we could move ahead with this, @Bawolff.

A basic patch implementing @Bawolff's new regexp from above:


I feel like this is likely low-risk to where it could just go through gerrit since, to exploit this on Wikimedia projects, one would need to first compromise an int-admin account in order to edit MediaWiki:Gadgets-definition.

sbassett changed the task status from Open to In Progress.May 14 2024, 7:42 PM
sbassett triaged this task as Low priority.
sbassett moved this task from Watching to In Progress on the Security-Team board.
sbassett changed Risk Rating from N/A to Low.
sbassett added a subscriber: gerritbot.

A basic patch implementing @Bawolff's new regexp from above:


I feel like this is likely low-risk to where it could just go through gerrit since, to exploit this on Wikimedia projects, one would need to first compromise an int-admin account in order to edit MediaWiki:Gadgets-definition.

+1

Change #1030565 had a related patch set uploaded (by SBassett; author: SBassett):

[mediawiki/extensions/Gadgets@master] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1030565

Change #1030565 merged by jenkins-bot:

[mediawiki/extensions/Gadgets@master] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1030565

Change #1036652 had a related patch set uploaded (by SBassett; author: SBassett):

[mediawiki/extensions/Gadgets@REL1_42] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036652

Change #1036653 had a related patch set uploaded (by SBassett; author: SBassett):

[mediawiki/extensions/Gadgets@REL1_41] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036653

Change #1036654 had a related patch set uploaded (by SBassett; author: SBassett):

[mediawiki/extensions/Gadgets@REL1_40] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036654

Change #1036655 had a related patch set uploaded (by SBassett; author: SBassett):

[mediawiki/extensions/Gadgets@REL1_39] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036655

Change #1036655 merged by jenkins-bot:

[mediawiki/extensions/Gadgets@REL1_39] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036655

Change #1036652 merged by jenkins-bot:

[mediawiki/extensions/Gadgets@REL1_42] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036652

Change #1036654 merged by jenkins-bot:

[mediawiki/extensions/Gadgets@REL1_40] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036654

Change #1036653 merged by jenkins-bot:

[mediawiki/extensions/Gadgets@REL1_41] SECURITY: Improve regular expression performance

https://gerrit.wikimedia.org/r/1036653

mmartorana renamed this task from Evil regex used to process gadget definitions to CVE-2024-40613: Evil regex used to process gadget definitions.Jul 8 2024, 5:38 PM
mmartorana changed the visibility from "Custom Policy" to "Public (No Login Required)".

Just to clarify, despite being disclosed and announced today, Gadgets is a bundled extension, so the fix was released as part of MediaWiki 1.39.8 / 1.40.4 / 1.41.2 / 1.42.1.