Page MenuHomePhabricator

CVE-2025-32069: Wikitext stored XSS on filepages due to dangerous WBMI serialization
Closed, ResolvedPublicSecurity

Assigned To
Authored By
Dylsss
Mar 3 2025, 2:58 AM
Referenced Files
F58744955: T387691_2.patch
Mar 10 2025, 8:56 AM
F58744907: T387691.patch
Mar 10 2025, 8:56 AM
F58641598: image.png
Mar 7 2025, 3:35 PM
F58641567: 0001-SECURITY-Fix-XSS-vulnerability.patch
Mar 7 2025, 3:30 PM
F58579128: image.png
Mar 3 2025, 11:20 AM
F58579124: image.png
Mar 3 2025, 11:20 AM
Tokens
"Burninate" token, awarded by Bawolff.

Description

Paste this into file page wikitext

<div id="P2" data-property="P2" data-statements='[{"mainsnak":{"snaktype":"value","property":"P2","hash":"4fca852915edcd32a484898d3ac6774b7b253272","datavalue":{"value":{"entity-type":"item","numeric-id":2,"id":"Q2"},"type":"wikibase-entityid"}},"type":"statement","id":"M2$01ce104b-441a-67a3-edc4-17016345fb70","rank":"normal"}]' data-formatvalue='{"{\"value\":{\"entity-type\":\"property\",\"numeric-id\":2,\"id\":\"P2\"},\"type\":\"wikibase-entityid\"}":{"text/html":{"en":{"":"<nowiki><img src=x onerror=alert()></nowiki>"}},"text/plain":{"en":{"":"test"}}},"{\"value\":{\"entity-type\":\"item\",\"numeric-id\":2,\"id\":\"Q2\"},\"type\":\"wikibase-entityid\"}":{"text/html":{"en":{"P2":"test"}},"text/plain":{"en":{"P2":"dogtest"}}}}' class="wbmi-entityview-statementsGroup wbmi-entityview-statementsGroup-P2 oo-ui-layout oo-ui-panelLayout oo-ui-panelLayout-framed"></div>

Alert box appears.

I guess WBMI needs to deny these tags from being used in Wikitext. Might also be some prototype pollution-esque vulnerability

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Dylsss triaged this task as High priority.Mar 3 2025, 2:59 AM

According to https://gerrit.wikimedia.org/g/mediawiki/core/+/d38a67858dfb0ddea387d49720a8162540b579c3/includes/parser/Sanitizer.php#599

These attributes should be named with data-mw.

// data-mw and data-parsoid are reserved for parsoid.
// data-mw-<name here> is reserved for extensions (or core) if
// they need to communicate some data to the client and want to be	
// sure that it isn't coming from an untrusted user.

Ouch. CCing some people from T216130.

Given that these attributes are only used to populate a cache, and AFAICT the cache should fall back to the needed API calls if the cached data is missing, I think we can get away with just renaming the attributes to data-mw-* in PHP and JS so they’re protected by the sanitizer? Pages from the parser cache (with new JS code but old attributes in the cached HTML) will result in some additional API requests, but hopefully that should be okay.

The alternative would be to make the PHP code emit the new attributes, but let the JS code look for both attributes; wait for 30 days for the parser cache to expire, and hope that nobody notices the different attributes and deduces the still-open vulnerability from that; and only then make the JS code look for the new attributes exclusively. I don’t like the sound of that.

Given that these attributes are only used to populate a cache, and AFAICT the cache should fall back to the needed API calls if the cached data is missing, I think we can get away with just renaming the attributes to data-mw-* in PHP and JS so they’re protected by the sanitizer? Pages from the parser cache (with new JS code but old attributes in the cached HTML) will result in some additional API requests, but hopefully that should be okay.

Well, except that there are also other data-* attributes that we should ideally protect the same way, but that aren’t just used for caching. If I naively rename all of them, then “fresh” page views work, but new JS processing old parser output results in a structured data interface that looks like this:

image.png (137×826 px, 8 KB)

Instead of this (after a null edit):
image.png (282×819 px, 23 KB)

So we probably need to look more closely into how the different data-* attributes are used and which of them might be safe or not, and/or rope in the people who understand / still remember those attributes…

FWIW, this is the incomplete patch I’m trying out so far (definitely needs more work):

diff --git i/resources/base/ComponentWidget.js w/resources/base/ComponentWidget.js
index 8abff9e534..d7d2826992 100644
--- i/resources/base/ComponentWidget.js
+++ w/resources/base/ComponentWidget.js
@@ -562,11 +562,11 @@ ComponentWidget.prototype.getNumberOfEqualNodes = function ( one, two ) {
 					return oneNode.id === twoNode.id;
 				}
 
-				if ( oneNode.getAttribute( 'data-key' ) || twoNode.getAttribute( 'data-key' ) ) {
-					// nodes that have a data-key attribute must match
+				if ( oneNode.getAttribute( 'data-mw-key' ) || twoNode.getAttribute( 'data-mw-key' ) ) {
+					// nodes that have a data-mw-key attribute must match
 					// (similar to id, but doesn't have to be unique
 					// on the page, as long as it's unique in the template)
-					return oneNode.getAttribute( 'data-key' ) === twoNode.getAttribute( 'data-key' );
+					return oneNode.getAttribute( 'data-mw-key' ) === twoNode.getAttribute( 'data-mw-key' );
 				}
 
 				if ( oneNode.isEqualNode( twoNode ) ) {
diff --git i/resources/filepage/StatementPanel.js w/resources/filepage/StatementPanel.js
index 07445fbf3b..cac0bdb49b 100644
--- i/resources/filepage/StatementPanel.js
+++ w/resources/filepage/StatementPanel.js
@@ -37,8 +37,8 @@ StatementPanel = function StatementPanelConstructor( config ) {
 	// Mixin constructors
 	OO.ui.mixin.PendingElement.call( this, this.config );
 
-	if ( this.$element.attr( 'data-formatvalue' ) ) {
-		this.populateFormatValueCache( JSON.parse( this.$element.attr( 'data-formatvalue' ) || '{}' ) );
+	if ( this.$element.attr( 'data-mw-formatvalue' ) ) {
+		this.populateFormatValueCache( JSON.parse( this.$element.attr( 'data-mw-formatvalue' ) || '{}' ) );
 	}
 
 	this.licenseDialogWidget = new LicenseDialogWidget();
diff --git i/resources/filepage/init.js w/resources/filepage/init.js
index 3eed8a5f65..219a144d47 100644
--- i/resources/filepage/init.js
+++ w/resources/filepage/init.js
@@ -214,8 +214,8 @@
 		// Set up existing statement panels
 		existingStatementPanels = $statements.get().map( function ( element ) {
 			var $statement = $( element ),
-				propId = $statement.data( 'property' ),
-				statementsJson = JSON.parse( $statement.attr( 'data-statements' ) || '[]' ),
+				propId = $statement.data( 'mw-property' ),
+				statementsJson = JSON.parse( $statement.attr( 'data-mw-statements' ) || '[]' ),
 				data = deserializer.deserialize( statementsJson ),
 				statementPanel = createStatementsPanel(
 					$statement,
diff --git i/resources/statements/inputs/EntityAutocompleteInputWidget.js w/resources/statements/inputs/EntityAutocompleteInputWidget.js
index a09c865b39..7b76bd59de 100644
--- i/resources/statements/inputs/EntityAutocompleteInputWidget.js
+++ w/resources/statements/inputs/EntityAutocompleteInputWidget.js
@@ -216,7 +216,7 @@ EntityAutocompleteInputWidget.prototype.onMousedown = function ( e ) {
 		// window.open. This is a response to a mousedown event so it shouldn't
 		// trigger any popup blockers in modern browsers. For browsers set to
 		// prefer new tabs over new windows, this will open in a new tab.
-		window.open( e.currentTarget.dataset.url, '_blank' );
+		window.open( e.currentTarget.dataset.mwUrl, '_blank' );
 	}
 };
 
diff --git i/src/View/MediaInfoEntityStatementsView.php w/src/View/MediaInfoEntityStatementsView.php
index 0e6df996da..2ddfe71cec 100644
--- i/src/View/MediaInfoEntityStatementsView.php
+++ w/src/View/MediaInfoEntityStatementsView.php
@@ -261,9 +261,9 @@ private function getLayoutForProperty( $propertyIdString, array $statements ) {
 		] );
 		$panel->setAttributes(
 			[
-				'data-property' => $propertyIdString,
-				'data-statements' => json_encode( $serializedStatements ),
-				'data-formatvalue' => json_encode( $formatValueCache ),
+				'data-mw-property' => $propertyIdString,
+				'data-mw-statements' => json_encode( $serializedStatements ),
+				'data-mw-formatvalue' => json_encode( $formatValueCache ),
 			]
 		);
 		return $panel;
@@ -352,7 +352,7 @@ private function innerStatementDiv( Statement $statement ): Tag {
 
 		$guid = $statement->getGuid();
 		if ( $guid !== null ) {
-			$statementDiv->setAttributes( [ 'data-guid' => $guid ] );
+			$statementDiv->setAttributes( [ 'data-mw-guid' => $guid ] );
 		}
 
 		$mainSnakDiv = new Tag( 'div' );
diff --git i/templates/statements/inputs/EntityAutocompleteInputWidgetLabel.mustache+dom w/templates/statements/inputs/EntityAutocompleteInputWidgetLabel.mustache+dom
index 7320a630f0..7795747737 100644
--- i/templates/statements/inputs/EntityAutocompleteInputWidgetLabel.mustache+dom
+++ w/templates/statements/inputs/EntityAutocompleteInputWidgetLabel.mustache+dom
@@ -1,4 +1,4 @@
-<span class="wbmi-autocomplete-option" data-url={{url}}>
+<span class="wbmi-autocomplete-option" data-mw-url={{url}}>
 	<span class="wbmi-autocomplete-option__label">
 		{{label}}
 	</span>

If we can’t figure out a “proper” solution for handling old parser cache contents / data-* attributes, maybe we could do a nasty hack that just purges + reloads the file page as soon as someone clicks the structured data tab…

(Just to spell it out: I expect we can’t just wipe the parser cache wiki-wide, that would put too much load on the servers. Years ago we tried doing that for Wikidata via $wgCacheEpoch and if memory serves the outcome was “let’s never do that again”, and Commons gets even more pageviews than Wikidata.)

(And just to clarify: I’m not actively working on this at the moment, and I don’t plan to look much into it beyond the above initial inspection. From our side the responsible party for the affected code base is WMF rather than WMDE.)

Not all data attributes need to be converted to data-mw-*; those in e.g. ComponentWidget.js are generated and used by JS only.

I also think we can safely fallback to the original (non data-mw-* prefixed) attributes in cached cases, without having to purge the caches: I narrowed down the selector to target only the nodes in specific places (i.e. those generated by PHP in the expected/predictable spot; not those inserted by users that appear elsewhere on the page)

Thanks for also looking into this, @Lucas_Werkmeister_WMDE, your observations have been very useful :)

@matthiasmullie please don’t publish security fixes on Gerrit! https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseMediaInfo/+/1125394 – see https://www.mediawiki.org/wiki/Developing_security_patches

It’s especially bad for this to happen on a Friday, because now the vulnerability is public for anyone who looks at that change (with some guesswork / investigation as to how exactly to exploit it), yet we normally wouldn’t deploy the fix until Monday…

You'd think I should have known better, yet it didn't even occur to me this time! My apologies.

I've obscured it with another patchset & abandoned the patch (and will coordinate off-Gerrit review)

If anything else needs to happen to further purge that commit from gerrit, please let me know! cc @sbassett, @Reedy (FYI the risk is XSS on Commons file pages)

Not all data attributes need to be converted to data-mw-*; those in e.g. ComponentWidget.js are generated and used by JS only.

I also think we can safely fallback to the original (non data-mw-* prefixed) attributes in cached cases, without having to purge the caches: I narrowed down the selector to target only the nodes in specific places (i.e. those generated by PHP in the expected/predictable spot; not those inserted by users that appear elsewhere on the page)

Do you mean prefixing the selector with .wbmi-captions-header ~ ? Because i don't think that would be sufficient. Users could just add both divs.

Re Lucas:

(Just to spell it out: I expect we can’t just wipe the parser cache wiki-wide, that would put too much load on the servers. Years ago we tried doing that for Wikidata via $wgCacheEpoch and if memory serves the outcome was “let’s never do that again”, and Commons gets even more pageviews than Wikidata.)

In other cases the approach has been deploy patch with fallback code secretly for 30 days, until the cache naturally expires.

Not all data attributes need to be converted to data-mw-*; those in e.g. ComponentWidget.js are generated and used by JS only.

I also think we can safely fallback to the original (non data-mw-* prefixed) attributes in cached cases, without having to purge the caches: I narrowed down the selector to target only the nodes in specific places (i.e. those generated by PHP in the expected/predictable spot; not those inserted by users that appear elsewhere on the page)

Do you mean prefixing the selector with .wbmi-captions-header ~ ? Because i don't think that would be sufficient. Users could just add both divs.

Yes; those should be .wbmi-captions-header:eq( 0 ) ~ .wbmi-entityview-captionsPanel & .wbmi-structured-data-header:eq( 0 ) ~ .wbmi-entityview-statementsGroup (note the addition of :eq( 0 )).
Since the parsed wikitext contents are not added until after those elements have been created, that should effectively exclude those from possibly being called, no matter how they construct it - right?

I already mentioned this in the #mediawiki_security channel, but just to have it here as well: while the security fix is being worked on, I expect we could try to prevent it from being exploited by installing an AbuseFilter to block data-formatvalue= attributes in Wikitext. I don’t know whether this would be a good idea or not; I guess it depends on whether we think anyone else will have noticed the Gerrit change before it was obscured, and how we weigh that risk against the increased visibility (among sysops) of an AbuseFilter block. I’m leaving that decision to others – just mentioning it as an option.

Edit: The AbuseFilter would only be needed on Commons – WikibaseMediaInfo isn’t used on other wikis. (I also expect it’s rarely to never used on third-party wikis, FWIW.)

In other cases the approach has been deploy patch with fallback code secretly for 30 days, until the cache naturally expires.

Yes, but my worry is that the wiki is still vulnerable during those 30 days, while people looking at the on-wiki HTML might notice that the attributes have changed, and deduce the vulnerability and start to exploit it during that time.

In other cases the approach has been deploy patch with fallback code secretly for 30 days, until the cache naturally expires.

That would be necessary if there was a parser/sanitizer vulnerability that lets you add a self-contained exploit in wikitext, so just rendering the parsed wikitext in the browser results in an XSS.

But in this case, the actual vulnerability is in a JS asset, so just updating the asset and waiting a few mintues for ResourceLoader caches to expire should be enough, right? If we really want to keep falling back to the old attribute, could just strip HTML tags from it.

I already mentioned this in the #mediawiki_security channel, but just to have it here as well: while the security fix is being worked on, I expect we could try to prevent it from being exploited by installing an AbuseFilter to block data-formatvalue= attributes in Wikitext. I don’t know whether this would be a good idea or not; I guess it depends on whether we think anyone else will have noticed the Gerrit change before it was obscured, and how we weigh that risk against the increased visibility (among sysops) of an AbuseFilter block. I’m leaving that decision to others – just mentioning it as an option.

Edit: The AbuseFilter would only be needed on Commons – WikibaseMediaInfo isn’t used on other wikis. (I also expect it’s rarely to never used on third-party wikis, FWIW.)

It sounds like we want to go ahead with this (per further discussion in IRC), so to flesh this out a bit: I think we want to block data-property=, data-statements= and data-formatvalue= from added_lines. (Possibly case-insensitive? Not sure, to be honest.) I expect no legitimate uses for all three of those, and search shows no results for data-statements, data-property or data-formatvalue (except for false positives because the hyphen also matches spaces in the search).

But in this case, the actual vulnerability is in a JS asset, so just updating the asset and waiting a few mintues for ResourceLoader caches to expire should be enough, right?

I think that’s more or less what I suggested for data-formatvalue in T387691#10595540? But I haven’t checked yet whether it also applies to the other attributes.

If we really want to keep falling back to the old attribute, could just strip HTML tags from it.

That would be pointless IMHO – the real values (added by PHP code rather than wikitext) are supposed to contain (some) HTML. (Not onerror=alert() HTML, but I don’t know if we have a general-purpose HTML sanitizer available in JS.)

I don't think AbuseFilter is useful for this kind of thing. What about da{{empty}}ta-property=? Or {{this template resolves to "data"}}-property?

the real values (added by PHP code rather than wikitext) are supposed to contain (some) HTML.

So they will be slightly incorrect, sometimes. No big deal, compared to an XSS vulnerability. These are media statement labels, not exactly a critical user pathway.

Yes; those should be .wbmi-captions-header:eq( 0 ) ~ .wbmi-entityview-captionsPanel & .wbmi-structured-data-header:eq( 0 ) ~ .wbmi-entityview-statementsGroup (note the addition of :eq( 0 )).

That assumes nothing containing wikitext is included in the document above the mediainfo tab, which seems risky. What about indicators, for example?

@matthiasmullie @Lucas_Werkmeister_WMDE et al - Do you think you could reinstate https://gerrit.wikimedia.org/r/1125394 and get it reviewed and merged soon? If so, I can pick it to 1.44.0-wmf.19 and get it deployed in a bit. Despite the Friday deploy, that's probably the best strategy given the exposure with the gerrit change set (anyone can still see PS1) if we're fairly confident this would be a stable patch and the RL cache expirations wouldn't be that big of a deal.

I don't think AbuseFilter is useful for this kind of thing. What about da{{empty}}ta-property=? Or {{this template resolves to "data"}}-property?

Fair :( let’s shelve the AbuseFilter idea then.

the real values (added by PHP code rather than wikitext) are supposed to contain (some) HTML.

So they will be slightly incorrect, sometimes. No big deal, compared to an XSS vulnerability. These are media statement labels, not exactly a critical user pathway.

But we might as well not use the incorrect values at all and just get the correct ones from the API. WikibaseMediaInfo uses the data-formatvalue attributes as an optimization, to avoid having to query the API for the HTML in question; but I’m guessing the API will be able to shoulder the additional load from not being able to use the attributes for a little while.

@sbassett I’ll try to puzzle together an alternative version; I share @Bawolff’s and @Tgr’s concerns about the changed selector, so I don’t think we should go ahead with PS1 as-is.

Updated version of PS1 of that change:

I removed the fallback from data-mw-formatvalue to data-formatvalue – this is the really dangerous attribute, and the one where it’s safest to just leave it out (as the code can just fall back to the API instead). I’m hoping (but am not sure) that falling back to data-statements is safe (though potentially confusing); I also kept the selector change even if we’re not sure if it’s enough or not.

I’ll see if I can try it out later, but please review :)

I also rebased the patch to be based on wmf.19 – there have been some conflicting changes in master (let → const if I’m not mistaken), so it’ll need to be rebased again for wmf.20 later. But for the initial deployment today, the version based on wmf.19 should be more useful.

Okay, I think that patch is working well for me locally – the alert vanishes, and when the parser output is cached (i.e. with data-formatvalue rather than data-mw-formatvalue) I can see the extra wbformatvalue API requests in the network panel:

image.png (51×836 px, 16 KB)

Updated version of PS1 of that change:

CR+1 patch seems sane to me and I understand the data- to data-mw- swaps. Less certain on the additional .wbmi- changes. If someone else could maybe have a quick glance at that, then I think I can get this security-deployed soon.

Deployed the patch from T387691#10613624. Thanks @Lucas_Werkmeister_WMDE for getting an updated patch together and production-testing post-deployment. So far things look stable.

sbassett changed the task status from Open to In Progress.Mar 7 2025, 4:24 PM

Yeah, so far it’s behaving as I expected in production. Pageviews for stale parser output are making additional wbformatvalue API requests, and we can see that in Grafana, but I expect it won’t be bad enough to cause stability issues (wbformatvalue isn’t an especially expensive API request, and Action API Summary suggests it’s still only a small portion of API requests). It should settle down again over the next 30 days.

Tested and o/s'd XSS payload from the task description on commons (rev1, rev2)

Pageviews for stale parser output are making additional wbformatvalue API requests, and we can see that in Grafana, but I expect it won’t be bad enough to cause stability issues (wbformatvalue isn’t an especially expensive API request, and Action API Summary suggests it’s still only a small portion of API requests). It should settle down again over the next 30 days.

To bolster that claim about the cheapness of the API request a bit: over the past half hour or so, wbformatvalue now makes up 3% of API requests (up from 0% an hour before; in absolute terms, 4 K → 42 K, i.e. a factor of ca. 10×), but the load in terms of wall-clock time is only 1% (up from 0%). I’m relatively confident that this won’t explode over the weekend / coming weeks.

SecurityPatchBot raised the priority of this task from High to Unbreak Now!.Mar 8 2025, 12:51 AM

Patch 01-T387691.patch is currently failing to apply for the most recent code in the mainline branch of extensions/WikibaseMediaInfo. This is blocking MediaWiki release 1.44.0-wmf.20(T386215)

If the patch needs to be rebased

To unblock the release, a new version of the patch can be placed at the right location in the deployment server with the following Scap command:

REVISED_PATCH=<path_to_revised_patch>
scap update-patch --message-body 'Rebase to solve merge conflicts with mainline code' /srv/patches/1.44.0-wmf.20/extensions/WikibaseMediaInfo/01-T387691.patch "$REVISED_PATCH"

If the patch has been made public

To unblock the release, the patch can be removed for the right version from the deployment server with the following Scap command:

scap remove-patch --message-body 'Remove patch already made public' /srv/patches/1.44.0-wmf.20/extensions/WikibaseMediaInfo/01-T387691.patch

(Note that if patches for the version don't exist yet, they will be created and the patch you specified removed)

sbassett lowered the priority of this task from Unbreak Now! to High.EditedMar 8 2025, 1:10 AM

The above @SecurityPatchBot notice describes a known issue (@Lucas_Werkmeister_WMDE noted that the current 1.44.0-wmf.19 patch would need a rebase by next week). So we can deal with that on Monday or whenever as the security patch is still applied and working in Wikimedia production.

SecurityPatchBot raised the priority of this task from High to Unbreak Now!.Mar 9 2025, 12:50 AM

Patch 01-T387691.patch is currently failing to apply for the most recent code in the mainline branch of extensions/WikibaseMediaInfo. This is blocking MediaWiki release 1.44.0-wmf.20(T386215)

If the patch needs to be rebased

To unblock the release, a new version of the patch can be placed at the right location in the deployment server with the following Scap command:

REVISED_PATCH=<path_to_revised_patch>
scap update-patch --message-body 'Rebase to solve merge conflicts with mainline code' /srv/patches/1.44.0-wmf.20/extensions/WikibaseMediaInfo/01-T387691.patch "$REVISED_PATCH"

If the patch has been made public

To unblock the release, the patch can be removed for the right version from the deployment server with the following Scap command:

scap remove-patch --message-body 'Remove patch already made public' /srv/patches/1.44.0-wmf.20/extensions/WikibaseMediaInfo/01-T387691.patch

(Note that if patches for the version don't exist yet, they will be created and the patch you specified removed)

Dylsss lowered the priority of this task from Unbreak Now! to High.Mar 9 2025, 1:04 AM
SecurityPatchBot raised the priority of this task from High to Unbreak Now!.Mar 10 2025, 12:50 AM

Patch 01-T387691.patch is currently failing to apply for the most recent code in the mainline branch of extensions/WikibaseMediaInfo. This is blocking MediaWiki release 1.44.0-wmf.20(T386215)

If the patch needs to be rebased

To unblock the release, a new version of the patch can be placed at the right location in the deployment server with the following Scap command:

REVISED_PATCH=<path_to_revised_patch>
scap update-patch --message-body 'Rebase to solve merge conflicts with mainline code' /srv/patches/1.44.0-wmf.20/extensions/WikibaseMediaInfo/01-T387691.patch "$REVISED_PATCH"

If the patch has been made public

To unblock the release, the patch can be removed for the right version from the deployment server with the following Scap command:

scap remove-patch --message-body 'Remove patch already made public' /srv/patches/1.44.0-wmf.20/extensions/WikibaseMediaInfo/01-T387691.patch

(Note that if patches for the version don't exist yet, they will be created and the patch you specified removed)

sbassett lowered the priority of this task from Unbreak Now! to High.Mar 10 2025, 1:08 AM

Attached is a rebased copy of the currently-deployed patch:


It seems that the decision was made to forego falling back on existing (potentially unsafe) data-formatvalue attributes at the expense of more API requests, and that those additional requests do not appear to cause any issues.
If, however, we do feel like we would prefer to cut back down on those additional requests, I think we can add a simple check to confirm that the DOM does not contain additional elements than those expected, and simply abort otherwise. Worst case would then be that the structured data (captions/statements) on such pages become read-only (the JS doesn't kick in) for 30 days (or until the offending wikitext is removed), until parser caches have been refreshed and we can remove the fallback without impacting API requests.

Something like this:

+		// eslint-disable-next-line no-jquery/no-global-selector
+		if ( $( '.wbmi-structured-data-header' ).length > 1 || $( '.wbmi-captions-header' ).length > 1 ) {
+			// abort initialization if we encounter more than of the expected DOM
+			// structured; in which case wikitext has likely been crafted maliciously,
+			// and it may be unsafe to proceed
+			return;
+		}

and

 	if ( this.$element.attr( 'data-mw-formatvalue' ) ) {
 		this.populateFormatValueCache( JSON.parse( this.$element.attr( 'data-mw-formatvalue' ) || '{}' ) );
+	} else if ( this.$element.attr( 'data-formatvalue' ) ) {
+		// Fallback for when this attribute was named differently
+		// @see https://phabricator.wikimedia.org/T387691
+		this.populateFormatValueCache( JSON.parse( this.$element.attr( 'data-formatvalue' ) || '{}' ) );
 	}


Once again, my apologies for not having had the right reflex of not posting the fix publicly! And thanks a lot to all involved in getting a quick fix deployed while I was out!

Attached is a rebased copy of the currently-deployed patch:

Thanks, I'll get this uploaded to a deployment host today.

If, however, we do feel like we would prefer to cut back down on those additional requests, I think we can add a simple check to confirm that the DOM does not contain additional elements than those expected, and simply abort otherwise. Worst case would then be that the structured data (captions/statements) on such pages become read-only (the JS doesn't kick in) for 30 days (or until the offending wikitext is removed), until parser caches have been refreshed and we can remove the fallback without impacting API requests.

This could likely be done in a follow-up gerrit change set once we make the patch public. Generally, if we have a "good enough" solution that's stable in Wikimedia production, we'll just go with that until the issue is officially disclosed via a security release, etc.

The above @SecurityPatchBot notice describes a known issue (@Lucas_Werkmeister_WMDE noted that the current 1.44.0-wmf.19 patch would need a rebase by next week). So we can deal with that on Monday or whenever as the security patch is still applied and working in Wikimedia production.

Yeah, I thought we’d have time until Monday to put the rebase together, I didn’t think the bot would already yell three times during that time :D

Attached is a rebased copy of the currently-deployed patch:

Thanks!

It seems that the decision was made to forego falling back on existing (potentially unsafe) data-formatvalue attributes at the expense of more API requests, and that those additional requests do not appear to cause any issues.
If, however, we do feel like we would prefer to cut back down on those additional requests, I think we can add a simple check to confirm that the DOM does not contain additional elements than those expected, and simply abort otherwise. Worst case would then be that the structured data (captions/statements) on such pages become read-only (the JS doesn't kick in) for 30 days (or until the offending wikitext is removed), until parser caches have been refreshed and we can remove the fallback without impacting API requests.

I don’t know, that still feels risky to me… if we want to reduce the wbformatvalue requests, I would suggest another option, based on this:

If we can’t figure out a “proper” solution for handling old parser cache contents / data-* attributes, maybe we could do a nasty hack that just purges + reloads the file page as soon as someone clicks the structured data tab…

We could just purge the page in the background when we see the old parser cache (i.e. data-mw-formatvalue absent but data-formatvalue present) – the current page view would still fall back to calling the API, but the background purge would ensure that the next pageview will have the data-mw-formatvalue attribute available. On the other hand, that means more page purges, which aren’t free either.

AFAICT it still looks like the additional requests aren’t causing any issues, so I would just stick with the current approach and not change it at all. As far as I’m concerned, this task can now wait until T382326: Write and send supplementary release announcement for extensions and skins with security patches (1.39.12/1.42.6/1.43.1) (⇒ publish task and patches) and/or 7 April (parser cache should be expired ⇒ remove fallback code), whichever happens earlier.

Removing parent task again as the rebased change has been uploaded, so this shouldn’t block the train anymore.

@Lucas_Werkmeister_WMDE You mentioned T382326: Write and send supplementary release announcement for extensions and skins with security patches (1.39.12/1.42.6/1.43.1), but that's an Access Denied: Restricted Task for me so I have no insight into what it is, other than your comment describing it as "publish task and patches". If there's work I or anyone from the team should be involved in, please LMK! :)

It’s just the task to track the upcoming security release (1.39.12, 1.42.6, 1.43.1; expected to happen mid-April if I’m not mistaken), no action needed :)

FWIW while Lucas was writing a comment, I added you @matthiasmullie, and Mark as subscribers, which I believe gives you access to that task. I hope I didn't commit a crime against security policy

I hope I didn't commit a crime against security policy

Nope, that's fine!

Change #1133560 had a related patch set uploaded (by Mstyles; author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@master] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133560

Change #1133560 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@master] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133560

Change #1133895 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@REL1_43] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133895

Change #1133905 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@REL1_42] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133905

Change #1133906 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Matthias Mullie):

[mediawiki/extensions/WikibaseMediaInfo@REL1_39] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133906

Change #1133906 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@REL1_39] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133906

Change #1133905 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@REL1_42] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133905

Change #1133895 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@REL1_43] SECURITY: Fix XSS vulnerability

https://gerrit.wikimedia.org/r/1133895

Mstyles renamed this task from Wikitext stored XSS on filepages due to dangerous WBMI serialization to CVE-2025-32069: Wikitext stored XSS on filepages due to dangerous WBMI serialization.Fri, Apr 11, 5:07 PM
Mstyles closed this task as Resolved.
Mstyles changed the visibility from "Custom Policy" to "Public (No Login Required)".
Mstyles changed the edit policy from "Custom Policy" to "All Users".