Aggregategroups Action API module allows deleting translatable page metadata for any group without trace (CVE-2021-36129)
Closed, Resolved | Public | Security

Description

While investigating T282905: Translate syntax version update and translation-aware transclusion lost, I audited all code manipulating the translate_metadata table. There is a small utility class, TranslateMetadata. All writes to this table go through that class (there are a couple of places where the table is read directly): https://codesearch.wmcloud.org/search/?q=translate_metadata&i=nope&files=&excludeFiles=&repos=

There is TranslateMetadata::deleteGroup, which is only called from ApiAggregateGroups. That API module does not validate the aggregategroup parameter when do=remove. The only restriction is that this module requires the translate-manage right. We have a little over 200 people with that right on MetaWiki: https://meta.wikimedia.org/w/index.php?title=Special:ListUsers&offset=&limit=500&group=translationadmin

I was able to confirm that deleting any group's metadata is possible using this module. The code has an innocent-looking // @todo Logging, which is really unfortunate in this case, because I can't say for sure whether this is or is not the cause of the parent issue.
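To illustrate the kind of check and logging that appears to be missing, here is a minimal sketch in Python. It is not the extension's code (Translate is written in PHP). The in-memory `GROUP_TYPES` and `METADATA` dictionaries, the `remove_aggregate_group` function, and the group ID `agg-docs` are all hypothetical stand-ins for the message group registry and the translate_metadata table.

```python
import logging

logger = logging.getLogger("aggregategroups")

# Hypothetical in-memory stand-ins for the message group registry and the
# translate_metadata table. The real extension uses PHP classes and a database.
GROUP_TYPES = {
    "agg-docs": "aggregate",
    "page-Wikimedia CEE Spring 2021": "translatable-page",
}
METADATA = {
    ("agg-docs", "name"): "Docs",
    ("page-Wikimedia CEE Spring 2021", "version"): "2",
}


def remove_aggregate_group(group_id: str, user: str) -> bool:
    """Delete metadata only if group_id names an aggregate group; log every attempt."""
    if GROUP_TYPES.get(group_id) != "aggregate":
        # Reject unknown groups and non-aggregate groups, and leave a trace.
        logger.warning("rejected removal of %r requested by %s", group_id, user)
        return False
    for key in [k for k in METADATA if k[0] == group_id]:
        del METADATA[key]
    logger.info("removed aggregate group %r on behalf of %s", group_id, user)
    return True
```

With this shape, a request naming a translatable page group is refused and logged, while a request naming a real aggregate group is carried out and logged, so either outcome leaves something to audit later.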

It seems the most likely cause, however, by ruling out other possibilities:

(1) The value was never saved in the first place. This is implausible because we have the log entries, other metadata in other tables and nothing in Logstash. Why would writes to this table silently fail, but not the other writes? In addition, archive.org seems to confirm that the metadata was saved and was effective at some point.
(2) The value was deleted by calling TranslateMetadata::set with false as the value. When marking the page for translation, type declarations would prevent false from appearing for all the different tmd_key values.
(3) Page is moved or deleted. This would certainly leave log entries and other visible traces.


I think it would be good to silently patch do=remove for action=aggregategroups so that it does nothing except log the attempt, which would identify the attacker (if any).

It would be really helpful if we could find DELETE statements for translate_metadata from binlogs/query logs (if we have any). This would help to find the extent of this issue and hopefully confirm the cause too.

Event Timeline

Marostegui added a subscriber: LSobanski.

If you have some concrete dates and a wiki, we can check binlogs and backups to see the status of the data for that given wiki.

For the Wikimedia CEE Spring 2021 page, the deletion of the rows (where translate_metadata.tmd_group='page-Wikimedia CEE Spring 2021') would have had to happen between 2021-05-01T23:30:30Z and today on metawiki. Is that too long a period?

We could probably scan the backups from that day till the last one and see on which one the page exists and on which one the page is no longer there. Once we have that narrowed, we could go ahead and check the binlogs around those days to find out exactly when it was deleted.

Otherwise it will be hard to scan all those generated binlogs.
@jcrespo could you help out here?

Hey @Nikerabbit, could you please confirm the request? I think I understand the overall issue, but I want to be 100% clear on what is being asked.

  • There were (presumably, that is the question) some DELETE statements run against the s7 metadata table metawiki.translate_metadata and you want to know any possible details about them. In particular (though there could be more), one was presumably run between 2021-05-01T23:30:30Z and 2021-05-16T00:00:00Z. If I run select count(*) FROM translate_metadata where translate_metadata.tmd_group='page-Wikimedia CEE Spring 2021' I get 0 results, but at some point between these 2 dates it should return some results. You want to know when that happened, if it did, and the context (e.g. the API call that ran, or any other information about related queries in the transaction).

@jcrespo Yep, that sounds correct. If that query returns results at some time between those dates, we can rule out my possible cause number 1. If you find details of a DELETE query, I might be able to rule out other possible cases. This table should have very few DELETE queries overall; it's mostly updated with REPLACE (insert if missing) queries.

live results:
mysql.py -h db1116:3317 metawiki -e "SELECT * FROM translate_metadata ORDER BY tmd_group, tmd_key" | grep 'page-Wikimedia CEE Spring 2021' ~> 0 results
backup 11 May:
zgrep 'page-Wikimedia CEE Spring 2021' dump.s7.2021-05-11--00-00-02/metawiki.translate_metadata.sql.gz ~> 0 results
backup 4 May:
zgrep 'page-Wikimedia CEE Spring 2021' dump.s7.2021-05-04--02-24-36/metawiki.translate_metadata.sql.gz ~> 3 results
("page-Wikimedia CEE Spring 2021","maxid","31"),
("page-Wikimedia CEE Spring 2021","transclusion","1"),
("page-Wikimedia CEE Spring 2021","version","2"),

So a delete, replace or update happened between 2021-05-04 02:24:36 and 2021-05-11 00:00:02 (approximately; the backup dates are not accurate to the second). In binlog coordinates that is between db1136-bin.002276:170694204 and db1136-bin.002293:1027651177.
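The backup comparison above (find the last dump that still contains the group and the first later one that does not) can be sketched as follows. This is only an illustration of the approach, not the tooling that was actually used. The function names are hypothetical, and the sketch assumes gzipped SQL dumps passed in oldest-to-newest order.

```python
import gzip
from pathlib import Path
from typing import List, Optional, Tuple


def count_matches(dump_path: Path, needle: str) -> int:
    """Count lines containing needle in a gzipped dump (similar to zgrep -c)."""
    with gzip.open(dump_path, "rt", encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if needle in line)


def find_disappearance(dumps: List[Path], needle: str) -> Optional[Tuple[Path, Path]]:
    """Return (last dump containing needle, first later dump without it).

    dumps is assumed to be sorted from oldest to newest. Returns None if the
    needle never appears, or never disappears after appearing.
    """
    last_seen = None
    for dump in dumps:
        if count_matches(dump, needle) > 0:
            last_seen = dump
        elif last_seen is not None:
            return last_seen, dump
    return None
```

The two paths returned bound the window in which the rows vanished, which is the window to search in the binlogs.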

The following events were obtained from grepping the binlogs:

mysqlbinlog db1136-bin.002286 | grep -C 10 'page-Wikimedia CEE Spring 2021'
COMMIT/*!*/;
# at 893088481
#210508  6:46:37 server id 171978861  end_log_pos 893088519 	GTID 171978861-171978861-202827326 trans
/*!100001 SET @@session.gtid_seq_no=202827326*//*!*/;
BEGIN
/*!*/;
# at 893088519
#210508  6:46:37 server id 171978861  end_log_pos 893088747 	Query	thread_id=938197070	exec_time=0	error_code=0
use `metawiki`/*!*/;
SET TIMESTAMP=1620456397/*!*/;
REPLACE /* ApiGroupReview::changeState  */ INTO `translate_groupreviews` (tgr_group,tgr_lang,tgr_state) VALUES ('page-Wikimedia CEE Spring 2021','tr','progress')
/*!*/;
# at 893088747
#210508  6:46:37 server id 171978861  end_log_pos 893088775 	Intvar
SET INSERT_ID=41213765/*!*/;
# at 893088775
#210508  6:46:37 server id 171978861  end_log_pos 893089272 	Query	thread_id=938197070	exec_time=0	error_code=0
SET TIMESTAMP=1620456397/*!*/;
INSERT /* ManualLogEntry::insert  */ INTO `logging` (log_type,log_action,log_timestamp,log_actor,log_namespace,log_title,log_page,log_params,log_comment_id) VALUES ('translationreview','group','20210508064637',24,-1,'Translate/page-Wikimedia_CEE_Spring_2021',0,'a:4:{s:11:\"4::language\";s:2:\"tr\";s:14:\"5::group-label\";s:25:\"Wikimedia CEE Spring 2021\";s:12:\"6::old-state\";b:0;s:12:\"7::new-state\";s:8:\"progress\";}','2')
/*!*/;
# at 893089272
--
use `metawiki`/*!*/;
SET TIMESTAMP=1620456397/*!*/;
INSERT /* RecentChange::save  */ INTO `recentchanges` (rc_timestamp,rc_namespace,rc_title,rc_type,rc_source,rc_minor,rc_this_oldid,rc_last_oldid,rc_bot,rc_ip,rc_patrolled,rc_new,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,rc_comment_id,rc_actor) VALUES ('20210508064637',-1,'Translate/page-Wikimedia_CEE_Spring_2021',3,'mw.log',0,0,0,1,'10.64.16.68',2,0,NULL,NULL,0,41213765,'translationreview','group','a:4:{s:11:\"4::language\";s:2:\"tr\";s:14:\"5::group-label\";s:25:\"Wikimedia CEE Spring 2021\";s:12:\"6::old-state\";b:0;s:12:\"7::new-state\";s:8:\"progress\";}','2',24)
/*!*/;
# at 893090767
#210508  6:46:37 server id 171978861  end_log_pos 893090795 	Intvar
SET INSERT_ID=46932524/*!*/;
# at 893090795
#210508  6:46:37 server id 171978861  end_log_pos 893091471 	Query	thread_id=938197070	exec_time=0	error_code=0
SET TIMESTAMP=1620456397/*!*/;
INSERT /* MediaWiki\CheckUser\Hooks::updateCheckUserData  */ INTO `cu_changes` (cuc_namespace,cuc_title,cuc_minor,cuc_user,cuc_user_text,cuc_actiontext,cuc_comment,cuc_this_oldid,cuc_last_oldid,cuc_type,cuc_timestamp,cuc_ip,cuc_ip_hex,cuc_xff,cuc_xff_hex,cuc_agent,cuc_page_id) VALUES (-1,'Translate/page-Wikimedia_CEE_Spring_2021',0,1058846,'FuzzyBot','FuzzyBot changed the state of Turkish translations of [[Special:Translate/page-Wikimedia CEE Spring 2021|Wikimedia CEE Spring 2021]] from (unset) to In progress','',0,0,3,'20210508064637','10.64.16.68','0A401044',0,NULL,'ChangePropagation-JobQueue/WMF',0)
/*!*/;
# at 893091471
#210508  6:46:37 server id 171978861  end_log_pos 893091498 	Xid = 12009550938
COMMIT/*!*/;
# at 893091498
#210508  6:46:37 server id 171978861  end_log_pos 893091536 	GTID 171978861-171978861-202827330 trans
/*!100001 SET @@session.gtid_seq_no=202827330*//*!*/;
BEGIN
/*!*/;
# at 893091536
--
/*!*/;
# at 893110098
#210508  6:46:37 server id 171978861  end_log_pos 893110136 	GTID 171978861-171978861-202827352 trans
/*!100001 SET @@session.gtid_seq_no=202827352*//*!*/;
BEGIN
/*!*/;
# at 893110136
#210508  6:46:37 server id 171978861  end_log_pos 893110403 	Query	thread_id=938197022	exec_time=0	error_code=0
use `metawiki`/*!*/;
SET TIMESTAMP=1620456397/*!*/;
REPLACE /* MessageGroupStats::queueUpdates  */ INTO `translate_groupstats` (tgs_group,tgs_lang,tgs_total,tgs_translated,tgs_fuzzy,tgs_proofread) VALUES ('page-Wikimedia CEE Spring 2021','tr',32,1,0,0)
/*!*/;
# at 893110403
#210508  6:46:37 server id 171978861  end_log_pos 893110430 	Xid = 12009551293
COMMIT/*!*/;
# at 893110430
#210508  6:46:37 server id 171978861  end_log_pos 893110468 	GTID 171978861-171978861-202827353 trans
/*!100001 SET @@session.gtid_seq_no=202827353*//*!*/;
BEGIN
/*!*/;
# at 893110468
--
COMMIT/*!*/;
# at 898808003
#210508  6:50:01 server id 171978861  end_log_pos 898808041 	GTID 171978861-171978861-202837895 trans
/*!100001 SET @@session.gtid_seq_no=202837895*//*!*/;
BEGIN
/*!*/;
# at 898808041
#210508  6:50:01 server id 171978861  end_log_pos 898808308 	Query	thread_id=938216484	exec_time=0	error_code=0
use `metawiki`/*!*/;
SET TIMESTAMP=1620456601/*!*/;
REPLACE /* MessageGroupStats::queueUpdates  */ INTO `translate_groupstats` (tgs_group,tgs_lang,tgs_total,tgs_translated,tgs_fuzzy,tgs_proofread) VALUES ('page-Wikimedia CEE Spring 2021','tr',32,1,0,0)
/*!*/;
# at 898808308
#210508  6:50:01 server id 171978861  end_log_pos 898808335 	Xid = 12009806583
COMMIT/*!*/;
# at 898808335
#210508  6:50:01 server id 171978861  end_log_pos 898808373 	GTID 171978861-171978861-202837896 trans
/*!100001 SET @@session.gtid_seq_no=202837896*//*!*/;
BEGIN
/*!*/;
# at 898808373
--
COMMIT/*!*/;
# at 1013600551
#210508  7:49:13 server id 171978861  end_log_pos 1013600589 	GTID 171978861-171978861-203041619 trans
/*!100001 SET @@session.gtid_seq_no=203041619*//*!*/;
BEGIN
/*!*/;
# at 1013600589
#210508  7:49:11 server id 171978861  end_log_pos 1013600791 	Query	thread_id=938542235	exec_time=0	error_code=0
use `metawiki`/*!*/;
SET TIMESTAMP=1620460151/*!*/;
DELETE /* TranslateMetadata::set  */ FROM `translate_metadata` WHERE tmd_group = 'page-Wikimedia CEE Spring 2021' AND tmd_key = 'maxid'
/*!*/;
# at 1013600791
#210508  7:49:11 server id 171978861  end_log_pos 1013601001 	Query	thread_id=938542235	exec_time=0	error_code=0
SET TIMESTAMP=1620460151/*!*/;
DELETE /* TranslateMetadata::set  */ FROM `translate_metadata` WHERE tmd_group = 'page-Wikimedia CEE Spring 2021' AND tmd_key = 'priorityforce'
/*!*/;
# at 1013601001
#210508  7:49:11 server id 171978861  end_log_pos 1013601211 	Query	thread_id=938542235	exec_time=0	error_code=0
SET TIMESTAMP=1620460151/*!*/;
DELETE /* TranslateMetadata::set  */ FROM `translate_metadata` WHERE tmd_group = 'page-Wikimedia CEE Spring 2021' AND tmd_key = 'prioritylangs'
/*!*/;
# at 1013601211
#210508  7:49:11 server id 171978861  end_log_pos 1013601422 	Query	thread_id=938542235	exec_time=0	error_code=0
SET TIMESTAMP=1620460151/*!*/;
DELETE /* TranslateMetadata::set  */ FROM `translate_metadata` WHERE tmd_group = 'page-Wikimedia CEE Spring 2021' AND tmd_key = 'priorityreason'
/*!*/;
# at 1013601422
#210508  7:49:11 server id 171978861  end_log_pos 1013601631 	Query	thread_id=938542235	exec_time=0	error_code=0
SET TIMESTAMP=1620460151/*!*/;
DELETE /* TranslateMetadata::set  */ FROM `translate_metadata` WHERE tmd_group = 'page-Wikimedia CEE Spring 2021' AND tmd_key = 'transclusion'
/*!*/;
# at 1013601631
#210508  7:49:11 server id 171978861  end_log_pos 1013601835 	Query	thread_id=938542235	exec_time=0	error_code=0
SET TIMESTAMP=1620460151/*!*/;
DELETE /* TranslateMetadata::set  */ FROM `translate_metadata` WHERE tmd_group = 'page-Wikimedia CEE Spring 2021' AND tmd_key = 'version'
/*!*/;
# at 1013601835
#210508  7:49:13 server id 171978861  end_log_pos 1013601862 	Xid = 12014340024
COMMIT/*!*/;
# at 1013601862
#210508  7:49:13 server id 171978861  end_log_pos 1013601900 	GTID 171978861-171978861-203041620 trans
/*!100001 SET @@session.gtid_seq_no=203041620*//*!*/;
BEGIN
/*!*/;
# at 1013601900

This points to the deletes happening on May 8, 2021 at 7:49:11 AM (the commit took effect 2 seconds later), issued from the function TranslateMetadata::set.
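The same reading of the binlog text can be automated. Below is a minimal sketch that walks mysqlbinlog text output, remembers the most recent SET TIMESTAMP value, and reports each DELETE against a given table together with the caller named in the leading SQL comment. The function name `find_deletes` and the regular expressions are assumptions written to match the excerpt above. They are not a general parser for the mysqlbinlog format.

```python
import re
from datetime import datetime, timezone

TIMESTAMP_RE = re.compile(r"^SET TIMESTAMP=(\d+)")
DELETE_RE = re.compile(
    r"^DELETE /\* (?P<caller>\S+)\s*\*/ FROM `(?P<table>\w+)` WHERE (?P<where>.*)$"
)


def find_deletes(binlog_text: str, table: str):
    """Yield (UTC datetime, caller, WHERE clause) for DELETEs against table."""
    current = None
    for line in binlog_text.splitlines():
        m = TIMESTAMP_RE.match(line)
        if m:
            # SET TIMESTAMP carries Unix epoch seconds for the next statement.
            current = datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)
            continue
        m = DELETE_RE.match(line)
        if m and m.group("table") == table:
            yield current, m.group("caller"), m.group("where")
```

Run against the excerpt above, the epoch value 1620460151 converts to 2021-05-08T07:49:11+00:00, and every matching statement carries the caller TranslateMetadata::set rather than TranslateMetadata::deleteGroup.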

Please report asap if you will require some kind of data recovery: the more time passes, the harder it gets. Recovering data from last week: very easy; from last month: easy; from the last 3 months: possible; from over 3 months ago: probably impossible.

@jcrespo Thanks a lot for this! It seems we can rule out exploitation of this API as the cause. Unfortunately it points to a bug in our code which we have not yet identified. I imagine it will be hard to selectively restore from backups, and given this metadata is not super essential, it is probably not worth it, but I'll let you know asap if I change my mind.

Patch to fix the exploit in the AggregateMessageGroups API that allowed users with the translate-manage permission to delete non-aggregate message groups:

I'd like guidance from the security team on how to proceed with disclosing and fixing the security issue. When and how should the fix be applied publicly?

It's low impact, since it requires translate-manage right to be exploited. It is not currently being exploited to my knowledge.

@Nikerabbit - since ext:Translate is production-deployed but not bundled for releases, we can schedule the above patch to be deployed as a security patch, assuming it addresses the issue. We like to deploy these types of patches during the security window on Mondays at 21:00–23:00 UTC, but release engineering is fine with us deploying them at other times if they do not conflict with the train, etc. Once deployed, we can track the issue for the next supplemental release announcement (T279733) and then make this task public and push any relevant backports through gerrit. Let me know if the above patch looks good and we can schedule a date/time for deployment.

Patch looks good to me. There are a few code style issues which can be fixed later.

sbassett moved this task from Watching to Security Patch To Deploy on the Security-Team board.

I'll go ahead and deploy this now and keep an eye on logstash for any unexpected errors. If someone with more knowledge of the extension, appropriate rights, etc. could further test the patch in production, and confirm its efficacy, that would be great. Also - updated patch with new subject, bug, author and correct naming convention for production deployment:

The above patch has been deployed to wmf.7. Logstash errors seem fine, in that there do not currently appear to be any obvious, related errors from the patch. Again, if someone with better knowledge of the extension and appropriate rights could perform some additional UAT, that'd be appreciated.

sbassett changed the visibility from "Custom Policy" to "Public (No Login Required)". Jul 1 2021, 10:22 PM
sbassett changed the edit policy from "Custom Policy" to "All Users".

Change 702760 had a related patch set uploaded (by SBassett; author: Abijeet Patro):

[mediawiki/extensions/Translate@master] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702760

Change 702718 had a related patch set uploaded (by SBassett; author: Abijeet Patro):

[mediawiki/extensions/Translate@REL1_36] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702718

Change 702719 had a related patch set uploaded (by SBassett; author: Abijeet Patro):

[mediawiki/extensions/Translate@REL1_35] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702719

Change 702720 had a related patch set uploaded (by SBassett; author: Abijeet Patro):

[mediawiki/extensions/Translate@REL1_31] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702720

Change 702760 merged by jenkins-bot:

[mediawiki/extensions/Translate@master] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702760

Change 702719 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_35] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702719

Change 702720 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_31] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702720

Change 702718 merged by jenkins-bot:

[mediawiki/extensions/Translate@REL1_36] SECURITY: Enhance validation and logging for AggregateGroups API deletions

https://gerrit.wikimedia.org/r/702718

sbassett renamed this task from Aggregategroups Action API module allows deleting translatable page metadata for any group without trace to Aggregategroups Action API module allows deleting translatable page metadata for any group without trace (CVE-2021-36129). Jul 2 2021, 8:02 PM
sbassett moved this task from Watching to Our Part Is Done on the Security-Team board.