Separate bot right for normal pages and interface (MediaWiki:) pages
Open, Medium, Public

Description

Having the ability to edit js and not have edits show up in RC by default is super scary.

We should perhaps make the bot right not apply to NS_MEDIAWIKI (so any edit to that namespace shows up in recent changes by default), and add a new right, boteditinterface, for anyone who truly needs it, with the intention that the new right be very difficult to obtain on Wikimedia wikis.

Event Timeline

Note that MediaWiki:Gadget-markAdmins-data.js on Commons is currently being updated by a bot with editinterface rights.

and not have edits in RC

Note the edits are still in RC, just filtered from the default view. They can be seen by clicking a link or adding &hidebots=0 to the URL.
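As a minimal sketch of the URL tweak described above, the Python snippet below builds a Special:RecentChanges link with hidebots=0. The wiki host and the namespace=8 filter (the MediaWiki namespace) are illustrative choices, not something specified in this comment.

```python
from urllib.parse import urlencode

def recent_changes_url(host, show_bots=True, namespace=None):
    """Build a Special:RecentChanges URL; hidebots=0 includes bot-flagged edits."""
    params = {"title": "Special:RecentChanges", "hidebots": 0 if show_bots else 1}
    if namespace is not None:
        params["namespace"] = namespace
    return f"https://{host}/w/index.php?{urlencode(params)}"

# Illustrative host; any wiki's hostname could be substituted.
url = recent_changes_url("commons.wikimedia.org", namespace=8)
print(url)
```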

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)". Nov 23 2016, 8:40 PM
Bawolff changed Security from Software security bug to None.

[I unmarked this as secret security, as this doesn't need to be secret, and if we do this we'll need to solicit user feedback anyway]

So the list of bot edits in NS_MEDIAWIKI over the last 30 days is:

commonswiki count	rc_user_text
7	CommonsMaintenanceBot

fawiki count	rc_user_text
17	Dexbot

hewikisource count	rc_user_text
50	OpenLawBot

metawiki count	rc_user_text
181	AddisWang
205	BHouse (WMF)
22	EWilfong (WMF)
674	Gabriel Birke (WMDE)
3	Ilario
2	JRobell (WMF)
33	JSutherland (WMF)
64	JTrost (WMF)
709	Kai Nissen (WMDE)
164	Kaldari
11	MarcoAurelio
2	MeganHernandez (WMF)
216	Pcoombe (WMF)
1112	RStearns (WMF)
2	Romaine
626	SPatton (WMF)
2	Samat
169	Seddon (WMF)
18	Shizhao
171	Steinsplitter
3	Stryn
10	TGutowski (WMF)
210	TSkaff (WMF)

testwiki count	rc_user_text
4	Gabriel Birke (WMDE)
8	Kai Nissen (WMDE)

viwiki count	rc_user_text
85	Tuanminh01


wikidatawiki count	rc_user_text
1	EmausBot

zhwiki count	rc_user_text
78	Liangent-adminbot

The large number of entries on metawiki seems to be CentralNotice-related.
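The query that produced these counts is not included in the thread. A plausible shape, assuming it resembles the recentchanges query quoted further down (same rc_bot, rc_namespace and rc_user_text columns), can be sketched against an in-memory SQLite table. The sample rows and account names are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows: (rc_user_text, rc_namespace, rc_bot)
rows = [
    ("ExampleBot", 8, 1),
    ("ExampleBot", 8, 1),
    ("ExampleBot", 0, 1),    # bot edit outside NS_MEDIAWIKI, not counted
    ("ExampleAdmin", 8, 0),  # non-bot edit, not counted
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recentchanges "
    "(rc_user_text TEXT, rc_namespace INTEGER, rc_bot INTEGER)"
)
conn.executemany("INSERT INTO recentchanges VALUES (?, ?, ?)", rows)

NS_MEDIAWIKI = 8
# Count bot-flagged edits per account within the MediaWiki namespace.
counts = conn.execute(
    "SELECT COUNT(*), rc_user_text FROM recentchanges "
    "WHERE rc_bot = 1 AND rc_namespace = ? GROUP BY rc_user_text",
    (NS_MEDIAWIKI,),
).fetchall()
print(counts)
```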

So this proposal would only apply to normal bot rights. Extensions that set EDIT_FORCE_BOT (like CentralNotice) will be unaffected.

Summary of the bot edits to NS_MEDIAWIKI

  • Stuff on meta: CentralNotice (inapplicable to this change)
  • commons - keeping MediaWiki:Gadget-markAdmins-data.js up to date
  • hewikisource OpenLawBot - Making editnotice pages
  • viwiki - spam blacklist updates
  • zhwiki - adding variant translations + some other things I don't understand
  • fawiki - deleting old user scripts
  • wikidatawiki - seems to be a false positive from my script
  • testwiki - central notice testing (so inapplicable to this change)

Having the ability to edit js and not have edits show up in RC by default is super scary.

Can you elaborate on why you think this is scary?

It makes detection of malicious activity (for example from a compromised adminbot account) much less likely, especially in the case of a less sophisticated attacker. Arguably a sophisticated attacker might insert scripts to further hide their activity, but there are also plenty of off-wiki RC feeds that people look at, as well as people with JS disabled, etc.

More generally, I think any edit to site JS should be scrutinized carefully, and the idea that edits to site JS could be hidden by the default settings of review tools is what I find scary.

So are you just concerned about JS (and probably CSS) or the entire NS_MEDIAWIKI namespace?

I suppose the easiest solution here is just to not allow the bot flag to work in NS_MEDIAWIKI.

I think both NS_MEDIAWIKI and also non-NS_MW JS/CSS/JSON pages would be worth capturing in this.

So are you just concerned about JS (and probably CSS) or the entire NS_MEDIAWIKI namespace?

There isn't really any way to tell which (non-JS) NS_MEDIAWIKI pages are script injection vectors; it depends on how the message gets escaped.
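To illustrate the point, the same message text can be harmless or dangerous depending only on what the calling code does with it. The sketch below uses Python's html.escape as a stand-in for whatever escaping a given caller applies; the message content and both render functions are hypothetical.

```python
import html

# Hypothetical interface message text as an admin (or attacker) might save it.
message = '<img src=x onerror="alert(1)">'

def render_escaped(msg):
    """Caller treats the message as plain text: markup is neutralised."""
    return "<span>" + html.escape(msg) + "</span>"

def render_raw(msg):
    """Caller inserts the message as raw HTML: markup stays live."""
    return "<span>" + msg + "</span>"

print(render_escaped(message))
print(render_raw(message))
```

Nothing in the stored message itself distinguishes the two cases, which is why the namespace cannot easily be split into safe and unsafe pages.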

I thought these were already separated? When creating a BotPassword, as a user with the editinterface right, one has to explicitly enable "editinterface" for it to be available for the bot login. Same applies to OAuth as well.

Last I checked, this was working. I ran into it myself when creating a BotPassword for tourbot. Without granting that right, it was not able to make MediaWiki-namespace edits.

However, maybe this isn't working as expected?

I thought these were already separated? When creating a BotPassword, as a user with the editinterface right, one has to explicitly enable "editinterface" for it to be available for the bot login. Same applies to OAuth as well.

That's correct. This task is about splitting the bot right (the right to hide one's edits in change lists) into separate rights for normal and editinterface edits.

@Krinkle BotPassword restrictions in general are working (for example I didn't enable rollback on my personal account and couldn't use utilities like Huggle that checked for it until I did).

Our standard project settings don't give bots editinterface access; it has to be specifically added. Project policies should be able to resolve your concerns, and if someone won't follow policy they don't get access.

I think both NS_MEDIAWIKI and also non-NS_MW JS/CSS/JSON pages would be worth capturing in this.

I assume you mean user JS (are you aware of any other types of JS pages?). That is a good point; I hadn't thought of those. These seem more likely to have an annoying amount of bot traffic, but maybe less now that gadgets and meta:special:mypage/global.js are a thing. I'll have to investigate the impact of doing that.

So are you just concerned about JS (and probably CSS) or the entire NS_MEDIAWIKI namespace?

There isn't really any way to tell which (non-JS) NS_MEDIAWIKI pages are script injection vectors; it depends on how the message gets escaped.

Indeed. If I had a way to automatically determine whether messages are scripts/raw HTML, I would love to do just that. Unfortunately the system messages are a mess, especially when you include extension JavaScript message usage.

Bawolff renamed this task from Separate bot and editinterface rights to Separate bot right for normal pages and interface (MediaWiki:) pages. Nov 24 2016, 5:39 AM
Bawolff updated the task description.

Our standard project settings don't give bots editinterface access; it has to be specifically added. Project policies should be able to resolve your concerns, and if someone won't follow policy they don't get access.

My concern is about malicious actors and trying to reduce damage in a compromise situation (by decreasing time to discovery). Project policies cannot solve that problem because the goal here is to make it easier to catch people who aren't following policies.

I suppose the easiest solution here is just to not allow the bot flag to work in NS_MEDIAWIKI.

I was also going to suggest this. The proposal sounds reasonable, but it seems to me that introducing another user right (which we intend to never give to anyone) is more complicated than it needs to be.

I was thinking the new right would primarily be for third party users

I think both NS_MEDIAWIKI and also non-NS_MW JS/CSS/JSON pages would be worth capturing in this.

I assume you mean user JS (are you aware of any other types of JS pages?). That is a good point; I hadn't thought of those. These seem more likely to have an annoying amount of bot traffic, but maybe less now that gadgets and meta:special:mypage/global.js are a thing. I'll have to investigate the impact of doing that.

There is (or will be) the Gadget: namespace, for one.

Yeah, but it's not here yet, and people have been saying that for years now. We can add that when it actually happens.

Perhaps instead of "don't allow bot-tagged edits in the MediaWiki
namespace" it should be "add a configuration variable for namespaces where
the bot flag is/isn't allowed" to allow the Gadget namespace to be easily
handled when the time comes. That might also fix the third-party concern
without a new right.
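A minimal sketch of what such a configuration variable could look like, written in Python for brevity rather than as MediaWiki code. The setting name NO_BOT_FLAG_NAMESPACES and the helper function are hypothetical; only the namespace number 8 for NS_MEDIAWIKI is standard.

```python
NS_MEDIAWIKI = 8

# Hypothetical setting: namespaces in which a requested bot flag is ignored.
NO_BOT_FLAG_NAMESPACES = {NS_MEDIAWIKI}

def effective_bot_flag(requested_bot, namespace, blocked=NO_BOT_FLAG_NAMESPACES):
    """Return the bot flag actually recorded for the edit."""
    return requested_bot and namespace not in blocked

print(effective_bot_flag(True, NS_MEDIAWIKI))  # flag dropped in a blocked namespace
print(effective_bot_flag(True, 0))             # flag kept in the main namespace
```

Under this design a Gadget namespace could be covered later by adding its ID to the set, and a third-party wiki could empty the set to keep the current behaviour.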

Indeed. If I had a way to automatically determine whether messages are scripts/raw HTML, I would love to do just that. Unfortunately the system messages are a mess, especially when you include extension JavaScript message usage.

See https://gerrit.wikimedia.org/r/#/c/203299/ and T85864: Special pages, actions and views whose messages don't escape text. That goal is not unreachable.

Perhaps not unreachable, but we aren't anywhere close to being there yet, especially after taking extensions into account. In particular, escaping of messages seems especially lacking in client-side code in older extensions.

I've notified everyone who operates bots that have edited NS_MEDIAWIKI in the last 30 days.

I think both NS_MEDIAWIKI and also non-NS_MW JS/CSS/JSON pages would be worth capturing in this.

I assume you mean user JS (are you aware of any other types of JS pages?). That is a good point; I hadn't thought of those. These seem more likely to have an annoying amount of bot traffic, but maybe less now that gadgets and meta:special:mypage/global.js are a thing. I'll have to investigate the impact of doing that.

So, if we only care about bots editing other users' JS files (but not their own), this is definitely doable with minimal disruption:

for i in `cat all.dblist`; do echo 'select count(*) ' \'$i count\'', rc_user_text from recentchanges where rc_user_text != "MediaWiki default" and rc_bot = 1 and rc_namespace = 2 and rc_type != 5 and rc_title not like concat( rc_user_text, "/%") and (rc_title like "%.js" OR rc_title like "%.css") group by rc_user_text ;' | sql $i>> tmp3 ; done
commonswiki count	rc_user_text
2	Cyberbot I
enwiki count	rc_user_text
7	Amalthea (bot)
10	Cyberbot I

EDIT: These were false positives due to underscore vs space in username
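The false positives are consistent with page titles being stored with underscores while rc_user_text keeps spaces, so a plain prefix comparison never matches for usernames that contain a space. A small sketch of the mismatch and one possible normalisation follows; the page title used here is invented.

```python
def is_own_subpage(rc_title, rc_user_text):
    """True if a User-namespace title is a subpage of the editor's own user page."""
    own_prefix = rc_user_text.replace(" ", "_") + "/"
    return rc_title.startswith(own_prefix)

# Naive comparison, as in the query above: the prefix keeps the space, so no match.
naive = "Cyberbot_I/example.js".startswith("Cyberbot I" + "/")
# Normalised comparison: the space is converted to an underscore first.
fixed = is_own_subpage("Cyberbot_I/example.js", "Cyberbot I")
print(naive, fixed)
```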

If we also include bots that edit their own user JS/CSS (presumably bot JS that is edited automatically gets included in other JS files), we get a couple more entries:

bawolff@tools-bastion-03:~$ for i in `cat all.dblist`; do echo 'select count(*) ' \'$i count\'', rc_user_text from recentchanges where rc_user_text != "MediaWiki default" and rc_bot = 1 and rc_namespace = 2 and rc_type != 5  and (rc_title like "%.js" OR rc_title like "%.css") group by rc_user_text ;' | sql $i>> tmp2 ; done
commonswiki count	rc_user_text
2	Cyberbot I
enwiki count	rc_user_text
7	Amalthea (bot)
7	COIBot
10	Cyberbot I
17	DatBot
eswiki count	rc_user_text
2	MetroBot
fiwiki count	rc_user_text
113	Fiwiki-tools-bot
frwiki count	rc_user_text
4	Framabot
itwikisource count	rc_user_text
1	Alebot
metawiki count	rc_user_text
1	MABot
ruwiki count	rc_user_text
45	Dibot
ruwikisource count	rc_user_text
2	BotLegger

That's still not a huge amount. We might further limit it to not include user CSS, since AFAIK it's very hard to get an XSS solely with CSS in modern browsers (although various privacy leaks are possible).

Following @Bawolff's notice, I'm including brief info on the [[s:he:user:OpenLawBot | OpenLawBot]] in hewikisource.

Summary of the bot edits to NS_MEDIAWIKI

  • hewikisource OpenLawBot - Making editnotice pages

OpenLawBot is the background process of the [[s:he:ספר החוקים של מדינת ישראל | Israeli Book of Laws' project]] on the Hebrew Wikisource. The project provides the legal textbook of the State of Israel (laws, ordinances, regulations, decrees, etc.) in a readable form, with interlinks between the legal documents. Each OpenLaw page is composed of two pages: the source text, stored in NS:116 (see T66353), and the formatted wikitext, stored and displayed in the main namespace. The source page contains the actual legal text in a simple format that is very easy to understand and edit. The wiki page contains formatted text derived from the source text by sophisticated automatic editing, and is very complex because it uses templates for the visual presentation. The bot creates an editnotice for each new article to keep new users from trying to update the formatted (wiki) part of the text; the editnotice directs the user to the source text.

In theory, article-specific editnotices should reside in the same NS of their parent article. However, in practice, the editnotice mechanism was badly designed, and requires a constant manipulation of the NS:Mediawiki. I am not aware of any alternative "mass-editnotice" solution.

You can make the MW: editnotice include arbitrary non-MW: pages. See how enwiki does it, for example: Wikipedia:Editnotice#Technical_details.

@Fuzzy, you probably can solve it by adding on MediaWiki:Editnotice-0 a code like {{#ifexist:מקור:{{PAGENAME}}|Don't edit this, but [[מקור:{{PAGENAME}}]]! |}}

Nice, I should have thought about it... :) 10nx!

In theory, article-specific editnotices should reside in the same NS of their parent article. However, in practice, the editnotice mechanism was badly designed, and requires a constant manipulation of the NS:Mediawiki.

Edit notices reside in the MediaWiki namespace because all other user interface text lives in the MediaWiki namespace. As others have noted, it's trivial to load edit notices from other namespaces if desired or to use simple logic with namespace-level edit notices.

It's amusing that you call edit notices badly designed when what you describe—a two-page system of source text and formatted text using complex templates that requires manually pointing users to the correct edit location—sounds pretty rough.

Edit notices reside in the MediaWiki namespace because all other user interface text lives in the MediaWiki namespace.

Editnotices are the edit-page counterparts of notice templates. A nicer solution is to implement editnotices via template inclusion. It's not possible with current design.

It's amusing that you call edit notices badly designed when what you describe—a two-page system of source text and formatted text using complex templates that requires manually pointing users to the correct edit location—sounds pretty rough.

Yep, I can agree on that. We pushed the mediawiki system beyond its limits. Subproject-parsers is not something mediawiki will support.

Huji subscribed.

I stumbled upon this bug through a message posted on one of WMF wikis, and I just wanted to stop by and say I support this idea.

So, thinking about user JS/CSS: I think the risk of a bot editing its own JS/CSS is significantly lower (if a bot's user JS/CSS is included elsewhere, it's probably expected that the bot will be editing these pages). So I think for now what we should do is make the bot flag not work on User:Foo/bar.js (or .css) as long as the bot's name is not Foo, and not work anywhere in NS_MEDIAWIKI.
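The rule proposed here can be sketched as a single predicate. This is an illustration in Python, not the code of the Gerrit change mentioned later in the thread; the function name is hypothetical, and titles are assumed to be in database form (underscores, no namespace prefix).

```python
NS_USER = 2
NS_MEDIAWIKI = 8

def bot_flag_allowed(namespace, title, editor):
    """Sketch of the proposed rule; title is in database form (underscores)."""
    if namespace == NS_MEDIAWIKI:
        # Never hide interface edits behind the bot flag.
        return False
    if namespace == NS_USER and title.endswith((".js", ".css")):
        # Only the owner of the user page may bot-flag edits to its JS/CSS.
        owner = title.split("/", 1)[0].replace("_", " ")
        return owner == editor.replace("_", " ")
    return True

print(bot_flag_allowed(NS_USER, "Foo/bar.js", "Foo"))       # own script
print(bot_flag_allowed(NS_USER, "Foo/bar.js", "OtherBot"))  # someone else's script
print(bot_flag_allowed(NS_MEDIAWIKI, "Common.js", "Foo"))   # interface page
```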

I thought these were already separated? When creating a BotPassword, as a user with the editinterface right, one has to explicitly enable "editinterface" for it to be available for the bot login. Same applies to OAuth as well.

That's correct. This task is about splitting the bot right (the right to hide one's edits in change lists) into separate rights for normal and editinterface edits.

I'm sorry, but then I'm not sure I see the benefit of separating these rights. I don't deny that in theory it could help in some edge case, but I'm trying to understand where this need is coming from and what we expect to solve or avoid in practice.

If a bot is granted rights like editinterface, edituserjs, editusercss, editmyuserjs, or editmyusercss, then whether or not those edits are hidden on recent changes seems of secondary concern. A given bot could trivially hide the edits from view using a variety of CSS or JS methods.

If we're not concerned with malicious intent, but rather general visibility of the edits - then why grant such an account the bot right in the first place? I've seen requests for bot right (on English Wikipedia and Commons) resulting in agreement for the bot to be created and run, but without the bot flag granted because it was desirable for the edits to be seen in Recent Changes by default and/or be patrollable. This seems like a very natural and adequate solution to the problem presented in this task.

If we assume consensus here about not being able to use the bot flag for those edits, then it seems at odds to grant such an account the bot flag in the first place.

If only some wikis want to allow bot-flagged edits on CSS/JS pages, then still it seems like granting the bot right would already allow them to do so.

If within a single wiki users want a specific bot to be able to both 1) make bot-flagged regular edits and 2) non-bot-flagged edits to css/js pages - that is possible by asking the bot operator to set bot=0 on those edits. However, a more secure solution would be to separate those accounts (one with bot, another with editinterface).
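On the operator side, this could look like the sketch below, which builds parameters for an Action API action=edit request. It assumes the API's usual boolean convention, where a parameter counts as true whenever it is present, so "bot=0" is realised by omitting the bot key. The is_sensitive policy, the page title and the token placeholder are hypothetical.

```python
def build_edit_params(title, text, token, mark_as_bot):
    """Build Action API parameters for action=edit."""
    params = {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": text,
        "token": token,
    }
    if mark_as_bot:
        # Boolean API parameters count as true whenever present,
        # so the key is omitted entirely for a non-bot edit.
        params["bot"] = "1"
    return params

def is_sensitive(title):
    """Hypothetical operator policy: interface pages and JS/CSS are never bot-flagged."""
    return title.startswith("MediaWiki:") or title.endswith((".js", ".css"))

title = "MediaWiki:Gadget-example.js"
params = build_edit_params(title, "/* ... */", "TOKEN", mark_as_bot=not is_sensitive(title))
print(sorted(params))
```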

Not only can such a measure be worked around, as Krinkle described, but I suspect it would encourage further "leaking" of sensitive JS into other namespaces, just to ensure the edits don't show up in RC. On smaller wikis with little manpower, bot edits that appear in RC are frowned upon (because they seem to "steal" attention from vandals) and risk causing needless friction.

So unless there is a very good use case for which it should be enforced, I don't see the point of the change.

If a bot is granted rights like editinterface, edituserjs, editusercss, editmyuserjs, or editmyusercss, then whether or not those edits are hidden on recent changes seems of secondary concern. A given bot could trivially hide the edits from view using a variety of CSS or JS methods.

With editmy* probably not from most people, and even with the less limited ones it's very unlikely they could pull it off. People look at change lists on multiple skins (including the mobile skin), non-HTML-based patrol tools, RSS, IRC... in any case it's defense in depth. Granted, not very strong defense, but still more than nothing.

If we're not concerned with malicious intent, but rather general visibility of the edits - then why grant such an account the bot right in the first place?

What wiki functionaries can do and what developers can do are distinct issues. It's probably not a good idea to rely exclusively on the first.

If within a single wiki users want a specific bot to be able to both 1) make bot-flagged regular edits and 2) non-bot-flagged edits to css/js pages - that is possible by asking the bot operator to set bot=0 on those edits. However, a more secure solution would be to separate those accounts (one with bot, another with editinterface).

They can use two separate bot passwords in that case. Right now the bot right is bundled in the high-volume grant, but we can separate that if there is a use case.

Right now the bot right is bundled in the high-volume grant, but we can separate that if there is a use case.

IMO that would be rather silly, since the bot and apihighlimits rights are largely the point of that grant.

Change 336390 had a related patch set uploaded (by Brian Wolff):
Do not allow bot right to be applied to NS_MEDIAWIKI or user JS/CSS

https://gerrit.wikimedia.org/r/336390

I see the points of both parts here. If this goes forward I'd like a User-notice and notes sent to the pywiki and labs mailing lists for botops awareness. Besides, I'd also appreciate review on T152296: Review the 'botadmin' group at mlwiktionary and mlwikisource and check if it is a good idea to duplicate all admin permissions for a bot group. Obviously if the user is granted bot+sysop it'll have them all though.

FWIW

hive (wmf_raw)> select u.wiki_db, u.user_name from mediawiki_user u join mediawiki_user_groups g1 on u.wiki_db = g1.wiki_db and u.user_id=g1.ug_user and g1.ug_group = 'bot' join mediawiki_user_groups g2 on u.wiki_db = g2.wiki_db and u.user_id = g2.ug_user and g2.ug_group = 'interface-admin' where u.snapshot='2018-11' and g1.snapshot=u.snapshot and g2.snapshot=u.snapshot;
...
enwiki	MusikBot II
frwikibooks	JackBot
thwiki	^Nullzerobot
commonswiki	CommonsMaintenanceBot
huwikibooks	Tacsipacsi

so we are talking about a fairly small number of bots. Maybe just prevent users from having both bot and interface-admin?
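A rough sketch of that alternative: find accounts that hold both groups, and refuse a grant that would create the combination. The rows mimic the shape of the user_groups join above but are invented, and may_add_group is a hypothetical guard, not an existing MediaWiki hook.

```python
from collections import defaultdict

# Hypothetical (wiki, user, group) rows, shaped like the user_groups join above.
memberships = [
    ("examplewiki", "ExampleBot", "bot"),
    ("examplewiki", "ExampleBot", "interface-admin"),
    ("examplewiki", "OtherBot", "bot"),
    ("examplewiki", "SomeAdmin", "interface-admin"),
]

def users_with_both(rows, a="bot", b="interface-admin"):
    """Return sorted (wiki, user) pairs that hold both groups."""
    groups = defaultdict(set)
    for wiki, user, group in rows:
        groups[(wiki, user)].add(group)
    return sorted(key for key, held in groups.items() if {a, b} <= held)

def may_add_group(current_groups, new_group, exclusive=("bot", "interface-admin")):
    """Hypothetical guard: refuse a grant that would combine both groups."""
    return not set(exclusive) <= (set(current_groups) | {new_group})

print(users_with_both(memberships))
print(may_add_group({"bot"}, "interface-admin"))
```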

On enwiki, we just purposefully granted an account this combination, I suspect the other projects have as well.

I don't think anything needs to be changed here at all, and this should be declined. A distant second choice would be a configuration option that can be used to not honor requests for 'bot' tags on these pages (failing silently so as not to break bots).

Change 336390 abandoned by Brian Wolff:
Do not allow bot right to be applied to NS_MEDIAWIKI or user JS/CSS

Reason:
Clearly controversial, and I think it would need more discussion/consensus before implementation.

https://gerrit.wikimedia.org/r/336390