RfC: Per namespace view restrictions
Open, High, Public

Description

  • Affected components: Configuration/Page protection/User management
  • Engineer for initial implementation: TBD.
  • Code steward: TBD.[1]

Motivation

Currently, content that should be restricted from viewing cannot be kept on wiki unless the entire wiki is restricted from viewing. Allowing users to store more secure information on wiki would:

  • Eliminate the need for keeping separate wikis (e.g. CheckUserWiki)
  • Allow information that should only be visible to certain users to be stored on wiki (e.g. for WikiJournals)
  • Allow non-WMF wikis to manage information visibility in a more fine-grained manner
  • Be a prerequisite for converting AbuseFilter filters to wiki pages while maintaining visibility restrictions (T227595)
  • Restricted task T160266

Requirements

  • System administrators must be able to configure read restrictions for individual namespaces (not individual pages, for now)
  • Extensions must be able to specify read restrictions for added namespaces
  • Restrictions must prevent users without the proper rights from viewing the content, history, or logs for a page, including in recent changes, via the API, and in other users' contribution histories
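
The first requirement can be pictured as a mapping from namespace to the right needed to read it. The following is only a minimal sketch in Python, not MediaWiki code: the mapping name, the namespace IDs, and the right names are all invented for illustration and do not correspond to any existing configuration setting.

```python
# Hypothetical configuration: namespace ID -> right required to read it.
# The IDs and right names below are invented for this illustration.
NAMESPACE_READ_RIGHTS = {
    3000: "checkuser-read",  # a hypothetical CheckUser namespace
    3002: "ticket-read",     # a hypothetical Ticket namespace
}


def can_read_namespace(user_rights, namespace_id):
    """Return True if a user holding `user_rights` may read `namespace_id`."""
    required = NAMESPACE_READ_RIGHTS.get(namespace_id)
    # Namespaces without an entry are treated as unrestricted.
    return required is None or required in user_rights


def readable_namespaces(user_rights, all_namespaces):
    """Return the namespaces from `all_namespaces` that the user may read."""
    return [ns for ns in all_namespaces if can_read_namespace(user_rights, ns)]
```

Under this sketch, every entry point named in the third requirement (page view, history, logs, recent changes, API, contributions) would need to consult the same check; the sketch says nothing about how that is wired up.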

Exploration

(Proposals and considerations will be documented here.)

T230668: Fully implement read/view restrictions in mediawiki core has some background, as does T156788: Improve support for read access restriction / access control.
For efficient caching, pages in view-protected namespaces should not be transcludable.
https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/585322/ is a proof-of-concept patch that allows such read restrictions to be configured for namespaces, and applies those restrictions to viewing RecentChanges, viewing Contributions, and attempting to perform actions on an article, including viewing it.

[1] Per https://www.mediawiki.org/wiki/Developers/Maintainers

Event Timeline

In T249380#6029976, @Majavah wrote:

Approval of this RfC should not depend on this feature being used in WMF production. That being said, replicas and dumps should be filtered to ensure that only public namespaces are included.

Does this have to be in core? There are extensions that provide fine-grained view restrictions, including the Lockdown extension I started many years ago.

Is there anything fundamental missing from core?

Lockdown doesn't prevent everything; the proof-of-concept patch I linked above restricts actions (including read) and also hides the fact that the edits occurred from a user's contributions history and from recent changes. With Lockdown, other users:

  • Can't see the pages, but can see that the user edited them in user contributions
  • Can't see the pages, but can see that the user edited them in recent changes

(and those were the only two that I have implemented so far in the proof of concept patch)

It would be possible to do this in extensions, but it would require a lot of hooks to be added.

Note that this may also cause performance issues: if recentchanges has 10 million changes but a viewer can see only 7 of them, Special:RecentChanges may time out when asked to show all (or the first 50) results. See https://secure.phabricator.com/T13274

Alternatively, recent changes and contributions could show placeholder entries for restricted pages, but this may raise security concerns (e.g. when the editing activity on some page is itself sensitive).
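
To make the performance concern concrete, here is a toy illustration in Python (not MediaWiki code; the row layout, namespace IDs, and function name are assumptions made up for this example). It fills one page of results by checking visibility row by row after the rows have been fetched, and counts how many rows had to be examined.

```python
def first_page_post_filter(rows, can_see, page_size=50):
    """Fill one page by checking visibility per row.

    Returns (page, examined), where `examined` is the number of rows
    that had to be looked at before the page was full or rows ran out.
    """
    page, examined = [], 0
    for row in rows:
        examined += 1
        if can_see(row):
            page.append(row)
            if len(page) == page_size:
                break
    return page, examined


# Toy data: 100,000 changes, only a handful of them in a visible namespace.
rows = [
    {"id": i, "namespace": 0 if i % 20000 == 0 else 3000}
    for i in range(100_000)
]
page, examined = first_page_post_filter(rows, lambda r: r["namespace"] == 0)
```

Because fewer visible rows exist than the page size, the loop walks the whole list without ever filling the page. This is the situation described above, scaled down.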

We already allow users to filter by namespace; this just adds two more constraints to that filter:

  • If the user is showing only selected namespaces, ensure the selection doesn't include namespaces the user cannot see
  • If the user is excluding selected namespaces, ensure the excluded set also includes the namespaces the user cannot see
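
The two rules above can be sketched as a single set computation. This is an illustrative Python sketch only, not the code in the proof-of-concept patch; the function and parameter names are hypothetical.

```python
def effective_namespaces(all_namespaces, restricted, selected=None, invert=False):
    """Return the namespaces a listing may actually cover for one viewer.

    all_namespaces: every namespace on the wiki
    restricted:     namespaces this viewer cannot read
    selected:       namespaces chosen in the filter UI, or None for no filter
    invert:         True if `selected` lists namespaces to exclude
    """
    all_namespaces = set(all_namespaces)
    restricted = set(restricted)
    if selected is None:
        requested = all_namespaces
    elif invert:
        # "Exclude selected": the restricted namespaces are excluded as well.
        requested = all_namespaces - set(selected)
    else:
        # "Only selected": restricted namespaces are dropped from the selection.
        requested = all_namespaces & set(selected)
    return requested - restricted
```

For example, with namespaces {0, 1, 2, 3000} and 3000 restricted, selecting {0, 3000} yields {0}, and excluding {1} yields {0, 2}. The resulting set would then be applied in the query itself, so restricted rows are never fetched.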

This is correct, Lockdown only hides the content of pages, not their existence.
Enforcing per-namespace read restrictions via existing namespace filters is a nice idea!

One concern with having fine-grained read restrictions in core is that MediaWiki isn't built for hiding information. Read permission checks are incomplete, and content can leak, e.g. via transclusion, search context, or log entries, depending on setup.

For example, if you wanted to hide the existence of pages completely, you would have to prevent a link to that page from turning blue, which in turn messes with the cacheability of the ParserOutput.

This is why, over the years, requests to add this kind of thing to core have been rejected: attempts to hide the existence of pages are bound to be incomplete, and even hiding the content of pages is not guaranteed to work. I'd personally like to make sure that at least hiding the content works reliably, but subscribing to that guarantee means committing resources for maintaining a system property that WMF doesn't use.

The issue with parsing is indeed tricky; I was planning to hide the content, history, actions, etc., but not the existence of pages (e.g. for OTRS, it's fine for everyone to know whether a page Ticket:123 exists, but the content and its changes are restricted; the same goes for abuse filters if those become wiki pages).

This proposal is meant to ensure that content and history are hidden reliably, but not the existence of the page.

So, the core of the proposal is really to force some namespaces to be filtered out in all places where we support filtering by namespace? If this was done by introducing some kind of "page filter" or "namespace filter" abstraction, that actually sounds kind of nice to me.

But the problem remains, if we introduce a setting that allows some namespaces to be "hidden", we implicitly give a guarantee that we ensure that the content is actually not accessible. And subscribing to that guarantee is potentially expensive.

I once proposed creating a new content model for restricted pages, so that existing extensions and other mechanisms cannot interpret the content without going through a special ContentHandler, which would perform the read permission check. For extensions that access the text table directly, we could encrypt the content stored in the text table.

(for AbuseFilter-specific idea, see T227595#5407083)

BTW: link tables may expose some information from restricted pages.

@daniel, based on the comment in T227595#5347067, there appears to be some work that needs to be done in core just to support extensions being able to properly control read access.

There is currently no plan to invest in making fine-grained read restrictions in MediaWiki work reliably and securely. It's just not something that MediaWiki was designed to do, and to my knowledge, it's not something the WMF is ready to invest in.

Akuckartz raised the priority of this task from Medium to High. Jun 15 2020, 7:00 PM