
Explore content moderation issues
Closed, Resolved (Public)


Toolhub's design document envisions combining "owner" created data in the form of toolinfo.json data files (or a local in-app equivalent) with "annotations" that users of the website add to augment the owner provided data. There are also many ideas in the roadmap for creating "collections" of tools.

Where there is free form text content, there will be vandalism. This is an Internet truism that Wikimedians are well aware of. MediaWiki includes many components to help support patrolling content submissions. Toolhub will need systems for this as well, but what systems? How can we make the process of patrolling Toolhub feel friendly to folks who are used to doing content patrolling with MediaWiki?

General assumptions

  • All edits will be made by authenticated users. No anon edits will be allowed.
  • Authentication will be tied to Wikimedia OAuth and by extension SUL accounts.
  • Content that can be edited will have various levels of "protection" ranging from any authenticated user can edit to only the original content creator (or an admin) can edit.
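
The assumptions above could be expressed as a small permission check. This is only a sketch: the level names, the `User` and `Record` types, and `can_edit` are hypothetical and are not taken from Toolhub's code or design document.

```python
from dataclasses import dataclass
from enum import Enum


class Protection(Enum):
    # Hypothetical level names; the task only says protection ranges from
    # "any authenticated user" to "original creator (or an admin) only".
    AUTHENTICATED = 1
    CREATOR_ONLY = 2


@dataclass(frozen=True)
class User:
    name: str
    authenticated: bool = True
    is_admin: bool = False


@dataclass(frozen=True)
class Record:
    created_by: str
    protection: Protection


def can_edit(user: User, record: Record) -> bool:
    """Return True if `user` may edit `record`."""
    if not user.authenticated:
        return False  # no anonymous edits are allowed at all
    if user.is_admin:
        return True  # admins can edit at every protection level
    if record.protection is Protection.AUTHENTICATED:
        return True
    # CREATOR_ONLY: only the original content creator may edit
    return user.name == record.created_by
```

Intermediate levels (for example a patroller-only level) would slot in as extra enum members and extra branches.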

Needs identified

  • Everyone
    • View edit history for a toolinfo record (like action=history in MediaWiki)
    • View edit history for an individual editor (like Special:Contributions in MediaWiki)
    • View edit history for all toolinfo records (like Special:RecentChanges in MediaWiki)
    • View audit log for administrative actions, possibly partially redacted depending on action (like Special:Log in MediaWiki)
  • Authenticated users
    • Undo an edit (like action=edit&undo=<revid> in MediaWiki)
    • Revert content to a prior good edit (not sure that there is a MediaWiki exact match for this)
  • Patrollers
    • Mark an edit as reviewed/patrolled (like action=markpatrolled in MediaWiki)
    • Work queue/edit history filter showing edits that have not been reviewed (like the ! marker in Special:RecentChanges)
  • Oversighters
    • Suppress an edit (like Special:RevisionDelete in MediaWiki)
  • Global CheckUsers
    • Request that backend administrators with access to non-public activity log information gather information on IP addresses involved in an edit, or edits made from a range of IP addresses
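
To make the difference between "undo", "revert", "patrol", and "suppress" concrete, here is a minimal in-memory sketch of an append-only edit history for one toolinfo record. Everything in it (the `Revision` and `History` classes, the field-level undo rule, the method names) is an assumption made for illustration, not a description of how Toolhub or MediaWiki implement these features.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Revision:
    rev_id: int
    editor: str
    content: Dict[str, str]
    comment: str = ""
    patrolled: bool = False
    suppressed: bool = False


@dataclass
class History:
    """Append-only edit history for a single record (hypothetical model)."""
    revisions: List[Revision] = field(default_factory=list)

    def save(self, editor: str, content: Dict[str, str], comment: str = "") -> Revision:
        rev = Revision(len(self.revisions) + 1, editor, dict(content), comment)
        self.revisions.append(rev)
        return rev

    def get(self, rev_id: int) -> Revision:
        return self.revisions[rev_id - 1]

    def revert_to(self, rev_id: int, editor: str) -> Revision:
        """Restore the full content of an earlier revision as a new edit."""
        return self.save(editor, self.get(rev_id).content, f"Revert to r{rev_id}")

    def undo(self, rev_id: int, editor: str) -> Optional[Revision]:
        """Undo only the changes made by one revision, keeping later edits.

        Returns None when a later edit touched the same field (a conflict).
        """
        target = self.get(rev_id)
        before = self.get(rev_id - 1).content if rev_id > 1 else {}
        current = dict(self.revisions[-1].content)
        for key in set(before) | set(target.content):
            old, new = before.get(key), target.content.get(key)
            if old == new:
                continue  # this revision did not touch the field
            if current.get(key) != new:
                return None  # changed again since then; needs a manual edit
            if old is None:
                current.pop(key, None)  # the revision added this field
            else:
                current[key] = old
        return self.save(editor, current, f"Undo r{rev_id}")

    def mark_patrolled(self, rev_id: int) -> None:
        self.get(rev_id).patrolled = True

    def suppress(self, rev_id: int) -> None:
        self.get(rev_id).suppressed = True

    def unpatrolled(self) -> List[int]:
        """Work queue: ids of revisions that nobody has reviewed yet."""
        return [r.rev_id for r in self.revisions if not r.patrolled]

    def public_view(self, rev_id: int) -> Dict[str, str]:
        """Content as shown to the public; suppressed revisions show nothing."""
        rev = self.get(rev_id)
        return {} if rev.suppressed else dict(rev.content)
```

In this model an undo and a revert are both ordinary new revisions, so they appear in the history and in the unpatrolled queue like any other edit. Suppression hides content without removing the revision from the history.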

Event Timeline

bd808 triaged this task as High priority. Aug 21 2020, 8:40 PM

@Keegan, can you help us get discussion started on this question by making a short list of features that you feel are core to a patrolling workflow? I think that the ability to revert an edit, the ability to suppress the content of a reverted edit, the ability to see edit history for a unit of content, and some feed of global edit actions are the core things that come to mind for me. What am I missing? It might also be interesting to point out things that have been asked for in MediaWiki but are not possible today, just in case we can think of a way to work them in.

You're basically onto it, @bd808. Here is what I would consider the three bundles of tools:

  • Moderation tools: the revision deletion suite that serves mainly administrators and oversighters
  • Editing tools: undo and user-created revert tools, editing scripts, talk page templates
  • Documentation tools: page history and logs, user contribution history and logs

Each one of these categories is the root that the rest of the tooling is built on top of. I'll do some digging to see if I can find some "not possible" things; those requests are usually archived in a Community Tech Wishlist.

Maybe this hasn't been mentioned, but a core part of patrolling workflows is the edit feed itself: recent changes, new pages, and so on, plus the tools to filter these feeds. This is related to Keegan's documentation tools, but the (near) real-time patrolling feeds are watched in a slightly different way.

I've just started a conversation (well, asked questions at least) here that might be relevant in case people decide they want to spend their time having opinions (going out in Tech News on Monday, limited spread so far):

The archive for the IP masking talk page and the improving tools look at these issues from a very specific point of view, but could be relevant.

Thanks for the links in T261023#6427703 @Johan. I will try to remember to check back in on the Android discussion to see what sorts of things come up.

I do expect Toolhub to only be editable by authenticated users, which should avoid the anon IP display and blocking concerns. Toolhub will be using OAuth authentication, so edits there will be associated with Wikimedia user accounts (SUL accounts). Having admin'd Wikitech for a number of years, I know, however, that having named accounts does not stop vandals entirely. It does seem to weed out most of the casual vandals though.

Using OAuth for authentication will let us validate that a user has not been blocked on metawiki before publishing an edit. I wonder if we should also try to figure out if the user is blocked on any wiki in the whole farm?

We could think about having a flagged revs style queue of proposed edits that need to be approved by a user with elevated rights as a last-resort vandal protection feature. That's potentially a lot of new code to write and test for something that I would hope we only have to turn on in a worst case scenario of abuse, so I would like to keep it in the "possible, but hopefully not needed" pile, at least for now.

I wonder if we should also try to figure out if the user is blocked on any wiki in the whole farm?

One user can be indefinitely blocked on one wiki and a welcome member of the community on another wiki. This makes things a bit complicated.

Fair point. Sounds like we should stick to the metawiki blocks (and global which would also block on meta) as our "known bad actor" check.
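
A rough sketch of that "known bad actor" check, assuming it is done with one MediaWiki Action API request against metawiki. `list=users` with `usprop=blockinfo` and the CentralAuth `meta=globaluserinfo` module are existing API modules, but the response keys the parser relies on (`blockid` on a blocked user entry, `locked` on a locked global account) should be verified against the live API before use. The function names are hypothetical, and partial blocks are not treated specially here.

```python
from typing import Any, Dict
from urllib.parse import urlencode

META_API = "https://meta.wikimedia.org/w/api.php"


def block_query_url(username: str) -> str:
    """Build an Action API URL asking for local block and global lock info."""
    params = {
        "action": "query",
        "format": "json",
        "list": "users",
        "ususers": username,
        "usprop": "blockinfo",
        "meta": "globaluserinfo",
        "guiuser": username,
    }
    return f"{META_API}?{urlencode(params)}"


def is_known_bad_actor(response: Dict[str, Any]) -> bool:
    """Interpret a decoded JSON response from the query built above.

    Assumption: a blocked user entry carries a "blockid" key, and a locked
    global account carries a "locked" key (empty string or true).
    """
    query = response.get("query", {})
    locally_blocked = any("blockid" in u for u in query.get("users", []))
    gui = query.get("globaluserinfo", {})
    globally_locked = "locked" in gui and gui["locked"] is not False
    return locally_blocked or globally_locked
```

Keeping the URL builder and the response parser separate lets the decision logic be tested without network access; the HTTP call itself is left out of the sketch.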

I didn't see the "Has somebody already looked at this edit?" aspect (i.e. mark as patrolled) in the above list. But maybe that's included in one of the existing bullet points?

That's one that I think we missed, thank you. :)

FWIW the oversight capacities live in RevisionDelete.

Users will also need the ability to move pages.

Related list of things for a MediaWiki extension to be aware of:

Toolhub is going to be standalone software rather than a MediaWiki extension, but these notes are useful things to think about. The point made there about an audit log of moderation actions is a good one to remember for sure.

@Risker as promised/threatened in our emails, I would love to hear your thoughts on what may be missing from this list of high level features needed for patrolling edits in a non-MediaWiki application.

@bd808 and I spoke today about some concepts, which I will document here:

*CHECKUSER - I agree that, especially at the early stages of this project, there is probably no need to include a full-scale checkuser module. The thinking here is that (a) this should be a comparatively low-use wiki that requires OAuth for access (and a Wikimedia account for OAuth access), (b) there will be little ability to manipulate content (and no ability to directly modify the tools), and (c) if absolutely required, there will be individuals with a sufficient level of access to gather any necessary data and, where applicable, confer with stewards/checkusers from other projects for cross-wiki issues. Instead, it would be more useful to ensure that a hidden table gathers checkuser-like data that could be shared with authorized CU/stewards if necessary. This can be reassessed after the project has been operating for a period; if there are issues or problems, it can be reconsidered. I also note that the Wikimedia policy for checkuser requires that access be granted only to those appointed by a local Arbcom or by a community election process, which is an excessive level of complexity for a project just starting out. It's rare for new or low-activity wikis to have checkusers in any case.

*CONTROL OF MODIFICATION OF THE DATABASE - It is understood that most fields will not be able to be modified by "readers" or "users" (i.e., people other than the developer(s) or the project's admin team); however, there is value in having at least one field where those who utilize a tool can make some comment about it (e.g., "especially useful for doing X" or "May also be useful for Wikisource or Wikispecies"). Such comments will assist other readers/users in identifying tools that they want to try out or work with. I think it is a good idea that any publicly viewable user-added comment field include a way to automatically "sign" that comment once saved, so that the commenter is easily identified.
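
The automatic "signing" of a comment could look something like the sketch below. It assumes the author name comes from the authenticated session rather than from user input; the `Annotation` type, `save_annotation`, and the signature format are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    tool: str
    text: str
    author: str
    saved_at: datetime

    def signed(self) -> str:
        """Render the comment with a wiki-style signature appended."""
        stamp = self.saved_at.astimezone(timezone.utc).strftime("%H:%M, %d %B %Y (UTC)")
        return f"{self.text} -- {self.author} {stamp}"


def save_annotation(tool: str, text: str, session_user: str,
                    now: Optional[datetime] = None) -> Annotation:
    """Attach author and timestamp on the server side at save time.

    `session_user` is assumed to be the account name from the OAuth session,
    so a commenter cannot sign as somebody else.
    """
    now = now or datetime.now(timezone.utc)
    return Annotation(tool, text.strip(), session_user, now)
```

Storing the author and timestamp as separate fields, and only rendering the signature on display, keeps the comment text itself clean for search and translation.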

*LOCATION FOR DISCUSSION/NOTICEBOARD/HELP PAGES, ETC. - These are important functions for socialization of the project and development/interaction with both the developer and the user communities. It would probably be better to have these pages on a project that uses typical wiki layout/editing and also has a lot more visibility than Toolhub itself will have. (This will also eliminate the need to develop, manage, and moderate non-tool-related material on Toolhub itself.) I'm inclined to recommend using Meta, probably as a "Toolhub" portal, as it already has a lot of built-in translation abilities, has lots of eyes including the recent changes feed, and is a location that is familiar to a lot of people. Alternatively, Wikitech may be appropriate, but doesn't have nearly the visibility.

*TRANSLATION - Looks like the plan at this point is to make a connection with Translatewiki for Toolhub itself. This will be valuable, and to me would be one of the ways for smaller communities (which are more likely to have interested users with limited "Western" language skills) to really benefit from Toolhub.

*STARTING OFF WITH TARGETED USERS - It is probably a good idea to work with a few focused groups as the first stage in initiating this project. Key selection factors would be (a) an interest in or history of using tools, (b) willingness to participate as "lab rats" to test out the processes and systems, and (c) a recognized use case for how they would make use of Toolhub. Examples could include Wikiprojects that use many tools and would find it helpful to have them all in one place, projects that have specific needs for tools that may already exist on other projects, etc.

*GENERAL COMMENTS - The moderation points mentioned above seem to be a good starting point for this project. I think there's value in setting a target date for re-evaluation of the moderation tools to ensure that (perhaps a year down the line) they remain appropriate.

*POTENTIAL RED FLAGS - When dealing with tools, there are two issues that I always consider red flags: anything that can (intentionally or unintentionally) violate the privacy policy, and anything that can subject a user to (intentional or unintentional) unwanted/excessive attention. The issue that is most likely to trigger a privacy policy problem is a tool that has been (mis)configured to either gather, access or leak non-public information in a way that is out of step with the privacy policy. I recognize that this is unlikely, but I myself have encountered tools that did so in the past, so it's definitely possible. I don't have a good solution for this issue, other than having someone thoroughly test out the tool; this may be an issue of control of what can be included into the database. The less likely way that there could be a privacy policy breach would be by inclusion (by the subject or another user, intentionally or unintentionally) of non-public personal information in one of the few user-editable fields. This will require admins/oversighters to remove the information from public view, and also a way to notify those who have this technical ability of any "edit" that includes such information. (OTRS queues and mailing lists are commonly used by Wikimedia projects for suppression/revision deletion requests, but perhaps there is an easier or more direct way to do so on this smaller project. Perhaps a "front page" that contains key contact info?)

The unwanted/excessive attention issue is a little different. Some of this (e.g., obvious harassment) is going to be addressed by the Universal Code of Conduct when approved. The more likely occurrence will be multiple users notifying a tool developer of the same problem with a particular tool, to the point that the tool developer finds it difficult to respond. It would be useful to find a way to indicate to users that "problem x was reported to Tool Developer on (date)" to reduce the likelihood of getting a dozen or more similar messages. Of course, many users of tools will be interacting with the developer directly without accessing Toolhub at all.
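
One possible shape for the "problem x was reported on (date)" indicator is sketched below. It is purely illustrative: the `ProblemTracker` class is hypothetical, and it assumes reports can be matched by normalized text, which a real system would likely replace with explicit issue identifiers.

```python
from datetime import date
from typing import Dict, Optional, Tuple


class ProblemTracker:
    """Remember when each problem was first reported for each tool."""

    def __init__(self) -> None:
        self._first_seen: Dict[Tuple[str, str], date] = {}

    def report(self, tool: str, problem: str, today: date) -> Optional[str]:
        """Record a report; return a notice if it duplicates an earlier one."""
        # Normalize case and whitespace so trivially different wording matches.
        key = (tool, " ".join(problem.lower().split()))
        if key in self._first_seen:
            first = self._first_seen[key].isoformat()
            return f'"{problem}" was reported to the tool developer on {first}'
        self._first_seen[key] = today
        return None  # first report: pass it on to the developer
```

A first report returns `None` and would be forwarded; later matching reports return the notice text instead of generating another message to the developer.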

Hope this is helpful.

Thank you @Risker for that feedback. I have incorporated some of it into the description here, and I will be referencing the comment in other places as well.