
Surface Reference Reliability signal within VE
Open, Needs Triage, Public

Description

This is a meta-task covering the work involved in equipping volunteers, across experience levels, with the information and tools they need to decide how reliable other volunteers are likely to consider the source(s) they:

  1. Are attempting to add to Wikipedia
  2. Encounter while reading/patrolling Wikipedia

Background

As noted in T265163, inexperienced editors often make edits that defy the policies and guidelines of the project they are editing. One such policy we see new editors break (knowingly and unknowingly) is the requirement to cite reliable sources. [i]

This task is about equipping volunteers with the information and actions they need to decide if and how they will proceed with citing the source they are attempting/considering adding to Wikipedia.

Ultimately, this fits within the larger effort to help newcomers make edits they are proud of and experienced volunteers consider useful and aligned with Wikipedia's larger objectives.

Components

Consider these components to be an evolving list...

Component / Description / Ticket(s):

  0. Introduce the concept of reliable sources to people adding a reference to Wikipedia for the first time (T350322)
  1. A way for volunteers, on a per-project basis, to define, in a machine-readable way, which sources they have reached consensus on as reliable or unreliable (T337431)
  2. A way for volunteers to add to and edit the "list" described in "1." (T330112)
  3. A way for the editing interface to check a source someone is attempting to add against the "list" described in "1." (T349261)
  4. A way to make the person editing aware, in real time, when they have added a source that defies the project's policies (T347531)
  5. Information that helps newcomers decide how likely experienced volunteers are to perceive the source they're adding as reliable in this specific context (T350319)
  6. A way for volunteers to audit/see the edits within which reference reliability feedback is shown (T350622)
  7. A way for volunteers to report issues with the reference reliability feedback they see Edit Check providing (T343168)
  8. A way for experienced volunteers to define the message people are presented with when attempting to cite a source a project has developed a reliability-related consensus around (T337431)
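Component 3 above — checking a source someone is attempting to add against a project's machine-readable list — could look roughly like this sketch. The list format, domain keys, and status labels here are all hypothetical, not taken from any actual implementation:

```python
from urllib.parse import urlparse

# Hypothetical per-project reliability list keyed by domain; the status
# vocabulary loosely mirrors the perennial-sources classifications.
RELIABILITY_LIST = {
    "example-tabloid.com": "generally_unreliable",
    "example-journal.org": "generally_reliable",
    "example-blog.net": "deprecated",
}

def check_source(url: str) -> str:
    """Return the project's consensus status for a cited URL, or 'unlisted'."""
    host = urlparse(url).hostname or ""
    # Strip a leading "www." so www.example.com matches example.com.
    if host.startswith("www."):
        host = host[4:]
    return RELIABILITY_LIST.get(host, "unlisted")
```

An "unlisted" result matters as much as the others here: per the concern raised later in this discussion, the interface should not treat an unlisted source as implicitly unreliable.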

Links

Models

Research

Relevant conversations

Tools

image.png (1×900 px, 103 KB)

Source lists
Listed at https://www.wikidata.org/wiki/Q59821108


i. https://en.wikipedia.org/w/index.php?title=User_talk:ENieves1&type=revision&diff=1009512219&oldid=1009437193&diffmode=source

Related Objects

(Related subtasks table: task titles were not captured in this export. 20 tasks are Open and 13 Resolved, including one resolved Spike. Assignees, where set: nayoub, ppelberg, Esanders, MNeisler, Ryasmeen, DLynch, Trizek-WMF.)

Event Timeline


A more fundamental learning might be that citations are needed at all. We could also consider alerting editors when they add new content but don't add a citation.

Great spot and agreed.

The research team developed a Citation Needed model that could do the heavy lifting on understanding whether a citation is needed for a given piece of text.

@Samwalton9 this is the first time I'm hearing of this...can you confirm this is the research you were referring to https://meta.wikimedia.org/wiki/Research:Identification_of_Unsourced_Statements ?


That's the one :)

Regarding the machine-readable storing of reliable/unreliable classifications, I have a couple of thoughts. First, @Newslinger has been working on a tool to take the English Wikipedia's table and turn it into a more usable format - looks like you can read more about that here.

Second, I've been feeling cautious about the idea of telling users explicitly what is or isn't reliable as they edit. On the one hand it seems like an obvious thing to do - the community already has this table with encoded community conventions which we could make available to new users more readily. On the other hand, we would risk strengthening Wikipedia projects' biases around sourcing. We might want to make sure we design such a feature in a way that doesn't actively discourage adding sources which aren't in the list yet (i.e. don't train editors to look for a sign of approval that a source is definitely reliable). We already have a problem with editors not understanding what sources are reliable in different languages/countries/contexts (see an effort to alleviate this issue at Wikipedia:New page patrol source guide). Maybe this is an inherent tension with attempting to codify fuzzy norms and practices.

Thanks for the ping, @Samwalton9.

Here is an example of what the machine-readable data for the English Wikipedia's perennial sources list looks like in JSON form:

https://api.sourceror.org/v1/all_entries

The data is scraped and parsed from the perennial sources list. This format can be adapted for equivalent source lists on other Wikipedias. The Wikidata entry links to several non-English lists, two of which can be parsed to a machine-readable format:
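Once a wiki's list exists in JSON form like the feed above, consumers mainly need a fast domain-to-status lookup. The sketch below shows one way to build that index; the field names ("source", "status", "domains") are assumptions for illustration, not the confirmed schema of the api.sourceror.org feed:

```python
import json

# Illustrative sample mimicking a machine-readable perennial sources feed.
# Field names here are assumed, not taken from the real API schema.
sample_feed = json.loads("""
[
  {"source": "Example Daily", "status": "generally unreliable",
   "domains": ["exampledaily.com"]},
  {"source": "Example Review", "status": "generally reliable",
   "domains": ["examplereview.org"]}
]
""")

def index_by_domain(entries):
    """Map each listed domain to its consensus status for O(1) lookup."""
    return {d: e["status"] for e in entries for d in e.get("domains", [])}

index = index_by_domain(sample_feed)
```

Building the index once per feed fetch keeps per-keystroke checks in the editing interface cheap, which matters if feedback is to appear in real time.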

On the English Wikipedia, the AbuseFilter and blocklist features are able to handle some use cases for this data (deprecation and blacklisting, respectively). However, there are limitations that reduce the effectiveness and hinder community acceptance of these technical measures:

  • As of February 2021, the Wikipedia apps for Android and iOS are not able to display edit filters, according to the table in this noticeboard discussion.
  • There is currently no way to apply blocklist entries to a selection of pages. All patterns on the blocklist apply to all pages on the project.
  • Neither the AbuseFilter nor the blocklist provides a simple way to target or ignore content additions to particular sections of a page.
  • As Samwalton9 mentioned, the messaging associated with these technical measures could be improved. The community can handle some of this, but it would be helpful to have more data available that could be incorporated in the template messages. For example, when a user adds a link that is either deprecated or blacklisted, the error or warning message should ideally show the paragraph surrounding the link.

Peter and I have been discussing using the Spam blacklist as a starting point, since its entries are more easily categorisable as obviously undesirable. The perennial sources list, by contrast, has many entries with edge cases and nuances, and is still a few steps removed from something the editor could easily parse (especially because it doesn't exist on all wikis). I did some investigation today, looking through the English Wikipedia blacklist log, and found that entries can be broadly categorised into the following (often overlapping) buckets: spam, URL shorteners, and unreliable sources. The following are some notes on how ~200 randomly selected log entries broke down:

  • Spam (40%): These were entries clearly designed to lead readers to some website selling a product, hosting a suspicious file, or otherwise of no encyclopedic value whatsoever.
  • URL shorteners (35%): These are entries which introduced links to websites like bit.ly, youtu.be, or Google Amp. These are on the Spam Blacklist because they can disguise their destination, but I was surprised at the volume of hits this category receives. It's worth pointing out that many of these links might have been to spam sites, but I'm sure many were good faith edits.
  • Unreliable sources (25%): These links appeared like they could be useful references for articles. While I'm sure many have been spammed or aren't even remotely reliable, I could imagine most of these link additions having been made in good faith by a new user.

I'm posting this here because I think this backs up the idea of the spam blacklist being a sensible place to start - if 90%+ of hits were clearly from spam bots I might have suggested another approach, but as much as 60% of spam blacklist hits are at least potentially being made in good faith, and more blacklist entries point to unreliable sources than I had previously thought.

We could imagine three lanes of guidance based on this categorisation: explaining that the user's edit won't be saved, and then providing guidance to move away from a URL shortener, to use a more reliable source, or to check that a link isn't complete garbage. This prompts me to think about how we could facilitate categorisation on the Spam blacklist to match entries to these specific guidance paths; I'm not sure how that would work right now.
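The "three lanes" idea could be sketched as a simple category-to-message mapping. The category names and message wording below are illustrative only; no such categorisation currently exists on the Spam blacklist:

```python
# Hypothetical guidance messages, one per blacklist category. Entries that
# haven't been categorised fall back to a generic message.
GUIDANCE = {
    "spam": "This link appears to be spam and the edit cannot be saved. "
            "Please remove it or cite a source with encyclopedic value.",
    "url_shortener": "Shortened URLs are not allowed because they hide their "
                     "destination. Please cite the full destination URL instead.",
    "unreliable": "This source is considered unreliable here. Consider citing "
                  "a more reliable source for this statement.",
}

def guidance_for(category: str) -> str:
    """Return the guidance message for a blacklist-hit category."""
    return GUIDANCE.get(category, "This link is on the spam blacklist and "
                                  "the edit cannot be saved.")
```

The open question from the comment above remains the hard part: the messages are easy, but the blacklist itself would need per-entry category metadata to route a hit to the right lane.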

ppelberg updated the task description. (Show Details)
ppelberg added a subscriber: nayoub.
ppelberg updated the task description. (Show Details)
ppelberg added subscribers: Pablo, Isaac.
ppelberg updated the task description. (Show Details)
ppelberg updated the task description. (Show Details)

It would be nice to have the "Perennial sources" list under the community configuration page so it can be turned off on small- and medium-sized wikis, most of which do not have one.

The only global rules I am aware of are that the blacklist includes blog sites, sites whose whois registration matches the article name (especially websites of the company an article is about), and of course the local and global spam blacklists.

ppelberg renamed this task from Make editors aware when they are attempting to add unreliable sources to an article. to Offer volunteers feedback about the reliability of the source they are attempting to add.Nov 6 2023, 8:09 PM
ppelberg updated the task description. (Show Details)
ppelberg updated the task description. (Show Details)
ppelberg added a project: Editing-team.
ppelberg moved this task from Untriaged to Larger Strategic Things on the Editing-team board.
ppelberg renamed this task from Offer volunteers feedback about the reliability of the source they are attempting to add to [Check] Offer volunteers feedback about the reliability of the source they are attempting to add.Nov 14 2023, 8:17 PM
ppelberg updated the task description. (Show Details)

@ppelberg as I saw that you have added the language-agnostic reference risk model card, please find the datasets with the risk scores for each domain in each wiki at https://analytics.wikimedia.org/published/wmf-ml-models/reference-quality/reference-risk

Per offline discussion with @MMiller_WMF: maybe a first step here could be evaluating the extent to which the content someone is attempting to add to Wikipedia is supported by what's written in the source they're citing.

ppelberg renamed this task from [Check] Offer volunteers feedback about the reliability of the source they are attempting to add to [Ref. Reliability Check] Offer volunteers feedback about the reliability of the source they are attempting to add.May 30 2025, 3:58 AM
ppelberg renamed this task from [Ref. Reliability Check] Offer volunteers feedback about the reliability of the source they are attempting to add to Surface Reference Reliability signal within VE.Oct 31 2025, 5:45 PM
ppelberg updated the task description. (Show Details)

Jotting down some notes from offline discussion with @Pablo and @Sucheta-Salgaonkar-WMF about the Language-agnostic Reference Risk model:

  • The model is already available on LiftWing: https://api.wikimedia.org/wiki/Lift_Wing_API/Reference/Get_reference_risk_prediction#Anonymous_access.
  • At present, the model accepts a revisionID and lang (e.g. https://en.wikipedia.org/w/index.php?title=Lux_(Rosal%C3%ADa_album)&oldid=1322859357) and outputs the following:
    • ps_label_local: the status of that web domain in the perennial sources list of that wiki (if one exists),
    • ps_label_enwiki: the status of that web domain in the English Wikipedia perennial sources list (if it exists),
    • survival_ratio: using data from the model_version (currently 2024-11), the survival ratio of that web domain when used as a reference on that wiki (i.e., the proportion of edits for which the domain stayed on the page, out of the total number of edits since its addition). Values range from 0 to 1 (the closer to 0, the riskier),
    • page_count: using data from the model_version (currently 2024-11), the number of pages on which that web domain has been used as a reference on that wiki,
    • editors_count: using data from the model_version (currently 2024-11), the number of editors who have used that web domain as a reference on that wiki.
  • The model does not use ML but applies pre-computed scores of references in each wiki. Scores for the 2024-06 version can be found at https://analytics.wikimedia.org/published/wmf-ml-models/reference-quality/reference-risk
  • The model is language-agnostic
  • The model has not been tested with volunteers yet; the heuristic of using the survival of references comes from these ML experiments we ran: https://arxiv.org/abs/2410.18803
  • The model is using pre-computed scores for any single source (data folder, e.g., these are scores for frwiki). Therefore, Edit Check or any other service interested in scores for a single source could already use them directly as well.
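Consuming the model's output could look like the sketch below. The response dict is assembled from the field descriptions above, not from the actual Lift Wing response envelope (see the linked API reference for that), and the 0.5 threshold is an illustrative choice, not a recommended cutoff:

```python
# Sample prediction shaped after the fields described in this task.
# This is NOT the literal Lift Wing response format.
sample_prediction = {
    "ps_label_local": None,             # domain absent from the local list
    "ps_label_enwiki": "generally_unreliable",
    "survival_ratio": 0.31,             # closer to 0 means riskier
    "page_count": 12,
    "editors_count": 5,
}

def is_risky(prediction: dict, threshold: float = 0.5) -> bool:
    """Flag a domain whose references tend to be removed by later edits.

    A missing survival_ratio is treated as not risky here, since the model
    simply has no data for that domain.
    """
    ratio = prediction.get("survival_ratio")
    return ratio is not None and ratio < threshold
```

Since the scores are pre-computed per domain (as noted above), a consumer such as Edit Check could also read the published score files directly instead of calling the API per revision.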