
Introduce a machine-readable format for storing source reliability consensus
Open, Needs Triage, Public

Description

Per T345118, we want to offer people feedback about how likely experienced volunteers are to consider the source they are attempting to add reliable. Doing so depends on the existence of a machine-readable "list" that wikis can use to store/encode the current reliability consensus for a given source.

This task involves the work of proposing a format for said list that is:

  1. Sufficiently explicit to be machine readable and
  2. Expressive enough to hold the nuanced and non-binary nature of WP:RSP and other pages like it [i]

Loose

  • This list should include a field that lets volunteers specify what feedback message people see when attempting to cite a given source.

i. https://www.wikidata.org/wiki/Q59821108
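To make the two requirements and the feedback-message field concrete, here is one possible shape for a list entry, serialized as JSON. This is only a sketch: the field names, the `SourceEntry` type, and the status vocabulary (loosely modelled on the RSP legend) are assumptions for illustration, not a proposed or agreed format.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical status vocabulary, loosely modelled on the RSP legend.
STATUSES = {
    "generally_reliable",
    "no_consensus",
    "generally_unreliable",
    "deprecated",
    "blacklisted",
}


@dataclass
class SourceEntry:
    """One entry in the (hypothetical) machine-readable list."""
    domain: str
    status: str
    # Text volunteers write; shown to the person attempting to cite the source.
    feedback_message: str
    # Links to the discussions that established the consensus.
    discussion_links: list[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject statuses outside the shared vocabulary so the list stays parseable.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def to_json(entries):
    """Serialize entries to a JSON array."""
    return json.dumps([asdict(e) for e in entries], indent=2, sort_keys=True)


def from_json(text):
    """Parse a JSON array back into validated entries."""
    return [SourceEntry(**item) for item in json.loads(text)]
```

A free-text `feedback_message` alongside a closed `status` vocabulary is one way to keep the list explicit for machines while leaving room for nuance in what people actually read.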

Event Timeline

This is an interesting, potentially tricky task. The creation of RSP in the first place was controversial, although it's grown more accepted over time. The refrain of opponents is "it's always contextual, and trying to codify it is flattening". Personally, I support a more standardized system, which is what you're going for here, but just be aware that it's a consensus you'll likely have to fight for, and the less expressive the system you propose is, the harder that fight will be.

Some qualities that I'd ideally like to see the system be able to distinguish between are news vs. opinions (e.g. we want to present different advice to someone citing a NYT op-ed than someone citing a NYT news article; the URL would presumably be the key to discerning that) and subject area (e.g. let's say we decide that Al Jazeera is generally reliable except for anything related to the Israeli-Palestinian conflict; we'd want to be able to present a specialized notice to anyone attempting to use it on an article in a subcategory of Category:Israeli–Palestinian conflict; and even then it wouldn't capture a citation related to the conflict within an article that isn't).
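As a rough sketch of how those two distinctions could be encoded, each domain could carry an ordered list of rules, where a rule may be limited to a URL path prefix (news vs. opinion) and/or to a set of article categories (subject area). Everything here is hypothetical: the `Rule` type, the field names, and the `news.example` domain are made up for illustration, and the caller is assumed to pass in the article's categories with parent categories already resolved.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Rule:
    """A hypothetical per-domain rule; the first matching rule wins."""
    status: str
    message: str
    path_prefix: str = ""                # e.g. "/opinion/"; empty matches any path
    categories: frozenset = frozenset()  # empty matches any article

    def matches(self, path, article_categories):
        if not path.startswith(self.path_prefix):
            return False
        # A category-scoped rule applies only if the article is in one of them.
        if self.categories and not (self.categories & set(article_categories)):
            return False
        return True


def evaluate(rules_by_domain, url, article_categories):
    """Return the first rule matching this URL and article, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").removeprefix("www.")
    for rule in rules_by_domain.get(host, []):
        if rule.matches(parsed.path, article_categories):
            return rule
    return None


# Most specific rules first, with a catch-all last.
RULES = {
    "news.example": [
        Rule("opinion", "Opinion pieces need attribution.", path_prefix="/opinion/"),
        Rule("situational", "Use with caution in this topic area.",
             categories=frozenset({"Topic X"})),
        Rule("generally_reliable", ""),
    ]
}
```

As noted above, category matching is keyed on the article, so it would still miss a citation about the topic placed in an article outside that category.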

And even if we come up with a perfect system, it'll still inevitably involve some amount of rejiggering RSP to fit everything into that system. And given how contentious debates around sources can be, that's not going to be easy.

Just want to add that CredBot may be a way forward, or hold some options:

https://en.wikipedia.org/wiki/Wikipedia:Vaccine_safety/Reports

Oh, great spot, @FNavas-foundation; I've added CredBot to T276857 as well as to mw:Edit check so that we can revisit it.

This is an interesting, potentially tricky task. The creation of RSP in the first place was controversial, although it's grown more accepted over time. The refrain of opponents is "it's always contextual, and trying to codify it is flattening"...

@Sdkb: this context is extremely helpful – thank you for making the time to share it with us.

Two things below:

  1. Responses to the specific points you raised (as well as a couple of follow-up questions)
  2. How the Editing Team is thinking about moving forward from here towards a future where people receive actionable feedback about the sources they're attempting to add.

Responses

The refrain of opponents is "it's always contextual, and trying to codify it is flattening".

The simplicity of the above is clarifying.

Personally, I support a more standardized system, which is what you're going for here…

This is helpful to know.

Question: Can you please say a bit more about where/why the support you have for such a system comes from? Asked in another way: what positive impact(s) do you think a more standardized system has potential to deliver?

Some qualities that I'd ideally like to see the system be able to distinguish between are news vs. opinions

Noted.

Question: How – if at all – do you currently make these distinctions? E.g. do you need to visit the source and make that call based on what you see?

...subject area (e.g. let's say we decide that Al Jazeera is generally reliable except for anything related to the Israeli-Palestinian conflict; we'd want to be able to present a specialized notice to anyone attempting to use it on an article in a subcategory of Category:Israeli–Palestinian conflict;

Great spot, I've created T347775 for this use case.

Editing Team Proposed Plans

Now, to how the Editing Team is thinking of moving forward…

For the reasons you named above, [i] it seems most tractable to move forward with an initial approach that depends on a facet of reliability policy that is:

  1. Already in a machine-readable format
  2. Widely agreed upon

With this in mind, we're thinking of starting by offering people feedback when the URL they are attempting to use as a reference matches the SpamBlacklist, as this wish from the 2023 Wishlist Survey describes. This approach would benefit from the work @Ladsgroup has done in T337431 and @MusikAnimal is doing in T347435.
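For illustration, the check could look roughly like the sketch below. It assumes the blacklist is a text page with one regex fragment per line and `#` comments, and that fragments are joined into a single pattern anchored after the URL scheme. The wrapping regex and the function names are assumptions, and Python's `re` is not identical to the PCRE engine MediaWiki uses, so this is not the extension's actual implementation.

```python
import re


def parse_blacklist(text):
    """Extract regex fragments, dropping '#' comments and blank lines."""
    fragments = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            fragments.append(line)
    return fragments


def compile_blacklist(fragments):
    """Join fragments into one case-insensitive pattern (assumed wrapping)."""
    if not fragments:
        return None
    return re.compile(
        r"https?://+[a-z0-9_\-.]*(?:" + "|".join(fragments) + ")",
        re.IGNORECASE,
    )


def is_blacklisted(pattern, url):
    """True if the URL matches any blacklist fragment."""
    return pattern is not None and pattern.search(url) is not None
```

In a citation dialog, `is_blacklisted` would run against the URL the person pasted, before they attempt to save, so the feedback can name the specific link.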

Longer-term, if we collectively come to find this initial implementation impactful, perhaps it will help prompt a wider conversation about how the "reliability check" might be expanded upon to include a wider range of sources/domains.


i. Specifically, the complexity/challenge of trying to codify a policy that is evolving and non-binary

Can you please say a bit more about where/why the support you have for such a system comes from? Asked in another way: what positive impact(s) do you think a more standardized system has potential to deliver?

Fundamentally, I think organization into standardized frameworks is helpful because it reduces complexity. This facilitates machine-reading (including Edit Check, but also any other tools that might want to use the RSP list) as well as makes things simpler/easier for humans browsing the list.

For instance, browsing through RSP, I see a bunch of small wording variations that might be considered quirks or point to potential ambiguities: Why does the CS Monitor entry say it is "generally reliable for news" whereas other entries like The Age just say "generally reliable", full stop? And why does the NYT's entry include a special note that opinion pieces should be governed by WP:RSOPINION whereas the Financial Times' entry does not, even though it also publishes opinion? I think a standardized system would help iron out these wrinkles and force us to confront ambiguities that may be causing confusion. Ultimately, it'd lead to the summary section of RSP becoming more regulated/optimized, which would make it easier to parse.

It'd also be clarifying for debates on source reliability. In the past, those debates had a certain reinvent-the-wheel aspect, which could make consensus less clear given that different editors might interpret something like "generally reliable" differently. More recently, those debates have moved toward a more standardized four-option menu, as e.g. here. Having that common vocabulary makes it easier to say e.g. "this source is about as reliable as that other one, so let's treat it similarly". A more complete standardized framework would effectively expand that common vocabulary.

Question: How – if at all – do you currently make these distinctions? E.g. do you need to visit the source and make that call based on what you see?

The easiest cases are those for publications that clearly distinguish between news and opinions (more often newspapers). For these, it's just a matter of visiting the source, seeing whether the URL/section label is "Opinion" or a news department, and going by that. It gets trickier with publications with blurrier lines (more often magazines). For those, it's often necessary to read the article to figure out what type it is. And in some cases, for "analysis"-type edge cases, editors may disagree about whether a statement in a source is opinion or fact.

Your initial approach sounds good to me. Overall, it'll be easiest to work from the bottom of the reliability scale up (blacklisted sources being at the very bottom), since at that extreme there are very few exceptions where it would become okay to use a source. (If you want to know what those exceptions are, then search for whitelisted uses of currently blacklisted sources.) Greenlit "generally reliable" sources would be the second-easiest category to machine-read. It's the yellow "marginally reliable" sources that will be the greatest challenge, since they have a bunch of caveats and nuances. We'll eventually want to be able to parse them, but for a minimum viable product they seem better to avoid. Browsing through the entries there will give you a sense of what the caveats and nuances tend to be. The specific language used is important ("no consensus" means editors were unable to agree, whereas "marginally reliable" means editors agreed that the source is marginal) but it tends to fall into patterns.

ppelberg renamed this task from Propose a format for storing how reliable a project considers a source to be to Introduce a machine-readable format for storing how reliable a project considers sources to be. Nov 9 2023, 1:06 AM
ppelberg renamed this task from Introduce a machine-readable format for storing how reliable a project considers sources to be to Introduce a machine-readable format for storing source reliability consensus.

Re: 1.A, this message is probably going to be very similar to the existing spamprotectiontext message, but we can't just use that one because (a) its default wording expects to be talking about a whole attempted-edit and so is a bit too vague ("probably caused by a link to a forbidden external site"), and (b) some wikis (looking at you, enwiki) have customized it to be huge and so it would cause display issues squeezing it into the citoid dialog.

Re: 2, we could say there are six grades of referenced-website, which mostly relate to the categories in the legend on the perennial sources page. In rough order of "goodness":

  1. Generally reliable.
  2. Unknown; nobody has made a rule against it, but could be unreliable, so we shouldn't say it's "good". This is going to cover a lot of citations to information on minor websites.
  3. Situational ("no consensus"); sources for which you need to read the warning. (To pick an early example from the list: "Arab News is reliable unless the article is about the Saudi Arabian government".)
  4. Generally unreliable; you probably shouldn't use this, but it's not actually forbidden.
  5. Deprecated; you can only use this for self-descriptions, i.e. referencing the fact of the content existing on the site.
  6. Blocked; you literally cannot add this to the wiki.

The last one is the only one we should actually block people from adding, because there are occasionally valid reasons to use all of the others.
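The six grades could be sketched as an ordered enum, with any domain that has no entry falling back to "unknown". The names below are hypothetical. Only "block for the last grade" comes from the list above; the split between "warn" and "allow" for the other grades is an assumption made for illustration.

```python
from enum import IntEnum


class Grade(IntEnum):
    """Six grades, in rough order of "goodness" (1 is best)."""
    GENERALLY_RELIABLE = 1
    UNKNOWN = 2
    SITUATIONAL = 3
    GENERALLY_UNRELIABLE = 4
    DEPRECATED = 5
    BLOCKED = 6


def grade_for(domain, table):
    """Look up a domain; anything without a rule is UNKNOWN."""
    return table.get(domain, Grade.UNKNOWN)


def action_for(grade):
    """Map a grade to what the citation dialog does (assumed policy)."""
    if grade == Grade.BLOCKED:
        return "block"   # the only grade that prevents adding the source
    if grade >= Grade.SITUATIONAL:
        return "warn"    # show the feedback message but allow the citation
    return "allow"       # no warning; UNKNOWN is not presented as "good"
```

Using an ordered enum keeps comparisons like `grade >= Grade.SITUATIONAL` readable, though it does imply a strict ranking that the "rough order" above may not fully support.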