
Build a taxonomy of issue templates
Closed, Resolved (Public)

Description

Scope

  • Wikipedia
  • Issue templates only (as opposed to including patrolling)
  • Set of languages from the quantitative piece.
  • Article namespace

Output

  • an operationalizable taxonomy of issue templates in the form of a Google document. This will feed into T384600.

We make the taxonomy operationalizable by working in conjunction with T384600: the metrics collected there inform how we categorize different types of distributed moderation, and we compare patterns in those metrics against our existing knowledge of distributed moderation workflows. We also aim to understand how these metrics can be deployed most effectively.

Impact
This work will inform what metadata we should collect.

Background
Following on from T376684, we have identified a large number of moderation actions that may be done by users without high levels of wiki-specific expertise or hard-to-acquire user rights. We tentatively refer to these actions as "distributed moderation": moderation actions that can be done by a large number of users, with a low barrier to entry.

Examples of such distributed moderation tasks might include adding or resolving issue templates (for example, using {{Citation needed}} to flag up a missing reference), intermittent edit patrols, or maintenance of moderation work queues.

Some questions raised that we may wish to tackle:

  • How do editors become aware of the existence of distributed moderation activity? How might they learn to participate in distributed moderation?
  • What cross-wiki similarities and differences exist in distributed moderation?
  • Does distributed moderation lead to significant redundancy, and does this impact its effectiveness (on multiple fronts)?
  • Are current signals for distributed moderation, such as the use of issue templates:
    • ...effective at making distributed moderation tasks visible for would-be moderators?
    • ...effective at causing the desired changes within the flagged section?

Details

Due Date
Mar 31 2025, 12:00 AM

Event Timeline

leila triaged this task as Medium priority. (Jan 13 2025, 8:48 PM)
leila renamed this task from Explore distributed moderation tasks on-wiki to Build a taxonomy of issue templates. (Mar 4 2025, 9:42 PM)
leila added a project: Essential-Work.
leila updated the task description.
leila added a parent task: T371865: Who are moderators?.
leila updated the task description.
leila set Due Date to Mar 31 2025, 12:00 AM.

Update: Based on the identified most-commonly-used templates from T384600, I have begun cataloguing and categorizing templates based on:

  • Whether or not the template is a redirect
  • The purpose or main intended use of the template (broadly categorized as "citation-related", "maintenance-related", "style-related", or "other")
  • Whether or not the use of the template seems to fit our definition of crowdsourced or distributed moderation
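For the redirect criterion, one possible way to fold aliases into their base template is sketched below. This is only an illustration: the `REDIRECTS` map, `normalize`, and `resolve_base_template` are hypothetical names, the aliases shown are the "cn" and "fact" aliases of the citation needed template mentioned later in this task, and the title normalization (underscores to spaces, first letter uppercased) is an assumption about how the wikis in scope treat template titles.

```python
from __future__ import annotations

# Hypothetical redirect map (alias -> target). In practice this would be
# loaded from each wiki's redirect data rather than hard-coded.
REDIRECTS = {"cn": "Citation needed", "fact": "Citation needed"}


def normalize(name: str) -> str:
    """Assumed title normalization: underscores to spaces, first letter uppercased."""
    name = name.replace("_", " ").strip()
    return name[:1].upper() + name[1:]


def resolve_base_template(name: str, redirects: dict[str, str]) -> tuple[str, bool]:
    """Return (base template title, whether the given name is a redirect)."""
    lookup = {normalize(alias): normalize(target) for alias, target in redirects.items()}
    start = normalize(name)
    current, seen = start, {start}
    # Follow redirect chains, stopping if a target was already visited (loop guard).
    while current in lookup and lookup[current] not in seen:
        current = lookup[current]
        seen.add(current)
    return current, start in lookup
```

With this map, `resolve_base_template("cn", REDIRECTS)` would report the base template "Citation needed" and flag the name as a redirect, so aliases and their target can be categorized once rather than separately.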

At a preliminary glance, German Wikipedia is an outlier: it rarely uses inline or message-box templates for citation, article maintenance, or style-related issues (e.g. an equivalent of a "citation needed" or "dead link" template). I presume that its heavy use of the Flagged Revisions (FlaggedRevs) extension, under which every edit is de facto checked before publication, limits the utility of such templates, although Polish Wikipedia (which uses Flagged Revisions in a similar manner) does use inline templates for this purpose. Based on comments such as this, it seems that German Wikipedia avoids using such "cleanup templates" by policy.

On all other wikis, these inline templates can be broadly categorized as:

  • Citation-related: warnings that a particular claim, section, or page requires attention to its citations. This might mean that there are no citations, too few citations, or low-quality citations.
  • Maintenance-related: for our purposes, this largely relates to issues like dead links or orphaned pages.
  • Style-related: warnings that a sentence, section or page has stylistic or tone issues. Examples include non-encyclopedic tone, unclear or vague claims, or claims that don't specify a time or an author for the claim being made. This would cover templates such as "clarify", "who" and "when".
  • Other: all other templates that are in the dataset.
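As a rough illustration of this first-pass grouping, template names could be bucketed with a small rule table. The patterns below are assumptions built only from the examples named in this task ("citation needed" and its aliases, "dead link", "orphan", "clarify", "who", "when"); they are not the regex used in the actual categorization, and `BROAD_PATTERNS` and `broad_category` are hypothetical names.

```python
import re

# Illustrative patterns only. The first matching rule wins; any name that
# matches no rule falls through to "other".
BROAD_PATTERNS = [
    ("citation-related", re.compile(r"^(citation needed|cn|fact)$", re.IGNORECASE)),
    ("maintenance-related", re.compile(r"^(dead link|orphan)$", re.IGNORECASE)),
    ("style-related", re.compile(r"^(clarify|who|when)$", re.IGNORECASE)),
]


def broad_category(template_name: str) -> str:
    """Map a template name to one of the four broad first-pass categories."""
    name = template_name.replace("_", " ").strip()
    for category, pattern in BROAD_PATTERNS:
        if pattern.search(name):
            return category
    return "other"
```

Exact-name matching like this is deliberately conservative: anything not explicitly listed lands in "other" and would still need a manual look at its documentation.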

Some templates in the dataset have less clear-cut moderation uses, particularly those that mark a page or section as being "about a current, or future, event". I am looking into these to better understand how they are used.

Update: Making steady progress on categorization; 9 of 12 wikis are finished.

The third pass of categorization is complete. The regex and notebook are available on GitLab, pending a merge request. These outputs are also copied into the templates spreadsheet from T384600#10637277.

Final categories used (19 total, plus a catch-all "Other"). Templates were categorized based on their available template documentation, especially where that documentation tied the template to specific policy pages on its wiki.

  • Reliability: templates related to the principle of source reliability.
  • Verifiability: templates related to the principle of source or claim verifiability.
  • Original research: templates related to the prohibition of original research as a valid source.
  • Citation: other templates related to citation policies.
  • Translation: templates around requesting, expanding, or correcting translations.
  • Collaboration: templates indicating work-in-progress on a given page, or those that explicitly invite another user's input or contributions (even if they are non-specific about who exactly should contribute).
  • Maintenance: templates related to the organization of articles, including page moves, renames, merges, deletions, disambiguation, categorization, dead links, and orphan articles.
  • Clarity: templates that point out issues of clarity, or any situation where the text is difficult to understand.
  • Accuracy: templates that point out issues with the factual accuracy of claims within a page.
  • Formatting: templates that point out non-standard or incorrect formatting of pages and sections.
  • Grammar: templates for copy-editing, minor typographic errors, spelling and grammar issues.
  • Style: other templates related to style, tone, and incorrect or non-standard formatting.
  • Language bias: templates that point out biased language, such as non-neutral, fannish or in-universe points of view, editorializing, or promotional language.
  • Content bias: templates that point out biases in the structure of an article beyond language, such as unbalanced representations of controversies, or undue weight given to particular perspectives.
  • Neutrality: templates that point out issues with bias and neutrality in a page, not covered by "language bias" or "content bias".
  • Notability: templates about notability, i.e. whether or not a given subject is suitable for inclusion on Wikipedia.
  • Copyright: templates about copyright concerns or violations.
  • Paid editing: templates about undisclosed conflicts of interest or undisclosed paid editing, grouped together due to the similarity of policies in addressing both situations.
  • Multiple: templates that are routinely used to indicate issues across two or more high-level categories, or placeholder templates that nevertheless indicate some kind of problem within the flagged text.
  • Other: all other templates in the dataset. The most notable examples were those used to format pages in and of themselves, such as infoboxes, reference formatting templates, or purely communicative templates used to convey warnings or commendations.

The table below shows aggregated statistics on categorized templates, across all wikis in the dataset except dewiki and arzwiki. Given the very low relevant-template usage on these two wikis, I do not think they will significantly alter findings. The templates spreadsheet has a more complete version of this table.

category           template_number  template_count  revision_count  page_count
citation                      1430          226434          165671      149864
multiple                       296          119027           68550       63481
maintenance                   1068           93444           80247       77657
other                         5762           75189           52739       47953
collaboration                  415           26307           21657       19779
style                          992           22008           19224       18205
clarity                        907           20742           18265       17357
verifiability                  237           10420            8598        8108
translation                    464            9559            8457        7934
formatting                     113            7273            5261        4939
notability                     104            6549            6478        6294
reliability                    146            4510            3344        3125
original_research               95            2803            2166        2061
copyright                       59            2009            1975        1870
accuracy                        74            1723            1700        1658
grammar                         61            1190            1172        1132
language_bias                   68             808             802         780
paid_editing                    32             754             754         734
neutrality                      59             444             418         400
content_bias                    16              87              87          83

  • template_number: number of distinct templates in the category
  • template_count: total number of additions and removals of templates in the category, across the selected wikis in the dataset
  • revision_count: as above, but counting revisions that include a template in the category
  • page_count: as above, but counting pages that have had a template in the category added or removed
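To make the four column definitions concrete, here is a minimal sketch of how they could be computed from a log of template additions and removals. The `TemplateEvent` record and `aggregate` function are hypothetical and do not reflect the schema of the actual dataset. The sketch also assumes that templates, revisions, and pages are counted per wiki (so the same ID on two wikis counts twice), and that a revision counts toward a category if it added or removed a template in it.

```python
from collections import defaultdict
from typing import NamedTuple


class TemplateEvent(NamedTuple):
    """One template addition or removal (hypothetical record layout)."""
    wiki: str
    page_id: int
    revision_id: int
    template: str
    category: str
    action: str  # "add" or "remove"


def aggregate(events):
    """Return one row of the four metrics per category, sorted by template_count."""
    templates = defaultdict(set)
    counts = defaultdict(int)
    revisions = defaultdict(set)
    pages = defaultdict(set)
    for e in events:
        templates[e.category].add((e.wiki, e.template))
        counts[e.category] += 1  # every addition or removal counts once
        revisions[e.category].add((e.wiki, e.revision_id))
        pages[e.category].add((e.wiki, e.page_id))
    rows = [
        {
            "category": category,
            "template_number": len(templates[category]),
            "template_count": counts[category],
            "revision_count": len(revisions[category]),
            "page_count": len(pages[category]),
        }
        for category in counts
    ]
    rows.sort(key=lambda row: row["template_count"], reverse=True)
    return rows
```

Under these assumptions template_count >= revision_count >= page_count always holds within a category, which matches the ordering of the three columns in every row of the table.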

Key findings

  • German Wikipedia is an outlier due to not using article maintenance templates as a general policy. The only classifiable templates from dewiki in our dataset were two deletion templates, which redirected to the same "base" template.
  • Egyptian Arabic Wikipedia is also an outlier due to low general maintenance template usage.
  • English (n=1877), Russian (n=729) and Chinese (n=472) Wikipedias used the most unique templates.
  • Every Wikipedia included some untranslated templates. The most common case was a non-English wiki using untranslated English Wikipedia article maintenance templates.
    • The most common untranslated English template was the citation needed template, or its aliases cn and fact.
    • The majority of templates used on Chinese Wikipedia are written in Latin characters only (n=538, 74.3%) versus those including non-Latin characters (n=186, 25.7%). In practice, this is a split between English versus Chinese templates (with one written in Japanese). Presumably, this helps avoid privileging Traditional or Simplified Chinese use in template names.
  • Template usage does not map neatly onto identified policy violations. The most commonly used templates are also the broadest, and as the table above shows, the broadest or most imprecise categories see the most use. Rather, templates seem to be used to flag that there is an issue, not necessarily why it is an issue.
    • Different wikis also have different mappings between templates and policies. For example, French Wikipedia's Interprétation personnelle template documentation indicated it was suitable in cases where original research was used, where unpublished or unverifiable sources were used, or when a claim seemed to come from personal interpretation. On English Wikipedia, the same situations would be covered by the separate templates Original research, Opinion, Synthesis, or Self-published source.
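The Latin versus non-Latin split reported for Chinese Wikipedia could be reproduced with a check along these lines. This is a sketch under assumptions: `is_latin_only` and `share` are hypothetical helpers, and "Latin" is approximated by inspecting Unicode character names, which may differ from the rule used in the original analysis.

```python
import unicodedata


def is_latin_only(name: str) -> bool:
    """True if every alphabetic character in the name is a Latin-script letter.

    Digits, spaces and punctuation are ignored.
    """
    for ch in name:
        if ch.isalpha() and "LATIN" not in unicodedata.name(ch, ""):
            return False
    return True


def share(part: int, total: int) -> float:
    """Percentage of total, rounded to one decimal place."""
    return round(100 * part / total, 1)
```

Applied to the counts above, `share(538, 538 + 186)` gives 74.3 and `share(186, 538 + 186)` gives 25.7, which agrees with the reported percentages.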

@Isaac please review for sign-off

Thank you - done!

"Template usage does not map neatly onto identified policy violations. The most commonly used templates are also the broadest, and as the table above shows, the broadest or most imprecise categories see the most use. Rather, templates seem to be used to flag that there is an issue, not necessarily why it is an issue."

Such an important takeaway, thank you! It points to our need to think more critically about what sorts of tasks these templates are useful for, and how to bridge the gap between tagged content and the desire to clearly explain to editors what the issue is and how to resolve it.

I'll also add something you've shared with me separately: not all language editions had examples in our dataset where a particular category of template was used (e.g., copyright-related templates only occurred on a few wikis). This is also an important area for future thought and study as we think about the ability of these templates to serve as a data source for training models to help detect issues and flag them to editors.