
[Suggestion/Signal] Surface information about readability
Open, Needs Triage, Public

Description

Prompted by an offline discussion with @Sucheta-Salgaonkar-WMF and @nayoub, and by an underlying need to make the content Wikipedia offers "...more readable and accessible, and thus easier to discover and learn from..." (source), this task involves introducing a signal/suggestion that would make people editing in VE (and maybe one day, people reading (T409589)) aware of the reading length and complexity of a given portion of text, and invite them to consider changes that would reduce both.

Note: were we to implement this as a suggestion, I could imagine a Recheck interaction similar to the one we implemented with Tone Check, which would enable someone to see, in real time, the extent to which the changes they've made are increasing the readability of the content they are editing.

Story

  • As someone editing in VE who enjoys making copy edit-style contributions, I'd value knowing when a given paragraph/section exceeds a reading time/complexity threshold established by Wikipedia volunteers so that I can more easily judge whether the content is likely to be accessible to a broad audience and adjust wording, structure, length, etc. to improve its readability.
  • As someone visiting Wikipedia to learn about a topic/person/concept/event/team/film/law/anatomy/etc., I want the content I’m reading to be written in a clear, approachable, and straightforward way so that I can understand the information being presented to me and minimize the amount of time I spend searching for an explanation that meets my learning needs.

Proofs of Concept

image.png (1×3 px, 1 MB)

References

  • en:Template:Prose + en:Use prose where understood easily via @SherryYang-WMF
  • Multilingual_readability_model_card
    • Per offline discussion with @MGerlach:
      • The model is available on LiftWing
      • An offline evaluation demonstrated the model can distinguish between simple and difficult versions of the same article across 14 languages (taken from simplified and children's encyclopedias), yielding accuracy >90% for many languages.
      • The model readability score for an article is highly correlated with the usage of templates indicating difficult-to-understand language such as {{Confusing}} (This article may be confusing or unclear to readers) or {{Technical}} (This article may be too technical for most readers to understand.) For an example, see this list of the top-1000 articles with the highest readability scores (i.e. most difficult to read) containing these templates.
      • Currently, the model only scores the text of an article's lead section, although a relatively modest amount of work could enable passing the model any arbitrary span of text (e.g. paragraphs).
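
To make the paragraph-level idea above more concrete, here is a minimal sketch of how a per-paragraph signal might be computed once a scorer can accept arbitrary spans of text. Everything in it is an assumption for illustration: the names (`readability_signals`, `ParagraphSignal`, `toy_score`), the 200 words-per-minute reading speed, and both thresholds are hypothetical, and `toy_score` is a crude stand-in (mean words per sentence), not the multilingual readability model or its LiftWing API. A real integration would presumably pass a function that calls the model as `score_fn`, with thresholds chosen by volunteers.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

WORDS_PER_MINUTE = 200  # assumption: placeholder reading speed, not from the task


@dataclass
class ParagraphSignal:
    index: int              # position of the paragraph in the text
    reading_seconds: float  # estimated reading time
    score: float            # value returned by the scorer
    flagged: bool           # True if either threshold is exceeded


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def toy_score(paragraph: str) -> float:
    """Stand-in scorer: mean words per sentence. NOT the readability model."""
    sentences = [s for s in re.split(r"[.!?]+", paragraph) if s.strip()]
    return len(paragraph.split()) / max(len(sentences), 1)


def readability_signals(
    text: str,
    score_fn: Callable[[str], float] = toy_score,
    score_threshold: float = 25.0,    # hypothetical complexity threshold
    seconds_threshold: float = 60.0,  # hypothetical reading-time threshold
) -> List[ParagraphSignal]:
    """Score each paragraph and flag those exceeding either threshold."""
    signals = []
    for i, para in enumerate(split_paragraphs(text)):
        seconds = len(para.split()) / WORDS_PER_MINUTE * 60
        score = score_fn(para)
        flagged = score > score_threshold or seconds > seconds_threshold
        signals.append(ParagraphSignal(i, seconds, score, flagged))
    return signals
```

A Recheck-style interaction could, under the same assumptions, call `readability_signals` on the text before and after an edit and compare the scores for the paragraph that changed.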

Event Timeline

ppelberg added a subscriber: NBaca-WMF.
ppelberg added a subscriber: Miriam.
ppelberg removed a subscriber: Aklapper.
ppelberg added a subscriber: derenrich.
ppelberg renamed this task from [Suggestion/Signal] Surface information about the reading time or reading level to [Suggestion/Signal] Surface information about readability. Dec 5 2025, 5:11 PM
ppelberg updated the task description.