
Deploy article quality labeling campaign for Portuguese Wikipedia
Closed, Duplicate (Public)

Description

How do Wikipedians label articles by their quality level?

Template:Marca de projeto is placed on article talk pages. It accepts parameters indicating the importance of the article for each WikiProject, plus a general quality rating for the article (independent of the WikiProject).

From what I remember, most articles were not evaluated by humans, so some years ago we created this Lua module:
https://pt.wikipedia.org/wiki/Module:Avalia%C3%A7%C3%A3o
It estimates the quality level of an article, and is used as a fallback when no user has specified the quality explicitly (and the article is not tagged as a good/featured article). The module follows some hardcoded rules based on a few features of the pages, but the thresholds for these rules were not chosen based on any statistics of the evaluated articles. E.g. the module does not return quality >=3 for articles with fewer than 12000 bytes, but this is just a "random" number, not some "average" of the human evaluations that existed before this automatic assessment method was created.
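To make the fallback logic above concrete, here is a minimal Python sketch. It is not a port of the Lua module: the function names, the `heuristic_estimate` input, and the order in which the human label and the good/featured tags are checked are all assumptions made for illustration. Only the 12000-byte cap and the "fallback only when nothing else is set" behaviour come from the description.

```python
from typing import Optional

# Threshold quoted in the description: below this size the module
# never returns quality >= 3.
MIN_BYTES_FOR_LEVEL_3 = 12000


def cap_by_size(heuristic_estimate: int, size_bytes: int) -> int:
    """Apply the size rule to an estimate (hypothetical helper)."""
    if size_bytes < MIN_BYTES_FOR_LEVEL_3:
        return min(heuristic_estimate, 2)
    return heuristic_estimate


def effective_quality(
    human_label: Optional[int],
    is_good: bool,
    is_featured: bool,
    heuristic_estimate: int,
    size_bytes: int,
) -> int:
    """Pick the quality level to display for an article.

    Assumption: featured/good tags are checked first, then the human
    label; the description does not state which of these wins.
    """
    if is_featured:
        return 6
    if is_good:
        return 5
    if human_label is not None:
        return human_label
    # Fallback: automatic estimate, capped by the size rule.
    return cap_by_size(heuristic_estimate, size_bytes)
```

For example, `effective_quality(None, False, False, 4, 8000)` falls through to the automatic estimate and is capped at 2, while a human label of 4 on the same short article is returned unchanged.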

Also, we have a tool on Labs that gives us general statistics about the articles of a given WikiProject. E.g. for Mathematics, it produces this table:
http://tools.wmflabs.org/ptwikis/Matriz:matem%C3%A1tica

Also, @Danilo had some Python code for dealing with the templates:
https://pt.wikipedia.org/wiki/Usu%C3%A1rio(a):Danilo.bot/marcas.py#C.C3.B3digo
Maybe that can help to understand/parse the parameters...

What levels are there and what processes do they follow when labeling articles for quality?

Portuguese Wikipedia uses 6 levels to classify articles/lists by quality:

  • 1: lowest level
  • 2: ...
  • 3: ...
  • 4: ...
  • 5 or AB: good article, requires voting by community (documented at WP:EAD)
  • 6 or AD: featured article (highest level), requires voting by community

There are also 4 levels of importance: 1, 2, 3 and 4 (highest).
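Since levels 5 and 6 can also be written as AB and AD, anything that consumes these labels probably needs to normalize them first. The following Python sketch shows one way to do that. The function name and the choice to return `None` for unrecognized values are assumptions, and it does not know how the template actually names its parameters.

```python
from typing import Optional

# Aliases from the scale above: AB = good article, AD = featured article.
QUALITY_ALIASES = {"AB": 5, "AD": 6}
NUMERIC_LEVELS = {"1", "2", "3", "4", "5", "6"}


def normalize_quality(raw: str) -> Optional[int]:
    """Map a raw quality value to an integer level 1-6.

    Returns None for empty or unrecognized values (an assumption; a
    caller might prefer to raise instead).
    """
    value = raw.strip().upper()
    if value in QUALITY_ALIASES:
        return QUALITY_ALIASES[value]
    if value in NUMERIC_LEVELS:
        return int(value)
    return None
```

With this, `normalize_quality("ab")` and `normalize_quality("5")` both give 5, and a value outside the scale such as `"7"` gives `None`.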

How do InfoBoxes work? Are they used like on English Wikipedia?

  • Most infoboxes use Template:Info (see its subpages)
  • As far as I know, our infoboxes are still manual, and do not show data from Wikidata yet.

Are there "citation needed" templates? How do they work?

For inline notes about facts needing citations, we have

and also

for larger banners on top of pages/sections.
There are a few other variations in Category:!Predefinições sobre fontes em falta, in case that is relevant.

Event Timeline

Halfak triaged this task as Medium priority. Jul 5 2016, 2:29 PM
He7d3r renamed this task from Article quality models for ptwiki to Train/deploy article quality models for ptwiki. Oct 21 2016, 7:04 PM
He7d3r updated the task description.
Halfak renamed this task from Train/deploy article quality models for ptwiki to Deploy article quality labeling campaign for Portuguese Wikipedia. Nov 3 2016, 2:09 PM
Halfak edited projects, added articlequality-modeling; removed editquality-modeling.
Halfak subscribed.

Took a look at this and did some cleanup. I think the best next step is looking into how many articles get "ab" and "ad" labels. If there are enough, we can probably just ask editors to label articles that are beneath that quality level.

(...) looking into how many articles get "ab" and "ad" labels. If there are enough, (...)

I'm not sure I understood. Could you clarify? "get" how/when? (e.g. how many articles got these labels so far, manually? how many would get these labels from a model? or something else?)

Hey! Sorry for the late response. I've been a bit overloaded recently. So, when I say "get", I mean "how many articles got these labels so far, manually?".
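Counting the manually assigned labels could look roughly like the Python sketch below. It assumes the raw quality values have already been extracted from the talk-page templates into a list of strings; that extraction step is not shown, and the sample list is made up purely to exercise the function.

```python
from collections import Counter
from typing import Iterable


def count_manual_labels(raw_labels: Iterable[str]) -> Counter:
    """Count quality labels case-insensitively, skipping empty values."""
    return Counter(
        label.strip().lower() for label in raw_labels if label.strip()
    )


# Made-up sample values, not real ptwiki data.
sample = ["ab", "AD", "3", "", "ab", "1"]
counts = count_manual_labels(sample)
print(counts["ab"], counts["ad"])
```

The `counts["ab"]` and `counts["ad"]` totals are the numbers the comment above asks about; whether they are "enough" is a judgment the sketch does not make.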