Change the Article::isCountable() check to a more flexible and reliable method
Closed, Resolved (Public)

Description

A good article is currently defined (disregarding the optional, weird comma rule) as:

good article = page in ns 0 AND not redirect AND not dead end

However, Article::isCountable() checks for the presence of a literal '[[' in the source text, so it also counts pages that have no links but contain:

  1. [[File:Foo]]
  2. [[Category:Foo]]
  3. <nowiki>[[</nowiki>
  4. even <!-- [[ -->

On the other hand, it does not count any page that transcludes templates which generate links (so the page is no longer a dead end).
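The mismatch can be illustrated with a minimal Python sketch. It is not the MediaWiki code: `naive_is_countable` is a hypothetical stand-in that only mimics the literal '[[' test described above, and the sample texts and template name are made up.

```python
def naive_is_countable(wikitext: str) -> bool:
    """Hypothetical stand-in: true if the source contains a literal '[['."""
    return "[[" in wikitext


# Made-up page sources; none of the first four contains a link to another page.
samples = {
    "file only": "Some text. [[File:Foo]]",
    "category only": "Some text. [[Category:Foo]]",
    "nowiki": "Some text. <nowiki>[[</nowiki>",
    "comment": "Some text. <!-- [[ -->",
    # Assumed to expand to a real link; the raw source has no '[['.
    "template only": "{{Hypothetical link template}}",
}

results = {label: naive_is_countable(text) for label, text in samples.items()}
for label, counted in results.items():
    print(f"{label}: counted={counted}")
```

The first four samples are counted although they link nowhere, while the template-only page is not counted although its rendered output would contain a link.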

Because of that, many projects, notably the Wikisources, Wiktionaries and Wikibooks, are "hacking" their wikitext with methods 2-4 described above to get the affected pages counted.

This method is completely useless for two reasons. The first is described above. The second is that there is no efficient way to update the counter with maintenance scripts: to stay consistent with the method used in Article::isCountable(), as they apparently should, they would have to load the page texts.

So the check should rely purely on registered links. The counters could then be conveniently re-checked (re-counted) at any time.
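A rough sketch of what a link-based recount could look like, using an in-memory SQLite database as a stand-in. The tables are heavily simplified versions of MediaWiki's page and pagelinks tables (only the columns needed here), the rows are invented, and `count_good_articles` is a hypothetical helper, not an existing maintenance script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins for the page and pagelinks tables (assumption:
# only the columns needed for this check are modelled).
conn.executescript("""
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY,
    page_namespace INTEGER NOT NULL,
    page_is_redirect INTEGER NOT NULL
);
CREATE TABLE pagelinks (
    pl_from INTEGER NOT NULL
);
""")

# Invented sample data.
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (1, 0, 0),  # ns 0 article with registered links
    (2, 0, 0),  # ns 0 article without links (dead end)
    (3, 0, 1),  # redirect
    (4, 1, 0),  # page outside ns 0 that has a link
])
conn.executemany("INSERT INTO pagelinks VALUES (?)", [(1,), (1,), (3,), (4,)])


def count_good_articles(db: sqlite3.Connection) -> int:
    """Count pages in ns 0 that are not redirects and not dead ends."""
    row = db.execute("""
        SELECT COUNT(*) FROM page
        WHERE page.page_namespace = 0
          AND page.page_is_redirect = 0
          AND EXISTS (
              SELECT 1 FROM pagelinks WHERE pagelinks.pl_from = page.page_id
          )
    """).fetchone()
    return row[0]


good_articles = count_good_articles(conn)
print(good_articles)
```

No page text is loaded at any point, which is what would make a periodic recount cheap.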

The second issue is the optional, weird "comma rule". Like the previous method, it relies on page texts; unlike the previous method, it cannot be replaced by a check against other tables (such as pagelinks in the previous case) and always has to work with the texts.

It has been discussed in various places several times that this method is even more useless and less reliable than the check for the presence of links: it is possible to write quite a long article without any comma at all, while e.g. an article "Pi" saying only "3,14" would be counted, not to mention languages which do not use the comma. It should therefore be removed completely (yay! another worthless and useless config variable gone!).
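The three counterexamples can be sketched in a few lines of Python. `comma_rule` is a hypothetical stand-in that assumes the rule tests for an ASCII comma anywhere in the text; the sample texts are invented.

```python
def comma_rule(wikitext: str) -> bool:
    """Hypothetical stand-in: true if the text contains an ASCII comma."""
    return "," in wikitext


# A long text with no comma at all.
long_article = " ".join(["This sentence has no comma at all."] * 200)

# A stub whose only content is a decimal number written with a comma.
pi_stub = "3,14"

# Japanese uses the ideographic comma U+3001, not the ASCII comma.
japanese_article = "東京は日本の首都であり、最大の都市である。"

checks = {
    "long article": comma_rule(long_article),
    "pi stub": comma_rule(pi_stub),
    "japanese article": comma_rule(japanese_article),
}
for label, counted in checks.items():
    print(f"{label}: counted={counted}")
```

Under this assumption only the four-character stub would be counted.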


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=11868

Details

Reference
bz24754

Event Timeline

bzimport raised the priority of this task from to Low. (Nov 21 2014, 11:07 PM)
bzimport set Reference to bz24754.
bzimport added a subscriber: Unknown Object (MLST).

Perhaps this should be discussed first: you're actually proposing a new method of counting articles which is more restrictive than (or at any rate different from) the current one. Articles which include a file or a category should arguably be counted as "good"; perhaps articles which just include some template shouldn't.
And bug 26033 proposes yet another method.

(In reply to comment #0)

It has been discussed on random places several times that this method is even
more useless and less reliable than check for presence of links (because it's
possible to write quite long article without any comma at all as well as e.g.
article "Pi" saying only "3,14", not even speaking about languages which do not
use comma) and thus it should be removed completely (yay! another worthless and
useless config variable away!).

It's still there just for backward compatibility; what's the problem?