
Countable content namespaces evaluation (tracking)
Closed, ResolvedPublic

Description

+++ This bug was initially created as a clone of Bug #39866 +++

Erik Zachte gave in bug #39866 comment #12 the following status:

Here is an up-to-date list of countable namespaces, collected via the API.
http://stats.wikimedia.org/wikimedia/misc/StatisticsContentNamespaces.csv
Mentioned above but not yet effective:
wp:hr:102
wp:fr:104
wp:lt:104
wp:als:104
Suggested in
http://infodisiac.com/blog/2012/08/growth-in-article-count-at-largest-20-wikipedias/
but not yet effective:
wp:ru:102

I'm opening a new tracking bug from this comment.

A new bug per wiki should be created if the local community wishes to act on this (for example, I don't think fr:Reference: should be a content namespace).


Version: unspecified
Severity: enhancement
URL: http://infodisiac.com/blog/2012/08/growth-in-article-count-at-largest-20-wikipedias/

Details

Reference
bz40423

Event Timeline

bzimport raised the priority of this task to Normal. Nov 22 2014, 1:04 AM
bzimport set Reference to bz40423.
bzimport added a subscriber: Unknown Object (MLST).

For fr., the community seems to *currently* consider Reference: as a metadata namespace, a collection of bibliographic notices.

This is highlighted in the following informal community consultation:
https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Sondage/Statut_de_l%27espace_de_nom_R%C3%A9f%C3%A9rence


Still to process:

  • hr:Dodatak (102)
  • lt:Sąrašas (104)
  • als:Wort (104)
  • ru:Инкубатор (102)

Erik, are you willing to open such a consultation on hr., lt., and als., or would you like me to find local people to do so?

I had posted an analysis on the blog post as comment, did it get lost?
The situation on most wikis was pretty clear after a little (aka 2h?) investigation, but of course I can't remember anything...

(In reply to comment #2)

> I had posted an analysis on the blog post as comment, did it get lost?
> The situation on most wikis was pretty clear after a little (aka 2h?)
> investigation, but of course I can't remember anything...

Could you check whether it was on bug 39866? I cloned it into this tracking bug.

erikzachte wrote:

@Dereckson I'd prefer to just follow community consensus. The idea is that Wikistats will pick up changes automatically via the API. I wonder why this is on a per-wiki basis. Doing this one wiki at a time is a long road ahead. BTW, I can refresh the CSV list whenever needed.

The issue is that it's each local project's responsibility to determine whether or not a namespace is a content one.

So we have to ask these projects for their thoughts on the matter.

(In reply to comment #2)

> I had posted an analysis on the blog post as comment, did it get lost?
> The situation on most wikis was pretty clear after a little (aka 2h?)
> investigation, but of course I can't remember anything...

Oh, the *blog* post. Erik: you seem to have an issue with your blog's comment system.

erikzachte wrote:

@Nemo_bis, I found your blog comment in the spam folder. Sorry for that. It is now online.

(In reply to comment #7)

> Still to process:
>
>   • lt:Sąrašas (104)

This contains mostly date-related lists; it could perhaps be considered content if the wiki wants (doesn't affect stats much).

>   • als:Wort (104)

als.wiki has several namespaces like "Text" etc. which seem to contain a Wiktionary and a Wikisource; they could be considered content and this wouldn't change stats much, but then the comparison to other "pure" Wikipedias would be quite unfair, plus the wiki would become more confusing to use. If they want to consider it content, perhaps it could be accepted.

We could notify those two wikis so that they can request it. In general, it looks like using content namespaces (as defined in the config) for the stats everywhere would be correct and would not cause problems.

The hr. community has confirmed that namespace 102 is to be considered a content namespace; see bug 40732 for the config change.

Instead of using the API for 800+ wikis, it may be simpler to query wgContentNamespaces from InitialiseSettings.php.

Scripts have been updated (some time ago) to query the API for content namespaces.
Scripts have been updated recently to use this info during dump processing.
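
As an illustration of what querying the API for content namespaces can look like, here is a minimal Python sketch. It assumes the MediaWiki `action=query&meta=siteinfo&siprop=namespaces` response, in which content namespaces carry a `content` flag. The function name and the sample data are hypothetical, and this is not the actual Wikistats code.

```python
def content_namespace_ids(siteinfo):
    """Return the sorted IDs of namespaces flagged as content.

    `siteinfo` is the decoded JSON of a siteinfo namespaces query.
    The legacy JSON format marks content namespaces with "content": "",
    while formatversion=2 uses "content": true; both are handled here.
    """
    namespaces = siteinfo["query"]["namespaces"]
    ids = []
    for ns in namespaces.values():
        flag = ns.get("content", False)
        if flag == "" or flag is True:
            ids.append(int(ns["id"]))
    return sorted(ids)


# Hypothetical, trimmed response for a wiki where 0 and 102 are content.
sample = {
    "query": {
        "namespaces": {
            "0": {"id": 0, "content": ""},
            "1": {"id": 1},
            "102": {"id": 102, "content": ""},
        }
    }
}
print(content_namespace_ids(sample))
```

Fetching the response over HTTP is left out on purpose, so the sketch stays independent of any particular client library.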

A few namespaces which were always considered content namespaces are added on top of this, even when some are not (yet) returned by the API:
strategy: 106; commons: 6 and 14; wikisource: 102, 104, 106
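
The layering described above (API results plus a fixed set of always-content namespaces) could be sketched as follows. The dictionary keys and the function name are hypothetical labels chosen for illustration; only the namespace numbers come from the comment.

```python
# Namespaces always treated as content, per the comment above.
# The keys are illustrative labels, not real project identifiers.
ALWAYS_CONTENT = {
    "strategy": {106},
    "commons": {6, 14},
    "wikisource": {102, 104, 106},
}


def countable_namespaces(project, api_content_ids):
    """Union the API-reported content namespaces with the fixed additions."""
    extra = ALWAYS_CONTENT.get(project, set())
    return sorted(set(api_content_ids) | extra)


print(countable_namespaces("commons", [0]))
print(countable_namespaces("wikipedia", [0, 102]))
```

Using a set union means a namespace is counted once even if the API later starts reporting it as content.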

I plan to cut over to the new scheme on July 1, at the start of the new administrative year.

Very nice, waiting some more weeks is no big deal.

If the Commons and Wikisource namespaces are not all correctly flagged as content ns, a bug should be filed for each.

Aklapper added a subscriber: Aklapper.

[adding the Tracking-Neverending project to tasks blocking (now deprecated) T4007 as part of T93366]