
Offer map and queries by subdivisions of countries
Open, Medium, Public, Feature

Description

Need: As a curious Wikipedian interested in the use of languages online, I would like Wikimedia Statistics (and the Wikimedia REST API) to display results by "big regions" instead of countries only, so that I can have more granularity in my analyses and identify the most used edition by region.

More on this:

Currently, Wikimedia Statistics offers data by "country" (more exactly, by ISO 3166-1 alpha-2 code, so some jurisdictions that are not countries have their own code). Using this feature (and some code), I updated existing maps showing the most used edition of Wikipedia. Here's the result. It's now used on several Wikipedias. (If you're curious, I wrote an article about this.)
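
As a rough illustration of the "some code" step, the most used edition per country can be derived by keeping, for each country code, the project with the highest view count. This is only a sketch, not the script used for the map: the function name `most_used_edition`, the row layout, the `XX`/`YY` country codes, and the view counts are all hypothetical placeholders.

```python
def most_used_edition(rows):
    """Return {country_code: project} for the project with the most views.

    `rows` is an iterable of (project, country_code, views) tuples.
    This layout is an assumption, not the format returned by any real API.
    """
    best = {}
    for project, country, views in rows:
        # Keep the row only if it beats the best count seen so far.
        if country not in best or views > best[country][1]:
            best[country] = (project, views)
    return {country: project for country, (project, _) in best.items()}


# Placeholder rows: "XX" and "YY" are made-up codes, the counts are not real.
sample_rows = [
    ("en.wikipedia", "XX", 100),
    ("hi.wikipedia", "XX", 40),
    ("bn.wikipedia", "YY", 60),
    ("en.wikipedia", "YY", 50),
]
result = most_used_edition(sample_rows)
```

The same function would work unchanged on finer-grained data if the second field held a subdivision code instead of a country code.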

Tomasz Kamusella, a researcher whose focus is the use of languages in cyberspace, said that this map lacked granularity:

  • "The picture of the situation could be delivered in an improved 'resolution' if India's regional states and China's or Russia's autonomous republics could be treated as separate entities. Otherwise, we have a unit on the map for Malta with 0.4m inhabitants, but not for West Bengal with 180m inhabitants..."
  • "granularity in data presentation is a problem when only states are employed as a standard unit for this purpose. Hence, it is important to think about developing regional/state maps for such regions/states (South Asia/India), if they contain a quarter of the world's population. Otherwise, users can see exact and finely tuned data on Belize, Malta or Slovenia, but not on West Bengal within India."

I agree with Dr. Kamusella: millions of users in India, Pakistan, Bangladesh (and elsewhere), and their languages, are underrepresented in the current version of Wikimedia Statistics, and in the resulting map(s). The languages spoken in this region are also growing quickly, so it's important to have an accurate picture of their use.

One solution could be to use ISO 3166-2 codes on the API. Actually, because Wikimedia Statistics uses ISO 3166-1, some territories that are also listed as ISO 3166-2 subdivisions of another country are already supported (Svalbard, Aruba, Puerto Rico, French Polynesia, etc.).
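
To make the idea concrete: an ISO 3166-2 code consists of the parent country's ISO 3166-1 alpha-2 code, a hyphen, and a subdivision part of up to three alphanumeric characters (for example `IN-WB` for West Bengal or `US-PR` for Puerto Rico). The sketch below shows how subdivision-level rows could be rolled back up to the existing country level. The helper names and the view counts are hypothetical, and nothing here reflects how the Wikimedia API is actually implemented.

```python
import re
from collections import Counter

# Alpha-2 country prefix, hyphen, then 1 to 3 alphanumeric characters.
ISO_3166_2 = re.compile(r"([A-Z]{2})-([A-Z0-9]{1,3})")


def split_subdivision(code):
    """Split an ISO 3166-2 code into (country, subdivision) parts."""
    match = ISO_3166_2.fullmatch(code)
    if match is None:
        raise ValueError(f"not an ISO 3166-2 code: {code!r}")
    return match.group(1), match.group(2)


def roll_up(views_by_subdivision):
    """Sum subdivision-level view counts into per-country totals."""
    totals = Counter()
    for code, views in views_by_subdivision.items():
        country, _ = split_subdivision(code)
        totals[country] += views
    return dict(totals)


# Placeholder counts, not real statistics.
placeholder_views = {"IN-WB": 30, "IN-KL": 20, "US-PR": 5}
country_totals = roll_up(placeholder_views)
```

Because the country prefix is part of every subdivision code, subdivision-level results stay compatible with the current country-level queries.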

This feature could first be introduced only for "big countries" with well-established subdivisions, where it may be easier:

  • US states,
  • Canadian provinces,
  • Russian republics,
  • Indian states,
  • Chinese provinces (even though the ban on Wikipedia may render this feature useless there).

(FYI: I initially mentioned the idea of such a feature request here: T257071 ).


Event Timeline

A455bcd9 renamed this task from Map and queries by subdivisions of countries to Offer map and queries by subdivisions of countries. Jun 4 2021, 11:29 AM
A455bcd9 updated the task description.
A455bcd9 updated the task description.
Aklapper changed the subtype of this task from "Task" to "Feature Request".

Hi @A455bcd9, thanks for taking the time to report this. I assume this is about stats.wikimedia.org (please follow https://www.mediawiki.org/wiki/How_to_report_a_bug when creating tickets).

A455bcd9 updated the task description.

Hi @Aklapper, yes, I've just added a link to Wikimedia Statistics in my post to make it clearer (and I mentioned the Wikimedia REST API as well). I tried to follow this page when creating the ticket. Could you please tell me what I forgot? For feature requests, this page only mentions that it should be a user story.

It wasn't initially clear where to expect this; there is now a link to stats.wikimedia.org in the description. Thanks!

Ottomata triaged this task as Medium priority. Jun 7 2021, 3:30 PM
Ottomata moved this task from Incoming to Wikistats on the Analytics board.