
Wikistats 2.0: "aa.wikipedia.org" exists and has data available, but marked "Invalid"
Closed, Resolved · Public · 5 Estimated Story Points

Description

On https://stats.wikimedia.org/v2/ it seems one cannot look up aa.wikipedia.org through the UI. Typing it anyway seems to force use of the first suggestion, which shows data from the unrelated kaa.wikipedia.org domain.

I tried working around it by manually changing the URL and reloading. That does work at https://stats.wikimedia.org/v2/#/aa.wikipedia.org and produces the correct data from the backends, but the interface marks the input field with "INVALID".

There are a few other (closed) wikis that exist, have page view data available, but are not suggested in the interface, including https://za.wiktionary.org (zawiktionary).

On the other hand, there are some closed projects that are listed in Wikistats 2.0 but do not actually have pageview data at the moment. One example is https://stats.wikimedia.org/v2/#/usability.wikimedia.org, which is suggested as a valid target but results in "Something went wrong" being displayed instead of graphs. (For this wiki, it seems RESTBase/metrics has no data; not sure why.)

Event Timeline

Restricted Application added a subscriber: Aklapper.
Krinkle updated the task description.

This wiki needs to be added to the sqoop list for it to appear as valid.

@fdans Where is this list maintained, and what is the process for updating it? Given this is not a new wiki, it suggests that this list is intentionally a subset of the actual wikis we have. If so, what is the reason for that?

fdans triaged this task as Medium priority. · Mar 26 2018, 4:40 PM
fdans moved this task from Incoming to Smart Tools for Better Data on the Analytics board.

@Krinkle sqooping is handled differently depending on wiki size, so it requires a whitelist to manage it. See a similar addition: https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/445445/

Nuria raised the priority of this task from Medium to High.
Nuria added a project: Analytics-Kanban.
Nuria moved this task from Smart Tools for Better Data to Wikistats Beta on the Analytics board.

@Krinkle: usability.wikimedia.org does not have pageviews because, I suspect, we have a bug on our end and we are not parsing this domain. If we were, we should have received an alert, as we do for every domain for which we see pageviews but do not process them. This is the list of wikis for which we process pageviews (we need a whitelist because we do not want to report data for wikis that are private): https://github.com/wikimedia/analytics-refinery/blob/master/static_data/pageview/whitelist/whitelist.tsv

Change 455894 had a related patch set uploaded (by Fdans; owner: Fdans):
[analytics/refinery@master] Add usability.wikimedia to pageview whitelist

https://gerrit.wikimedia.org/r/455894

Change 455894 merged by Nuria:
[analytics/refinery@master] Add usability.wikimedia to pageview whitelist

https://gerrit.wikimedia.org/r/455894

@Nuria I've been checking sites listed under "special" in the sitematrix that aren't private. As far as I can tell, the sites that are slipping through the cracks are:

  • usability.wikimedia.org
  • strategy.wikimedia.org
  • wikimania.wikimedia.org

Plus all the yearly wikimania sites. Do we want to add these? In which case they all follow the same url structure, so it would be easy to integrate them in the regex.
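To illustrate the point about a shared URL structure, here is a minimal sketch of a hostname pattern covering the three sites listed above plus the yearly Wikimania sites. It is not the regex used in analytics/refinery; the names `SPECIAL_SITE_RE` and `is_special_site` are hypothetical, and the `wikimaniaYYYY.wikimedia.org` form for the yearly sites is an assumption.

```python
import re

# Hypothetical pattern for illustration only; not the refinery pageview regex.
# Assumption: yearly Wikimania sites look like wikimania2018.wikimedia.org.
SPECIAL_SITE_RE = re.compile(
    r"(usability|strategy|wikimania(\d{4})?)\.wikimedia\.org"
)


def is_special_site(hostname: str) -> bool:
    """Return True if hostname is one of the special sites named above."""
    # fullmatch anchors both ends, so suffixes such as ".example.com" fail.
    return SPECIAL_SITE_RE.fullmatch(hostname.lower()) is not None
```

Under these assumptions, `is_special_site("wikimania2018.wikimedia.org")` would be accepted, while `aa.wikipedia.org` would not.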

In the case of aa.wikipedia.org, we should probably change the "INVALID" label to "CLOSED". Will open a separate task for this.

@fdans: we should probably open tickets to refactor all the regexes, but we can do that at a later time. Since we have never had any requests for the wikimania sites, let's add the other two. Thanks for going through the list.

Change 456022 had a related patch set uploaded (by Fdans; owner: Fdans):
[analytics/refinery/source@master] Fix strategy and usability sites not being counted as pageviews

https://gerrit.wikimedia.org/r/456022

Change 456022 merged by Fdans:
[analytics/refinery/source@master] Add strategy, usability and advisory sites to pageview definition

https://gerrit.wikimedia.org/r/456022

usability data is still not showing on wikistats

I will leave this ticket open until next month. We have started computing pageviews for usability.wikimedia, and you can see there are pageviews available for the daily range here: https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate/strategy.wikimedia.org/all-access/user/daily/2018090100/2018091400
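For anyone who wants to check other projects the same way, here is a small sketch that assembles a URL of the same shape as the one above. The helper name `pageviews_url` and its default arguments are hypothetical; the path segments are simply copied from that link, and the sketch builds the URL without sending a request.

```python
# Base path copied from the link in the comment above.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate"


def pageviews_url(project, start, end,
                  access="all-access", agent="user", granularity="daily"):
    """Build an aggregate pageviews URL; start and end are YYYYMMDDHH strings."""
    return "/".join([BASE, project, access, agent, granularity, start, end])


url = pageviews_url("strategy.wikimedia.org", "2018090100", "2018091400")
print(url)
```

Swapping in another project name, such as `usability.wikimedia.org`, gives the equivalent link for that site.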

However, there is not yet one full month of pageviews and, until there is, the Wikistats UI (which aggregates monthly data for both pageviews and unique devices) will not work.
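As a rough illustration of the "full month" condition, the sketch below lists the calendar months that fall entirely inside a daily range. It is not how Wikistats computes its monthly aggregates; the function names are hypothetical, and it assumes both end days of the range are inclusive.

```python
from datetime import date, timedelta


def parse_day(stamp: str) -> date:
    """Parse a YYYYMMDDHH timestamp, ignoring the hour."""
    return date(int(stamp[0:4]), int(stamp[4:6]), int(stamp[6:8]))


def full_months(start: str, end: str) -> list[str]:
    """List calendar months (YYYY-MM) lying entirely inside [start, end]."""
    first, last = parse_day(start), parse_day(end)
    months = []
    # Begin at the first month that starts on or after the first day.
    year, month = first.year, first.month
    if first.day != 1:
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    while True:
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        # Last day of the current month is the day before the next month starts.
        month_end = date(next_year, next_month, 1) - timedelta(days=1)
        if month_end > last:
            break
        months.append(f"{year:04d}-{month:02d}")
        year, month = next_year, next_month
    return months


print(full_months("2018090100", "2018091400"))
```

For the range in the link above (September 1 to 14, 2018) the list is empty, which matches the statement that a full month of data is not yet available.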

mforns set the point value for this task to 5. · Oct 8 2018, 4:03 PM

za.wiktionary.org, usability.wikimedia.org and aa.wikipedia.org are now selectable options in the drop-down menu, and there is pageview data for the last month: https://stats.wikimedia.org/v2/#/usability.wikimedia.org/reading/total-page-views/normal|bar|3-Month|~total

https://stats.wikimedia.org/v2/#/za.wiktionary.org