
Most visited domains (pageviews) across all Wikipedia/Wikimedia
Closed, Resolved (Public)



I am looking for this data: the top X (ten or fifteen) most visited domains (pageviews) from all devices (desktop/mobile) across all Wikipedia/Wikimedia domains. So that means across all editions of Wikipedia, Wikidata, Wikimedia, Wikisource ...

What I have tried: The closest I could find was the Pageviews_data endpoint get_metrics_pageviews_aggregate_project_access_agent_granularity_start_end, with which I can generate this data per project and then combine it across projects to get the results I want. However, it seems to be missing a few domains; for example, when I query for certain domains, it tells me:

The date(s) you used are valid, but we either do not have data for those date(s), or the project you asked for is not loaded yet. Please check for more information.

Note that I am not looking for individual articles but just domains (and subdomains).

Please let me know if more information is required. Thank you for your help.
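The "query per project, then combine" approach described above could be sketched roughly as follows. This is only an illustration under assumptions: the URL template and the response shape (an "items" list whose entries carry "project", "access" and "views") are my reading of the public aggregate pageviews endpoint and should be checked against its documentation. The helper names `aggregate_url` and `top_domains` are hypothetical.

```python
from collections import Counter

# Assumed base URL of the aggregate pageviews route; verify against the API docs.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/aggregate"


def aggregate_url(project, access, agent, granularity, start, end):
    """Build the request URL for one project/access combination (hypothetical helper)."""
    return f"{BASE}/{project}/{access}/{agent}/{granularity}/{start}/{end}"


def top_domains(responses, n=15):
    """Sum views per (project, access) over decoded JSON bodies and rank them.

    Bodies without an "items" list (for example an error payload)
    contribute nothing to the totals.
    """
    totals = Counter()
    for body in responses:
        for item in body.get("items", []):
            totals[(item["project"], item["access"])] += item["views"]
    # most_common returns the n largest totals, largest first
    return totals.most_common(n)
```

Fetching is deliberately left out: each URL from `aggregate_url` would be requested with whatever HTTP client is at hand, and the decoded bodies passed to `top_domains`. Keeping the access method in the key matters here, because the desktop and mobile sites of a project count as separate domains.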

Event Timeline

ssingh triaged this task as Normal priority. Feb 15 2019, 5:20 PM
ssingh created this task.
Tbayer added a comment. Edited Feb 15 2019, 11:36 PM

However, it seems to be missing a few domains, like when I query for [...]

As mentioned (admittedly somewhat obliquely) on the documentation page linked in my email, the pageview data is limited to "production sites", which currently does not include the two domains you mention. There is some traffic data for both domains in other places, but we can be pretty certain already that neither of them is in the top 15 domains by pageviews, so it's probably not worth retrieving numbers for them for this purpose.

Here is a first result: the top 15 by pageviews for January 2019, with known bots/spiders excluded. (To get the domain, combine project and access method: e.g. "it.wikipedia" with "mobile web" means the mobile site of the Italian Wikipedia, and "en.wikipedia" with "desktop" means the desktop site of the English Wikipedia.)

project        access method   views
en.wikipedia   mobile web      4455807733
ja.wikipedia   mobile web      732477575
es.wikipedia   mobile web      619856511
de.wikipedia   mobile web      514878297
ru.wikipedia   mobile web      446499662
fr.wikipedia   mobile web      421427313
it.wikipedia   mobile web      407834206
pt.wikipedia   mobile web      202435769

Data via

SELECT project, access_method, SUM(view_count) AS views
FROM wmf.projectview_hourly
WHERE year = 2019 AND month = 1
AND agent_type = 'user'
GROUP BY project, access_method
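
The query as posted has no ORDER BY or LIMIT, so the ranking into a top list presumably happened in a later step. A minimal sketch of that step, assuming the result rows are available as (project, access_method, views) tuples; the function name `top_rows` is hypothetical, and the sample rows are taken from the figures quoted above:

```python
def top_rows(rows, n=15):
    """Return the n rows with the highest view counts, largest first."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


# A few of the rows quoted above, deliberately out of order
rows = [
    ("pt.wikipedia", "mobile web", 202435769),
    ("en.wikipedia", "mobile web", 4455807733),
    ("de.wikipedia", "mobile web", 514878297),
    ("ja.wikipedia", "mobile web", 732477575),
]

for project, access_method, views in top_rows(rows, n=3):
    print(f"{project}\t{access_method}\t{views}")
```

The same effect could likely be had inside the query by ordering on the summed views and limiting the result, which avoids pulling every project and access method combination out of the table first.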
Tbayer moved this task from Triage to Doing on the Product-Analytics board. Feb 15 2019, 11:52 PM

Thank you for this data! I think we can close this ticket, or should we leave it open for automating this (which will come later)?

Tbayer closed this task as Resolved. Feb 16 2019, 12:15 AM

Yes, that should be a separate task (and may require involvement from other teams).