
[Task] Add Special:AboutTopic view stats to grafana for ArticlePlaceholder
Closed, Resolved · Public · 13 Story Points

Description

We would like to have view stats for ArticlePlaceholder (Special:AboutTopic/Q* and localized versions) in Grafana.

To get a better idea of usage patterns, it would be useful to see data for a single wiki. It would also be useful to differentiate between bots/spiders and real users (especially for the future, where we plan to add placeholders to search engines).

Event Timeline

hoo created this task. Jun 23 2016, 3:05 PM
Restricted Application added subscribers: Zppix, Aklapper. Jun 23 2016, 3:05 PM
hoo updated the task description. Jun 23 2016, 3:05 PM

Change 295896 had a related patch set uploaded (by Addshore):
Add WikidataArticlePlaceholderMetrics

https://gerrit.wikimedia.org/r/295896

Lydia_Pintscher triaged this task as High priority. Jun 24 2016, 2:50 PM
Lydia_Pintscher moved this task from Incoming to Doing on the WMDE-Analytics-Engineering board.
Lydia_Pintscher added a subscriber: ChrisPins.
Lucie moved this task from Incoming to Review on the ArticlePlaceholder board. Jun 24 2016, 2:56 PM

@Lydia_Pintscher how exactly do you want the data split? Per wiki of course, but how about in regards to bots, spiders and users? Do you just want one number combining bots & spiders and then a second number for users? Or do we not actually care about bots & spiders at all, in which case could we just have one number for real users?

Addshore moved this task from incoming to ready to go on the Wikidata board. Jun 26 2016, 2:58 PM

I think the split would be good to have so we can see if there is anything strange going on with spiders as well. But the most important number is actual users.

Izno added a subscriber: Izno. Jun 27 2016, 12:41 PM

I may be really dumb asking this question (and perhaps in the context of this task), but why is Special:AboutTopic/* indexed?

I may be really dumb asking this question (and perhaps in the context of this task), but why is Special:AboutTopic/* indexed?

How do you mean indexed?

Izno added a comment. Jun 27 2016, 1:14 PM

How do you mean indexed?

"how about in regards to bots spiders and user"

Good-faith spiders that obey NOINDEX won't be showing up on this page, so my presumption, given the question, is that NOINDEX is not being set on Special:AboutTopic. Is that true, and why (not)?

It doesn't look like Special:AboutTopic is in the robots file.
That question is probably best directed at @Lydia_Pintscher :)

The pages should be indexed by search engines. That is one of the biggest things we need to do with it in order to reach more readers for these small Wikipedias and then turn some of them into editors.

Addshore renamed this task from [Task] Add ArticlePlaceholder view stats to grafana to [Task] Add Specia:AboutTopic view stats to grafana for ArticlePlaceholder. Jun 28 2016, 12:42 PM

Change 295896 merged by Joal:
Add WikidataArticlePlaceholderMetrics

https://gerrit.wikimedia.org/r/295896

hoo added a comment. Jul 6 2016, 1:10 PM

@Addshore: What's the status here? I can't see any (useful, non-testing) data in Graphite.

Addshore renamed this task from [Task] Add Specia:AboutTopic view stats to grafana for ArticlePlaceholder to [Task] Add Special:AboutTopic view stats to grafana for ArticlePlaceholder. Jul 6 2016, 1:25 PM
Addshore added a subscriber: JAllemandou.

@Addshore: What's the status here? I can't see any (useful, non-testing) data in Graphite.

2 patches to get merged and then switched on!
I'm working with @JAllemandou to get them merged :)

Change 296407 had a related patch set uploaded (by Addshore):
Ooziefy Wikidata ArticlePlaceholder Spark job

https://gerrit.wikimedia.org/r/296407

Change 296407 merged by Ottomata:
Ooziefy Wikidata ArticlePlaceholder Spark job

https://gerrit.wikimedia.org/r/296407

This is now just pending deployment which should happen in the next 48 hours! :)

Addshore closed this task as Resolved. Jul 13 2016, 10:13 AM
Addshore moved this task from Active 🚁 to Closing ✔️ on the User-Addshore board.

All deployed.
Graph added at https://grafana-admin.wikimedia.org/dashboard/db/article-placeholder
Data is now being backfilled since the first day of the AP deployment.

Addshore set the point value for this task to 13. Jul 13 2016, 10:14 AM
Addshore moved this task from Proposed to Done on the TCB-Team-Sprint-2016-06-29 board.
Addshore reopened this task as Open. Jul 13 2016, 11:35 AM
Addshore moved this task from Closing ✔️ to Active 🚁 on the User-Addshore board.

So the patches worked and the graph exists. However, after some of the data filled in, we realised that we were not taking namespace aliases in other languages into account.
So the data collected is far below the actual numbers.
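
The undercount can be illustrated with a minimal sketch. It assumes the first version of the job matched only the English prefix `Special:AboutTopic/` in the request path, which is a guess at the cause rather than a reading of the actual code. The function name and the first sample path are hypothetical; the Esperanto path is the one quoted further down in this task.

```python
from urllib.parse import unquote

# Assumption: the first version of the job looked for the English
# namespace name only. This is an illustration, not the merged code.
ENGLISH_PREFIX = "/wiki/Special:AboutTopic/"


def matches_english_prefix(uri_path: str) -> bool:
    """Return True if the percent-decoded path starts with the English prefix."""
    return unquote(uri_path).startswith(ENGLISH_PREFIX)


sample_paths = [
    "/wiki/Special:AboutTopic/Q42",              # English namespace name
    "/wiki/Speciala%C4%B5o:AboutTopic/Q300915",  # localized (Esperanto) namespace name
]

counted = [p for p in sample_paths if matches_english_prefix(p)]
print(f"{len(counted)} of {len(sample_paths)} requests counted")
```

Only the English-named request is counted here, so a wiki that links its placeholders through the localized namespace name would be mostly missing from the graph.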

Change 298717 had a related patch set uploaded (by Addshore):
Include the namespace for all pages

https://gerrit.wikimedia.org/r/298717

Change 298719 had a related patch set uploaded (by Addshore):
Include the resolved special page name for special pages

https://gerrit.wikimedia.org/r/298719

Change 298723 had a related patch set uploaded (by Addshore):
Ignore namespace when matching Special:AboutTopic

https://gerrit.wikimedia.org/r/298723

Change 298724 had a related patch set uploaded (by Addshore):
Use x_analytics header to match special ns for Special:AboutTopic

https://gerrit.wikimedia.org/r/298724

Change 298725 had a related patch set uploaded (by Addshore):
Use x_analytics header to match special page name

https://gerrit.wikimedia.org/r/298725

Change 298726 had a related patch set uploaded (by Addshore):
Use webrequest in wikidata/articleplaceholder_metrics

https://gerrit.wikimedia.org/r/298726

Change 298717 merged by jenkins-bot:
Include the namespace for all pages

https://gerrit.wikimedia.org/r/298717

Change 298719 merged by jenkins-bot:
Include the resolved special page name for special pages

https://gerrit.wikimedia.org/r/298719

Change 298796 had a related patch set uploaded (by Addshore):
Include the namespace for all pages

https://gerrit.wikimedia.org/r/298796

Change 298797 had a related patch set uploaded (by Addshore):
Include the resolved special page name for special pages

https://gerrit.wikimedia.org/r/298797

Change 298796 merged by jenkins-bot:
Include the namespace for all pages

https://gerrit.wikimedia.org/r/298796

Change 298797 merged by jenkins-bot:
Include the resolved special page name for special pages

https://gerrit.wikimedia.org/r/298797

Mentioned in SAL [2016-07-13T17:44:12Z] <legoktm@tin> Synchronized php-1.28.0-wmf.10/extensions/WikimediaEvents/WikimediaEventsHooks.php: Include the namespace for all pages & Include the resolved special page name for special pages - T138500 (duration: 00m 36s)

Change 298723 abandoned by Addshore:
Ignore namespace when matching Special:AboutTopic

Reason:
This will be a one off run to backfill data and thus does not need to be merged.

https://gerrit.wikimedia.org/r/298723

Change 298725 abandoned by Addshore:
Use x_analytics header to match special page name

Reason:
Included as part of https://gerrit.wikimedia.org/r/#/c/298724/

https://gerrit.wikimedia.org/r/298725

Change 298723 restored by Addshore:
Ignore namespace when matching Special:AboutTopic

https://gerrit.wikimedia.org/r/298723

So the older data has now been filled into https://grafana.wikimedia.org/dashboard/db/article-placeholder
(overwriting the incorrect stuff)

The 2 patches now just need to be merged and deployed to have this running regularly!

Change 298723 abandoned by Addshore:
Ignore namespace when matching Special:AboutTopic

Reason:
This has now been run

https://gerrit.wikimedia.org/r/298723

Change 298724 merged by Nuria:
Match title & ns using x_analytics header & get all agent_types

https://gerrit.wikimedia.org/r/298724
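
As a rough sketch of what matching on the header could look like: this assumes the X-Analytics value is a `key=value;key=value` string in which `ns` holds the namespace ID (-1 for special pages) and `special` holds the resolved special page name, as the patch titles above suggest. The row layout, the wiki names and the `agent_type` values are hypothetical, and the real job is a Spark job, not this Python.

```python
from collections import Counter


def parse_x_analytics(header: str) -> dict:
    """Split a 'key=value;key=value' header string into a dict."""
    pairs = (item.split("=", 1) for item in header.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def is_about_topic(header: str) -> bool:
    """True when the header marks a Special:AboutTopic request.

    Assumption: 'ns' carries the namespace ID (-1 for special pages) and
    'special' carries the resolved, language-independent page name.
    """
    fields = parse_x_analytics(header)
    return fields.get("ns") == "-1" and fields.get("special") == "AboutTopic"


# Hypothetical rows: (wiki, agent_type, x_analytics)
rows = [
    ("eo.wikipedia", "user", "ns=-1;special=AboutTopic"),
    ("eo.wikipedia", "spider", "ns=-1;special=AboutTopic"),
    ("eo.wikipedia", "user", "ns=0;page_id=123"),
    ("cy.wikipedia", "user", "ns=-1;special=AboutTopic"),
    ("cy.wikipedia", "user", "ns=-1;special=Search"),
]

# Count views per (wiki, agent_type) so users and spiders stay separate.
views = Counter(
    (wiki, agent_type)
    for wiki, agent_type, x_analytics in rows
    if is_about_topic(x_analytics)
)
for (wiki, agent_type), count in sorted(views.items()):
    print(wiki, agent_type, count)
```

Because the match uses the namespace ID and the resolved page name instead of the URL text, localized namespace names no longer matter, and keeping `agent_type` in the key gives the per-wiki user/spider split asked for earlier in the task.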

Change 298726 merged by Nuria:
Use webrequest in wikidata/articleplaceholder_metrics

https://gerrit.wikimedia.org/r/298726

Change 300235 had a related patch set uploaded (by Addshore):
Query using webrequest_source in ArticlePlaceholder query

https://gerrit.wikimedia.org/r/300235

Change 300236 had a related patch set uploaded (by Addshore):
Fix pageview -> webrequest in articleplaceholder coordinator

https://gerrit.wikimedia.org/r/300236

Change 300235 merged by Joal:
Query using webrequest_source in ArticlePlaceholder query

https://gerrit.wikimedia.org/r/300235

Change 300248 had a related patch set uploaded (by Addshore):
Fix CastException in ArticlePlaceholderMetrics

https://gerrit.wikimedia.org/r/300248

Change 300248 merged by Joal:
Fix CastException in ArticlePlaceholderMetrics

https://gerrit.wikimedia.org/r/300248

Change 300236 merged by Joal:
Fix pageview -> webrequest in articleplaceholder coordinator

https://gerrit.wikimedia.org/r/300236

Addshore moved this task from Review to Done on the ArticlePlaceholder board. Jul 26 2016, 10:05 AM
Addshore moved this task from Doing to Done on the WMDE-Analytics-Engineering board.
Addshore moved this task from Active 🚁 to Closing ✔️ on the User-Addshore board.

@Addshore Is it possible to see which specific page titles are being rendered within Special:AboutTopic at each wiki? I couldn't see how to do so in the dashboard but might have missed it.

To repeat from T132223#2290840 a few people want "A way to see the most-viewed titles, so that the local community can focus their creation/translation efforts in the places where they might have the most usefulness."
E.g. if https://eo.wikipedia.org/wiki/Speciala%C4%B5o:AboutTopic/Q300915 was consistently getting the most views out of all uses of Special:AboutTopic, then local editors could focus on writing that article.
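
To make the request concrete, here is a minimal sketch of such a ranking. It is not something this task implemented. It assumes the page name appears as `AboutTopic` in the request path (a fully localized page name would need different matching), and the function name and sample paths are hypothetical apart from the Esperanto example above.

```python
import re
from collections import Counter
from urllib.parse import unquote

# Matches the item ID after ':AboutTopic/', whatever the namespace is called.
ITEM_ID = re.compile(r":AboutTopic/(Q\d+)")


def most_viewed_items(paths, limit=3):
    """Return the most requested item IDs as (item_id, views) pairs."""
    counts = Counter()
    for path in paths:
        match = ITEM_ID.search(unquote(path))
        if match:
            counts[match.group(1)] += 1
    return counts.most_common(limit)


paths = [
    "/wiki/Speciala%C4%B5o:AboutTopic/Q300915",
    "/wiki/Speciala%C4%B5o:AboutTopic/Q300915",
    "/wiki/Speciala%C4%B5o:AboutTopic/Q42",
    "/wiki/Speciala%C4%B5o:Search",
]
print(most_viewed_items(paths))
```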

@Quiddity It is possible, but not with what has been done to fulfil this ticket.
It also doesn't sound like the sort of data we want to be feeding into graphite / grafana.

It would be nice if the pageview API could provide a list of these?