Wikistats 2.0.
Open, Medium · Public · 0 Story Points

Description

Set up a pipeline to source historical edit data into HDFS, aggregate it, expose it externally, and use it to power the new Wikistats UI.


Event Timeline

Nuria created this task. Mar 17 2016, 8:06 PM
Restricted Application added a subscriber: Aklapper. Mar 17 2016, 8:06 PM
Nuria renamed this task from Wikistats 2.0. Edit Reports to Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs. Mar 17 2016, 8:07 PM
Nuria renamed this task from Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs to Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs {lama}.
Milimetric triaged this task as Medium priority. Mar 21 2016, 4:10 PM
Milimetric moved this task from Incoming to Backlog (Later) on the Analytics board.
Milimetric set the point value for this task to 0.
Nuria renamed this task from Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs {lama} to Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama}. Mar 22 2016, 5:23 PM
Nuria updated the task description.
Nuria added a comment. Apr 6 2016, 6:20 PM

Code name for the bigger task of data gathering: "data lake". See https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake

Nuria added a comment. Edited May 12 2016, 4:59 PM

Please look at the subtasks to see our backlog for the Wikistats 2.0 replacement.

We are working towards reconstructing edit history without having to depend on the dumps.

JAllemandou moved this task from Next Up to Parent Tasks on the Analytics-Kanban board.
Akeron added a subscriber: Akeron. May 27 2016, 3:22 PM
Nuria updated the task description. Jul 27 2016, 8:21 PM
Arrbee added a subscriber: Arrbee. Sep 22 2016, 9:56 AM
Nuria renamed this task from Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama} to Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data. Nov 30 2016, 7:47 PM
Nuria updated the task description. Dec 21 2016, 11:48 PM
Nuria updated the task description.
Nuria renamed this task from Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data to Wikistats 2.0. Mar 13 2017, 7:16 PM
Nuria updated the task description.
Milimetric changed the status of subtask T145091: Redact data so it can be public from Resolved to Declined. Mar 22 2017, 5:19 PM

@Nuria @Milimetric: I apologize if this is a bad place for this feedback, but I couldn't think of a better one.

I had a meeting with @HaeB, @Tnegrin, and @mpopov yesterday where we discussed our metrics reporting to the Board of Trustees. We felt that Wikistats 2.0 would be a very valuable addition to this process, as a general destination for movement-level metrics, and that it would be even better if it had the ability to display generally useful data annotations. For example, it would help users if annotations explained the big pageviews drop that occurred when we switched to the new pageviews infrastructure, or the big drop in new active editors when we turned on anonymous mobile editing, or the big drop in edits when we switched to Wikidata for interlanguage links.

Have you given any thought to supporting such annotations?

Nuria added a comment. Mar 24 2017, 4:08 AM

Have you given any thought to supporting such annotations?

Yes, Dashiki already supports annotations (it has for a while); see pageviews and look at the bottom axis:
https://analytics.wikimedia.org/dashboards/vital-signs/#projects=eswiki,itwiki,enwiki,jawiki,dewiki,ruwiki,frwiki/metrics=Pageviews

Like any configuration in Dashiki, this info is sourced from meta: https://meta.wikimedia.org/wiki/Dashiki:PageviewsAnnotations

Keep in mind that Wikistats has a strong community focus; its existence long predates the foundation and, really, its main goal is to motivate our editor community (cc @Erik_Zachte)

I think for reporting data to the board the new incarnation of the reportcard might be a much better venue.

You can see a design prototype for the new Wikistats here (very much WIP): https://analytics-prototype.wmflabs.org/#/ . We will be requesting a second round of community feedback on this visual design this coming month.

Ooh, that prototype is still not ready to be shared; it's still very much early days. That said, I have thoughts on annotations.

Dashiki primarily uses dygraphs for timeseries line graphs. Work on dygraphs has pretty much ground to a halt and it has very basic annotation support. That's why the annotations in Dashiki kind of suck. We have two thoughts for making annotations better in Wikistats 2.0:

  1. build simple line graphs ourselves with d3, because d3 is now modular and we don't have to take a 400 KB hit every time we load a graph. This should make it mobile-friendly and let us finally have decent custom annotations.
  2. find a better way to store annotations than the rough wiki-page raw JSON way we've been doing it so far. I'm thinking either nicely rendering Config:Dashiki:Annotations: pages or adding an API where you can request annotations. I think @Nuria does not like this second approach so we still have a little architecture work there. But in general we all agree annotations should be easier to add/update/consume.

With those two improvements we ultimately want to put annotations in the hands of the analysts at the foundation. So while Wikistats is community-focused, the people working on this data and the annotations will be us. So we definitely want to make that easier/better.
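
To make the "easier to add/update/consume" goal concrete, here is a minimal sketch of how a client might consume annotations stored as raw JSON and attach them to a timeseries. Everything in it is an assumption for illustration: the field names (start, end, note), the sample payload, and the function names are hypothetical and do not reflect the actual Dashiki annotation format or any planned API.

```python
import json
from datetime import date

# Hypothetical payload: field names and dates are made up for illustration,
# not the real Dashiki annotation schema.
RAW_ANNOTATIONS = """
[
  {"start": "2016-02-01", "end": "2016-02-29", "note": "Example: metric definition changed"},
  {"start": "2016-04-01", "note": "Example: one-day outage"}
]
"""


def parse_annotations(raw):
    """Parse the JSON text into dicts with real date objects.

    A missing "end" is treated as a single-day annotation.
    """
    annotations = []
    for item in json.loads(raw):
        start = date.fromisoformat(item["start"])
        end = date.fromisoformat(item["end"]) if "end" in item else start
        annotations.append({"start": start, "end": end, "note": item["note"]})
    return annotations


def annotate_series(series, annotations):
    """Attach to each (day, value) point the notes whose range covers that day."""
    result = []
    for day, value in series:
        notes = [a["note"] for a in annotations if a["start"] <= day <= a["end"]]
        result.append((day, value, notes))
    return result


if __name__ == "__main__":
    series = [
        (date(2016, 1, 1), 100),
        (date(2016, 2, 1), 60),
        (date(2016, 4, 1), 58),
    ]
    for day, value, notes in annotate_series(series, parse_annotations(RAW_ANNOTATIONS)):
        print(day.isoformat(), value, notes)
```

Keeping the parsing step separate from the matching step means the storage side (wiki page or API) could change without touching whatever draws the graph.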

Finally, I think @ezachte is a better ping for Erik than @Erik_Zachte, right?

Keep in mind that Wikistats has a strong community focus; its existence long predates the foundation and, really, its main goal is to motivate our editor community (cc @Erik_Zachte)
I think for reporting data to the board the new incarnation of the reportcard might be a much better venue.

In this case, it seems like the board wants very similar information to any journalist or Metapedian: mainly global editing and traffic numbers, with a side of whatever else might be illuminating about the state of the movement (no doubt they also want information on what different teams at the WMF are doing, but that's not the kind of reporting @HaeB and I have been doing for them).

I was not aware that a new incarnation of the report card is planned; perhaps you could give me some details? But I don't see the argument for adding complexity by having two dashboards where one would do.

With those two improvements we ultimately want to put annotations in the hands of the analysts at the foundation. So while Wikistats is community-focused, the people working on this data and the annotations will be us. So we definitely want to make that easier/better.

This is very exciting! There's a fair amount of knowledge of metric fluctuations rattling around in my brain, and right now there's no good place to document it. It sounds like annotations on Wikistats 2.0 could be that place :D

Nuria added a comment. Mar 24 2017, 7:59 PM

I was not aware that a new incarnation of the report card is planned; perhaps you could give me some details?

See: https://phabricator.wikimedia.org/T130117

For our first stab we are just moving it to Dashiki and reporting a few metrics; the current reportcard (which hasn't been updated in a while) will redirect to http://analytics.wikimedia.org/dashboards/report-card/ (or similar). This is part of the work we have been doing to completely deprecate limn; we started with the editor dashboards you are familiar with. Migrating the UI is easy, as you know; most of the work has been dedicated to having a programmatic way to retrieve pageviews older than 2015. All this work is organized in this task: https://phabricator.wikimedia.org/T146308

Nuria edited projects, added Analytics; removed Analytics-Kanban. Mar 8 2018, 6:38 PM
Nuria moved this task from Backlog (Later) to Incoming on the Analytics board.
Nuria edited projects, added Analytics-Kanban; removed Analytics.