Wikistats 2.0.
Closed, Resolved | Public | 0 Estimated Story Points

Description

Set up a pipeline to source historical edit data into HDFS, aggregate it, expose it externally, and use it to power the new Wikistats UI.

Event Timeline

Nuria renamed this task from "Wikistats 2.0. Edit Reports" to "Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs". (Mar 17 2016, 8:07 PM)
Nuria renamed this task from "Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs" to "Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs {lama}".
Milimetric triaged this task as Medium priority. (Mar 21 2016, 4:10 PM)
Milimetric moved this task from Incoming to Backlog (Later) on the Analytics board.
Milimetric set the point value for this task to 0.
Nuria renamed this task from "Wikistats 2.0. Edit Reports: Source Historical Edit Data into hdfs {lama}" to "Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama}". (Mar 22 2016, 5:23 PM)
Nuria updated the task description.

Please look at the subtasks to see our backlog for the Wikistats 2.0 replacement.

We are working towards reconstructing edit history without having to depend on the dumps.

Nuria renamed this task from "Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data into hdfs {lama}" to "Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data". (Nov 30 2016, 7:47 PM)
Nuria updated the task description.
Nuria renamed this task from "Wikistats 2.0. Edit Reports: Setting up a pipeline to source Historical Edit Data" to "Wikistats 2.0.". (Mar 13 2017, 7:16 PM)
Nuria updated the task description.

@Nuria @Milimetric: I apologize if this is a bad place for this feedback, but I couldn't think of a better one.

I had a meeting with @HaeB, @Tnegrin, and @mpopov yesterday where we discussed our metrics reporting to the Board of Trustees. We felt that Wikistats 2.0 would be a very valuable addition to this process, as a general destination for movement-level metrics, and that it would be even better if it had the ability to display generally useful data annotations. For example, it would be helpful to users to explain the big pageviews drop that occurred when we switched to the new pageviews infrastructure, or the big drop in new active editors when we turned on anonymous mobile editing, or the big drop in edits when we switched to Wikidata for interlanguage links.

Have you given any thought to supporting such annotations?

> Have you given any thought to supporting such annotations?

Yes, Dashiki already supports annotations (and has for a while); see the pageviews graph and look at the bottom axis:
https://analytics.wikimedia.org/dashboards/vital-signs/#projects=eswiki,itwiki,enwiki,jawiki,dewiki,ruwiki,frwiki/metrics=Pageviews

Like any configuration in Dashiki, this info is sourced from meta: https://meta.wikimedia.org/wiki/Dashiki:PageviewsAnnotations
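
As a rough illustration of what sourcing annotations from a wiki page can look like, the sketch below builds a raw-content URL for a page and parses a JSON payload from it. The `action=raw` query is standard MediaWiki; the payload shape (a list of objects with `date` and `note` keys) and the function names are assumptions made for illustration, not Dashiki's actual schema or code.

```python
import json
from datetime import date
from urllib.parse import urlencode

# Script entry point on meta; action=raw returns a page's unrendered text.
WIKI_INDEX = "https://meta.wikimedia.org/w/index.php"


def raw_page_url(title: str) -> str:
    """Return the URL that serves a wiki page's raw text."""
    return WIKI_INDEX + "?" + urlencode({"title": title, "action": "raw"})


def parse_annotations(raw_text: str) -> list[tuple[date, str]]:
    """Parse annotations from JSON text and return them sorted by date.

    Assumed (hypothetical) shape: [{"date": "YYYY-MM-DD", "note": "..."}]
    """
    entries = json.loads(raw_text)
    parsed = [(date.fromisoformat(e["date"]), e["note"]) for e in entries]
    return sorted(parsed, key=lambda pair: pair[0])
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so that the parsing step can be exercised without network access.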

Keep in mind that Wikistats has a strong community focus; its existence long predates the foundation and, really, its main goal is to motivate our editor community (cc @Erik_Zachte).

I think for reporting data to the board, the new incarnation of the reportcard might be a much better venue.

A design prototype for the new Wikistats (very much a WIP) is at https://analytics-prototype.wmflabs.org/#/ and we will be requesting a second round of community feedback on this visual design this coming month.

Ooh, that prototype is still not ready to be shared, it's still very much early days. That said, I have thoughts on annotations.

Dashiki primarily uses dygraphs for timeseries line graphs. Work on dygraphs has pretty much ground to a halt and it has very basic annotation support. That's why the annotations in dashiki kind of suck. We have two thoughts for making annotations better in wikistats 2.0:

  1. build simple line graphs ourselves from d3, because d3 is now modular and we don't have to take a 400kb hit every time we load a graph. This should make it mobile-friendly and let us finally have decent custom annotations.
  2. find a better way to store annotations than the rough wiki-page raw JSON way we've been doing it so far. I'm thinking either nicely rendering Config:Dashiki:Annotations: pages or adding an API where you can request annotations. I think @Nuria does not like this second approach so we still have a little architecture work there. But in general we all agree annotations should be easier to add/update/consume.
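
One way to picture the "request annotations" idea in item 2 is an annotation store that a chart can query by metric and date range. This is a minimal sketch under stated assumptions: the names `Annotation` and `annotations_for` are hypothetical, and nothing here reflects how Dashiki or Wikistats actually stores or serves annotations.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Annotation:
    metric: str  # metric the note applies to, e.g. "pageviews"
    start: date  # day the annotated change took effect
    note: str    # human-readable explanation shown on the graph


def annotations_for(store, metric, first, last):
    """Return annotations for `metric` dated within [first, last], oldest first."""
    hits = [a for a in store if a.metric == metric and first <= a.start <= last]
    return sorted(hits, key=lambda a: a.start)
```

A graph component would call `annotations_for` with the visible date range and draw one marker per result, which keeps the annotation source independent of the charting library.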

With those two improvements we ultimately want to put annotations in the hands of the analysts at the foundation. So while Wikistats is community-focused, the people working on this data and the annotations will be us. So we definitely want to make that easier/better.

Finally, I think @ezachte is a better ping for Erik than @Erik_Zachte, right?

> Keep in mind that Wikistats has a strong community focus; its existence long predates the foundation and, really, its main goal is to motivate our editor community (cc @Erik_Zachte).
>
> I think for reporting data to the board, the new incarnation of the reportcard might be a much better venue.

In this case, it seems like the board wants very similar information to any journalist or Metapedian: mainly global editing and traffic numbers, with a side of whatever else might be illuminating about the state of the movement (no doubt they also want information on what different teams at the WMF are doing, but that's not the kind of reporting @HaeB and I have been doing for them).

I was not aware that a new incarnation of the report card is planned; perhaps you could give me some details? But I don't see the argument for adding complexity by having two dashboards where one would do.

> With those two improvements we ultimately want to put annotations in the hands of the analysts at the foundation. So while Wikistats is community-focused, the people working on this data and the annotations will be us. So we definitely want to make that easier/better.

This is very exciting! There's a fair amount of knowledge of metric fluctuations rattling around in my brain, and right now there's no good place to document it. It sounds like annotations on Wikistats 2.0 could be that place :D

> I was not aware that a new incarnation of the report card is planned; perhaps you could give me some details?

See: https://phabricator.wikimedia.org/T130117

For our first stab we are just moving it to Dashiki and reporting a few metrics; the current reportcard (which hasn't been updated in a while) will redirect to http://analytics.wikimedia.org/dashboards/report-card/ (or similar). This is part of the work we have been doing to completely deprecate Limn; we started with the editor dashboards you are familiar with. Migrating the UI is easy, as you know; most of the work has been dedicated to having a programmatic way to retrieve pageviews older than 2015. All this work is organized in this task: https://phabricator.wikimedia.org/T146308

Nuria moved this task from Backlog (Later) to Incoming on the Analytics board.
Nuria edited projects, added Analytics; removed Analytics-Kanban.
Nuria moved this task from Backlog (Later) to Incoming on the Analytics board.
Nuria edited projects, added Analytics-Kanban; removed Analytics.
odimitrijevic claimed this task.
odimitrijevic added a subscriber: odimitrijevic.

Closing as a parent task in favor of using project tags. Epic tasks can serve as parent tasks when needed to capture large feature work.