{lama} Wikistats traffic reports 2.0
Closed, Resolved, Public

Description

We have to transition wikistats reports to be based on more reliable data. We will use this epic to discuss the value of the reports and the priority order in which they should be transitioned. Please add comments below and I (@Milimetric) will edit the description accordingly.

NOTES:

Useful as a source for the eventual shape of this project: https://www.mediawiki.org/wiki/Analytics/Wikistats/TrafficReports/Future_per_report_B2
This design from Pau will be used for the interface of Wikistats 2.0: T92502
Dashiki will most likely be used to build the UI for Wikistats 2.0 (unless a better framework is available by the time we get coding)

Related Objects

Status     Assigned
Resolved   None
Declined   None
Declined   None
Declined   None
Resolved   Milimetric
Resolved   None
Resolved   Milimetric
Declined   None
Resolved   mforns
Resolved   mforns
Resolved   mforns
Resolved   mforns
Resolved   Nuria
Resolved   Milimetric
Resolved   mforns
Resolved   Krinkle
Declined   None
Resolved   None
Declined   None
Open       None

Event Timeline

Milimetric raised the priority of this task from to Medium.
Milimetric updated the task description.
Milimetric added a project: Analytics-Kanban.
Milimetric subscribed.
Krinkle set Security to None.
Krinkle added subscribers: Nemo_bis, Krinkle.

@Milimetric, is this task mainly about the generation of raw statistics (that is, the kind stored at http://datasets.wikimedia.org), or about a consolidated dashboard for useful metrics, or both?

@Neil_P._Quinn_WMF, and others:

This task is a project epic that we use as a placeholder, both to remember what {lama} means and to keep notes about the project. So discussions about all things related to Wikistats 2.0 belong here, and the description of the task will change as that project takes shape. We're starting very early on this, as we probably won't start coding until Q2.

For Content Translation (ContentTranslation), it has been useful to check the "New articles per day" chart for a given language to get an idea of the production of new articles in a given community (and compare it with the number of translations that community creates). The comparison was not immediate, since we had to extrapolate the value to "new pages per week" or "new pages per month", but it was a useful data point.
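For what it's worth, the extrapolation we did by hand amounts to something like the rough sketch below. The function name and the sample numbers are made up for illustration, and it simply assumes the average daily rate holds over the longer period; it is not how Wikistats computes anything.

```python
from statistics import mean


def extrapolate_daily_rate(daily_counts, days):
    """Scale the average of some daily counts to a period of `days` days.

    Hypothetical helper: assumes the average daily rate stays constant
    over the longer period, which is only a rough approximation.
    """
    if not daily_counts:
        raise ValueError("need at least one daily count")
    return mean(daily_counts) * days


# Made-up sample: "new articles per day" readings for one week.
sample = [10, 12, 8, 11, 9, 10, 10]

per_week = extrapolate_daily_rate(sample, 7)    # estimated new pages per week
per_month = extrapolate_daily_rate(sample, 30)  # estimated new pages per 30-day month
```

The resulting weekly or monthly estimate can then be put next to the number of translations created in the same period.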

By the way, Pau's comment reminded me that I should mention his Dashboard Directory work. Pau designed this awesome UX for how people could find different dashboards: T92502. We are definitely going to use his work when we build Wikistats 2.0.

I'm confused by the mention of new articles. Are we expected to mention *all* the metrics and plots we find useful in stats.wikimedia.org? Even from the standard tables like https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaMEDIAWIKI.htm ?

Hmm, I would recommend keeping Wikistats dump-based stats and Wikistats traffic stats totally separate.
Otherwise the scope will be overwhelming and confusion mounts. The projects are nearly 100% separate, have different sources, audiences, etc. (People started to name all of these Wikistats mainly because they were published on one site.)

In fact I would welcome renaming Wikistats Traffic Reports into WikiTraffic or similar, and call the successor WikiTraffic 2.0

See also https://www.mediawiki.org/wiki/Analytics/Wikistats/TrafficReports/Future

Ah, if this is about traffic reports then things become much clearer. Rewriting WikiStats from scratch seems... very unlikely. Milimetric, if that's what you meant please change summary. :)

Hold please, clarifying with Erik separately :)

In any case, any new work on this kind of thing will be fully modular and we wouldn't combine traffic reports with dumps analysis in any way other than joining them in an easy to use new front-end. I'll be back when I know more.

Here's another use case I came across recently:

As a legal team member, 
I want traffic reports by country 
so I can provide evidence of traffic in a particular country.

Presently, they use:
http://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryOverview.htm

It seems Pau's interface (T92502) mostly focuses on highly aggregated information for WMF dashboards.

There is also a need for continuation of (some of) the current, much more detailed reports.
I discussed earlier with @Milimetric how we could migrate some existing reports relatively quickly to a new Hive-based data stream.
But a complete overhaul could work equally well, if resources are available.

Here is a page to collect detailed feedback per current report (please chime in):
https://www.mediawiki.org/wiki/Analytics/Wikistats/TrafficReports/Future_per_report_B2

Pau's original task was definitely more restricted than "all of the data" :)

@violetto is working with us now to extend Pau's designs to the broader scope. Thanks for the wiki page @ezachte, that's going to come in handy.

FYI - we are starting to prioritize this work for this quarter. We've made tasks for each of the reports listed on the wiki page [1]. These tasks can be found in the Analytics-Backlog for now; their names are all tagged with {lama}. We did *not* create tasks for the reports that were not voted for:

If you'd like to vote for these after this, please let me know and make sure I acknowledge your request. It's looking like a lot of work for what's going to be a short quarter, we'll do our best :)

[1] https://www.mediawiki.org/wiki/Analytics/Wikistats/TrafficReports/Future_per_report_B2

Aggregate User Agent information (@Krinkle). Very useful for making engineering decisions on features and performance work.

Adding T88504 as blocker for this.

@Nuria, Why did you remove T88504 as a blocker to this task? There's no commentary here or there. As this is probably the most vital statistical (non-load) information for engineering decisions, it's worth tracking.

@Jdforrester-WMF: sorry, I did not see your comment earlier. The item we closed was about getting data from the cluster in the form of sensible reports. While there is no viz yet, the data is available and calculated weekly:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/BrowserReports

I will add https://phabricator.wikimedia.org/T118329 here as it is a better task for what you are interested in.

In T107175#1500563, I wrote:

Ah, if this is about traffic reports then things become much clearer.

About time we do.

P.s.: And removing myself too. I'm not especially interested in traffic reports, I was here only for the main wikistats.

Nemo_bis renamed this task from {lama} Wikistats 2.0 to {lama} Wikistats traffic reports 2.0. Mar 24 2016, 4:09 PM
Nemo_bis updated the task description.
Nemo_bis added a subscriber: Krinkle.
Nemo_bis removed a subscriber: Krinkle.

Hm, I don't really agree with https://phabricator.wikimedia.org/T107175#2148129, I would've changed the title back, but it looks like we're managing the other work somewhere else, so it's all good.

It's all gonna get done one way or another, and it'll be called Wikistats 2.0 and it'll have all the reports to make everyone happy.