
Dashboard repository for limn-wikidata-data
Closed, Resolved · Public

Description

We are looking at potentially using Limn to display some data, hence this request for a repo.
This is slightly pre-emptive, as we will not start working on this straight away, but it would be good to have the repo ready!

Event Timeline

Addshore raised the priority of this task from to Needs Triage.
Addshore updated the task description.
Restricted Application added a subscriber: Aklapper. · Sep 14 2015, 10:39 AM
Addshore updated the task description. · Sep 14 2015, 10:40 AM
Addshore set Security to None.
aude added a subscriber: aude. · Sep 14 2015, 10:40 AM

where, what data?

Initially looking to replace the "Wikidata community engagement tracking" spreadsheet.

aude added a comment. · Sep 14 2015, 10:47 AM

hmmm... no idea what that is but ok. replacing a spreadsheet sounds good :)

Indeed!
Some of the information can already be pulled from the dbs.
Other information will need scripts that collect it and put it into tables in another db on the analytics cluster.

What type of repo are we talking about here?

aude added a comment. · Sep 14 2015, 12:23 PM

so, we make one of https://meta.wikimedia.org/wiki/Research:Data/Dashboards ? looks somewhat straightforward to do

Indeed.

We would need somewhere to run scripts that pull data and store it in SQL tables somewhere on the analytics cluster.
Then this repo would contain the stuff to read from those tables and make the limn graphs.

https://wikitech.wikimedia.org/wiki/Analytics/Dashboards also describes how it can contain SQL queries that get executed regularly on stat1003.eqiad.wmnet. But it seems the output is only stored in CSV files.

So I'm not sure whether we want to adapt that to also insert into a MySQL table, or have it only read from the MySQL tables we generate with a different script.

If we want to display some of that in grafana, it seems fairly easy to also put the values into graphite: https://wikitech.wikimedia.org/wiki/Graphite#Record_data
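
As a rough illustration of "putting the values into graphite", here is a minimal sketch that uses carbon's plaintext protocol (one `<path> <value> <timestamp>` line per data point, conventionally on TCP port 2003). The metric name, the host argument, and the choice of protocol are assumptions for illustration only; the wiki page linked above may describe a different route on the cluster.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Return one data point as a Graphite plaintext-protocol line.

    The line has the form '<path> <value> <timestamp>' and ends with a newline.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host, port=2003):
    """Send one formatted line to a carbon listener over TCP.

    `host` is whatever carbon endpoint applies; none is assumed here.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))


# Hypothetical metric name, formatted only (no socket is opened here):
example_line = format_metric("daily.wikidata.example.count", 42, 1442188800)
```

Keeping the formatting separate from the sending makes it easy to write the same values to a tsv file and to graphite from one script.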

aude added a comment. · Sep 14 2015, 1:36 PM

@JanZerebecki that's one of the reasons I am asking. I would like to proceed with getting that done. :)

@JanZerebecki It would be nice to consolidate the post-processed, "ready to use" export data sets in a single directory that different presentation clients can access simply over HTTP, like http://datasets.wikimedia.org/aggregate-datasets/{wikidata} for example.

I highly recommend enabling simple REST access to the data rather than having to use SQL to get values for analytics presentation.

Yes, that would be what the csv/tsv files are for; I was in no way suggesting not to create them.

Well, after speaking with nuria in #wikimedia-analytics, it looks like limn isn't really used any more?
So maybe we won't move forward with this....
bah, such confusion!

Nuria added a subscriber: Nuria. · Edited · Sep 15 2015, 7:09 PM

Wait, limn is one thing; the limn-data repositories are another (I know, confusing).
Those are not necessarily tied to limn, they are just poorly named.

Hi everyone. I am sorry for the confusion. I'll try to clarify a bit:

limn-<<something>>-data repositories: this is good, please create this if you can, or ask me and I'll create it for you. This is where your data-fetching scripts and any configuration files should go.

limn: this is a deprecated dashboarding tool, but I'm happy to set up a dashboard with it if limn meets your needs exactly. We're just not developing new features for limn.

dashiki: this is a much simpler dashboarding tool that is much easier to work with. I already spoke with @aude a bit, and I'm happy to help whoever's working on dashboarding with this.

In both cases, the documentation is not enough for you to do everything by yourself. We can talk in IRC or here.

Addshore closed this task as Resolved. · Sep 15 2015, 7:39 PM
Addshore claimed this task.

https://gerrit.wikimedia.org/r/#/admin/projects/analytics/limn-wikidata-data has been created and wikidata has access to the group.

I'll start trying to use this tomorrow.

FYI: I am working on the dashboards and have made some progress using the shiny-server.
Check out the very preliminary prototype at http://wdm.wmflabs.org/wdm
The repo is here: https://git.wikimedia.org/tree/wikidata%2Fanalytics%2Fdashboard/HEAD
@Addshore if you can generate/export the Magnus and Lydia tsv data (for starters) as they are being used (examples in the /data directory) and put them in an accessible place, this would be extremely helpful.

Addshore added a comment. · Edited · Sep 16 2015, 11:16 AM

After further discussion with @Milimetric I think Dashiki might be the way forward with some / all of this.
And yet again, I may now not end up trying to use limn.

@Christopher I guess if I start spitting some automated tsvs out of some of the stuff that we want to track you could just point shiny at them?

@Addshore yes, I will just create a separate remote download set function and point it at your sources so that we can use both local and remote data.
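
To make the "both local and remote data" idea concrete, here is a minimal sketch of a loader that accepts either a file path or an http(s) URL. It is written in Python purely for illustration (the shiny dashboard itself would do the equivalent in R), and the function name `load_tsv` is hypothetical.

```python
import csv
import io
import urllib.request


def load_tsv(source):
    """Read a TSV from a local path or an http(s) URL into a list of dicts.

    The first row is treated as the header; all values come back as strings.
    """
    if source.startswith(("http://", "https://")):
        # Remote source: fetch the whole file over HTTP.
        with urllib.request.urlopen(source, timeout=30) as response:
            text = response.read().decode("utf-8")
    else:
        # Local source: read the file as-is (newline="" leaves line endings to csv).
        with open(source, encoding="utf-8", newline="") as handle:
            text = handle.read()
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
```

With a single entry point like this, switching a graph from a checked-in example file to an automatically generated one is just a change of the `source` string.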

One thing to note (particular to Dygraphs) is that the order of the columns in the datasource matters. The first column is the one that appears first in the legend (or on the bottom for stacked). So, if we are shooting for consistency with the Magnus graphs, this is worth considering in your scripts.
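
On the generating side, one way to keep the column order stable is to name the columns explicitly instead of relying on whatever order the data happens to arrive in. The sketch below assumes Python export scripts; the column names are hypothetical placeholders, with the date column assumed to come first.

```python
import csv
import io

# Hypothetical column names, listed in the order they should appear in the file.
COLUMN_ORDER = ["date", "items", "properties", "statements"]


def write_tsv(rows, out, columns=COLUMN_ORDER):
    """Write dict rows to `out` as TSV, always in the order given by `columns`.

    A row with a key that is not in `columns` raises ValueError (DictWriter's
    default), so a renamed or extra series is noticed rather than silently dropped.
    """
    writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def render_tsv(rows, columns=COLUMN_ORDER):
    """Return the TSV text for `rows` as a string."""
    buffer = io.StringIO()
    write_tsv(rows, buffer, columns)
    return buffer.getvalue()
```

Because the order lives in one list, matching the Magnus graphs would only mean editing `COLUMN_ORDER`, not the queries or the rows themselves.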