
Build an internal Quarry instance to share data & sample queries between researchers (and other analytics users?)
Closed, Declined (Public)

Description

Yuvi's description of the task:
If you guys are worried about this taking too much of analytics' time, here is a proposal:

  1. Someone authorizes the machine
  2. I set myself a 2-week deadline, working solely on 'volunteer time' (all of Quarry and a lot of other things were built in a similar way), setting it up and securing it there. It will be super protected (hehe) at this point, with very limited access and no publicity.
  3. The analytics team plays with this, and then decides what to do.
  4. I will continue to maintain Quarry anyway - we're close to reaching 1000 individual queries run, and that's with very limited publicity! All code will be labs/prod agnostic as well. Plus it's written with similar frameworks to other projects analytics is already doing (flask, celery), so not too much of a tech jump there if you guys decide to add more features.

Version: unspecified
Severity: enhancement

Details

Reference
bz73142

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:57 AM
bzimport added a project: Quarry.
bzimport set Reference to bz73142.
bzimport added a subscriber: Unknown Object (MLST).

Why do we need an internal WMF-only instance of Quarry?

Labsdb is redacted, but the internal analytics store isn't.

So you're going to provide all WMF employees (and presumably contractors etc. as well?) with access to all the research data via some sort of corp Quarry instance?

Such a thing has been proposed, primarily for eventlogging information. It would be for anyone with enough clearance to access it. Currently nobody is spending time or energy on building such a thing, AFAIK. I do believe it would be quite useful, however.

So actually it wouldn't be just for WMF employees, it would actually be something in production that some of the groups defined in puppet get access to (via SSH proxying I guess) instead of (or in addition to?) providing direct MySQL access to the DBs?

There already exists such a group (researchers I think?) so this might piggyback off it. Or not - we don't know. Will be determined if / when someone starts working on this I guess :)

That was the group I had in mind. I know for a fact it's not 100% WMF employees, but there are a bunch of different analytics groups and I wasn't sure if that was the right one. Will adjust the title.

Krenair renamed this task from "WMF employees use an internal Quarry instance to share data & sample queries" to "Build an internal Quarry instance to share data & sample queries between researchers (and other analytics users?)". Jul 4 2015, 8:52 PM
Krenair set Security to None.

@yuvipanda have you changed your mind about this in favor of jupyter goodness?