Provide the Wikimedia DE folks with Hive access/training {flea} [8 pts]
Closed, Resolved · Public

Description

Lydia and team need Hive access to support a new project. Please provide them with access and, if possible, training.

Thanks!

-Toby

Expanding this a little bit: Edward Galvez needs to query for pages in the Grant namespace on meta. We can add a simple example as part of this documentation.

Event Timeline

Tnegrin created this task. Jul 16 2015, 4:10 PM
Tnegrin assigned this task to Ottomata.
Tnegrin raised the priority of this task from to Needs Triage.
Tnegrin updated the task description.
Tnegrin added a project: Analytics-Backlog.
Tnegrin added subscribers: Tnegrin, Lydia_Pintscher, daniel, hoo.
Restricted Application added a subscriber: Aklapper. Jul 16 2015, 4:10 PM
ggellerman triaged this task as Normal priority. Jul 24 2015, 4:23 PM
ggellerman set Security to None.
ggellerman moved this task from Incoming to Low on the Analytics-Backlog board.
ggellerman moved this task from Low to Medium on the Analytics-Backlog board.
ggellerman renamed this task from Provide the Wikimedia DE folks with Hive access/training to Provide the Wikimedia DE folks with Hive access/training {flea}. Aug 4 2015, 10:35 PM
kevinator raised the priority of this task from Normal to High. Aug 7 2015, 4:17 PM
kevinator moved this task from Medium to Prioritized on the Analytics-Backlog board.
Milimetric renamed this task from Provide the Wikimedia DE folks with Hive access/training {flea} to Provide the Wikimedia DE folks with Hive access/training {flea} [8 pts]. Aug 10 2015, 4:34 PM
kevinator added a subscriber: kevinator. Edited Aug 10 2015, 4:35 PM

We'll create some documentation. Some notes on the documentation:

  • section on getting access, with pre-requisites & links (shell account, ssh configuration to proxy through bastion)
  • example queries
  • using Hue
  • keeping queries efficient (use a WHERE clause to limit the partitions scanned)
  • short section on data & security
    • follow data retention guidelines
    • keep private data private
    • don't publish someone else's data
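
As a rough illustration of the "limit partitions scanned" note, here is a minimal Python sketch that assembles a Hive query whose WHERE clause always pins the date partition before any other filter is applied. Everything named in it is an assumption made for illustration, not something taken from the cluster: the table name `example_db.example_table`, the partition columns `year`, `month` and `day`, the `page_title` column, and the `Grants:` title prefix used to stand in for the Grant namespace request.

```python
from datetime import date
from typing import Sequence


def partition_predicate(day: date) -> str:
    """Build a WHERE fragment that restricts the scan to one daily partition.

    Assumes the table is partitioned by integer columns named
    year, month and day (hypothetical names).
    """
    return f"year = {day.year} AND month = {day.month} AND day = {day.day}"


def build_query(table: str, columns: Sequence[str], day: date,
                extra_filter: str = "", limit: int = 100) -> str:
    """Assemble a SELECT statement that always includes the partition predicate."""
    where = partition_predicate(day)
    if extra_filter:
        # Additional row-level filters go after the partition predicate.
        where += f" AND {extra_filter}"
    return (
        f"SELECT {', '.join(columns)}\n"
        f"FROM {table}\n"
        f"WHERE {where}\n"
        f"LIMIT {limit}"
    )


query = build_query(
    table="example_db.example_table",           # hypothetical table
    columns=["page_title"],                     # hypothetical column
    day=date(2015, 9, 1),
    extra_filter="page_title LIKE 'Grants:%'",  # hypothetical title prefix
)
print(query)
```

The resulting string can be pasted into the Hive CLI or Hue. Making the partition predicate a required argument means a query cannot be built without narrowing the date range first.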
Milimetric updated the task description. Aug 27 2015, 9:13 PM
Nuria claimed this task. Sep 1 2015, 4:09 PM
Milimetric added a subscriber: DFoy. Sep 1 2015, 4:15 PM
Nuria added a comment. Sep 2 2015, 7:24 PM

We have this existing doc on Hive:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive/Queries

That includes how to get access and some sample queries (those could be updated):

https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive/Queries#Cluster_Access

ggellerman moved this task from In Progress to Done on the Analytics-Kanban board. Sep 16 2015, 3:44 PM
kevinator closed this task as Resolved. Sep 18 2015, 11:13 PM