adding tjones to analytics-privatedata-users (hive and webrequests)
Closed, Resolved (Public)

Description

I'd like to have access to webrequests data in hive in order to analyze search traffic, pageviews, and various data related to search.

My shell account is tjones.

I have access to stat1002, but when I run a hive query it fails with this error:

hive (wmf)> SELECT
          >   concat(month,'/',day,'/',year), access_method, sum(view_count)
          > FROM 
          >   wmf.pageview_hourly
          > WHERE
          >   year = 2015
          >   AND month = 8
          >   AND agent_type = "user"
          >   AND country = "Canada"
          >   AND project = "en.wikipedia"
          > GROUP BY
          >   year, month, day, access_method;
Query ID = tjones_20151019161212_618eb127-1c8a-43e4-9291-8863067aea0d
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1009
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
org.apache.hadoop.security.AccessControlException: Permission denied: user=tjones, access=WRITE, inode="/user":hdfs:hadoop:drwxrwxr-x
[...etc.]
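
For readers unfamiliar with the last line of that output, here is a minimal Python sketch of how the permission string can be read. It is not part of Hadoop or of this task: `explain_denial` and the `user_groups` argument are hypothetical names, the group set passed in is an assumption (HDFS resolves groups on its own side), and the sketch ignores HDFS superusers and extended ACLs.

```python
import re

# Matches the "user=..., access=..., inode=..." portion of the message quoted above.
DENIED = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):'
    r'(?P<mode>[-drwxsStT]{10})'
)

BIT_FOR = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}


def explain_denial(message, user_groups):
    """Report which POSIX-style permission class applied and whether it allows the access.

    user_groups is the set of groups the caller is assumed to belong to.
    """
    match = DENIED.search(message)
    if match is None:
        raise ValueError("not a recognised permission-denied message")
    fields = match.groupdict()
    mode = fields["mode"]  # e.g. "drwxrwxr-x": type flag, then owner/group/other triplets

    if fields["user"] == fields["owner"]:
        klass, bits = "owner", mode[1:4]
    elif fields["group"] in user_groups:
        klass, bits = "group", mode[4:7]
    else:
        klass, bits = "other", mode[7:10]

    # Compound access names such as READ_EXECUTE are split into their parts.
    needed = [BIT_FOR[part] for part in fields["access"].split("_")]
    return {
        "path": fields["path"],
        "class": klass,
        "bits": bits,
        "allowed": all(bit in bits for bit in needed),
    }


if __name__ == "__main__":
    line = ('org.apache.hadoop.security.AccessControlException: Permission denied: '
            'user=tjones, access=WRITE, inode="/user":hdfs:hadoop:drwxrwxr-x')
    # Assumption: tjones is not in the "hadoop" group.
    print(explain_denial(line, user_groups={"wikidev"}))
```

Under that assumption the "other" triplet `r-x` applies to `/user`, which has no write bit, so the WRITE access is refused.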

Event Timeline

TJones set the priority of this task to Needs Triage.
TJones updated the task description.
TJones added subscribers: TJones, Tfinc.

I believe tjones needs to be added to the analytics-privatedata-users group in puppet, but IIRC otto mentioned there is something in addition that needs to be done so hive works properly.

Change 247455 had a related patch set uploaded (by Dzahn):
admin: add tjones to analytics-privatedata-users

https://gerrit.wikimedia.org/r/247455

Dzahn triaged this task as Medium priority. Oct 19 2015, 8:41 PM

> IIRC otto mentioned there is something in addition that needs to be done so hive works properly.

@Ottomata ^ what was the other thing? is the group correct?

I went back and looked through the patches for adding me to hive, from T109356. Based on the history of patches there, there should be nothing else to do. The 'other thing' was just using the correct group. I was mistakenly added to statistics-privatedata-users and the fix was to move me to analytics-privatedata-users instead.

@EBernhardson Thank you for checking that. Ok, this was also confirmed by Otto on the gerrit change.

Everything looks good here, we just have to wait now for this to merge because of the "3-day waiting" rule.

Change 247455 merged by Dzahn:
admin: add tjones to analytics-privatedata-users

https://gerrit.wikimedia.org/r/247455

@TJones this is done now.

on stat1002:
[stat1002:~] $ id tjones
uid=12510(tjones) gid=500(wikidev) groups=500(wikidev),725(statistics-privatedata-users),731(analytics-privatedata-users)
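
If the same check needs to be repeated for other accounts, the `id` output above can be parsed with a few lines of Python. This is only a sketch: `parse_id_groups` is a hypothetical helper, and it assumes the usual `groups=gid(name),gid(name),...` layout shown above.

```python
import re


def parse_id_groups(id_output):
    """Return {group_name: gid} from the groups= field of `id` output."""
    match = re.search(r'groups=(\S+)', id_output)
    if match is None:
        raise ValueError("no groups= field found")
    pairs = re.findall(r'(\d+)\(([^)]+)\)', match.group(1))
    return {name: int(gid) for gid, name in pairs}


if __name__ == "__main__":
    output = ("uid=12510(tjones) gid=500(wikidev) groups=500(wikidev),"
              "725(statistics-privatedata-users),731(analytics-privatedata-users)")
    groups = parse_id_groups(output)
    print("analytics-privatedata-users" in groups)
```

This only confirms membership on the host where `id` was run. It says nothing about how the Hadoop cluster resolves groups.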

Dzahn removed a project: Patch-For-Review.
Dzahn set Security to None.