Update the Camus checker to be able to authenticate via Kerberos
Closed, Resolved | Public | 5 Estimated Story Points

Description

2019-06-21 06:59:52 ERROR CamusPartitionChecker$:328 - A fatal error occurred during execution.
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
	at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
	at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1960)
	at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1941)
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:693)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
	at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
	at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1485)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1525)
	at org.wikimedia.analytics.refinery.camus.CamusStatusReader.mostRecentRuns(CamusStatusReader.scala:76)
	at org.wikimedia.analytics.refinery.camus.CamusPartitionChecker$.main(CamusPartitionChecker.scala:300)
	at org.wikimedia.analytics.refinery.camus.CamusPartitionChecker.main(CamusPartitionChecker.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
	at org.apache.hadoop.ipc.Client.call(Client.java:1470)
	at org.apache.hadoop.ipc.Client.call(Client.java:1401)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
	at com.sun.proxy.$Proxy9.getListing(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:554)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at com.sun.proxy.$Proxy10.getListing(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1958)
	... 12 more

The Camus checker seems to use the Hadoop HDFS APIs directly, so it will probably need to be updated to authenticate via Kerberos.

Event Timeline

Joseph seems to have found a solution to this problem: appending :/etc/hadoop/conf to the -cp path of the call to the Camus checker in our camus Python script. Why it works is a bit of a mystery :D
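A likely explanation, offered as a guess rather than something confirmed in this task: Hadoop's client Configuration loads core-site.xml from the Java classpath, and when that file is not found, hadoop.security.authentication falls back to its default of "simple". That would match the "SIMPLE authentication is not enabled" error above. Below is a minimal sketch of the classpath change in Python. The function name, the jar path, and the way the command is assembled are hypothetical and are not taken from the actual camus script; only /etc/hadoop/conf and the main class name come from this task.

```python
# Hypothetical sketch: build the java command for the checker with the
# Hadoop config directory appended to the classpath.

HADOOP_CONF_DIR = "/etc/hadoop/conf"  # path mentioned in the comment above


def build_checker_command(jar_path, main_class, checker_args=(),
                          conf_dir=HADOOP_CONF_DIR):
    """Return the argv list that launches main_class with conf_dir on -cp."""
    # ':' is the Java classpath separator on Unix-like systems.
    classpath = ":".join([jar_path, conf_dir])
    return ["java", "-cp", classpath, main_class] + list(checker_args)


if __name__ == "__main__":
    # The jar path is a placeholder (assumption), not the real deployment path.
    cmd = build_checker_command(
        "/path/to/refinery-camus.jar",
        "org.wikimedia.analytics.refinery.camus.CamusPartitionChecker",
    )
    print(" ".join(cmd))
```

With the config directory on the classpath, the JVM should be able to resolve core-site.xml and hdfs-site.xml as classpath resources, so the HDFS client would pick up the cluster's Kerberos settings instead of the built-in defaults.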

Change 518961 had a related patch set uploaded (by Elukey; owner: Elukey):
[analytics/refinery@master] camus: add hadoop config path to the checker's java cp parameter

https://gerrit.wikimedia.org/r/518961

Change 518961 merged by Elukey:
[analytics/refinery@master] camus: add hadoop config path to the checker's java cp parameter

https://gerrit.wikimedia.org/r/518961

elukey triaged this task as Medium priority. Jun 27 2019, 9:09 AM
elukey added a project: Analytics-Kanban.
elukey set the point value for this task to 5.

Systemd timers now support Kerberos auth, and Joseph's change for the checker seems to work perfectly. No further action needed!