root@clouddumps1002:/usr# cat lib/systemd/system/analytics-dumps-fetch-pageview_complete_dumps.service
[Unit]
Description=Copy pageview_complete_dumps files from Hadoop HDFS.

[Service]
User=dumpsgen
SyslogIdentifier=kerberos-run-command
ExecStart=/usr/local/bin/systemd-timer-mail-wrapper -T data-engineering-alerts@lists.wikimedia.org --only-on-error /usr/local/bin/kerberos-run-command dumpsgen /usr/local/bin/rsync-analytics-pageview_complete_dumps
Comment Actions
root@clouddumps1002:/usr# systemctl status analytics-dumps-fetch-pageview_complete_dumps.service
● analytics-dumps-fetch-pageview_complete_dumps.service - Copy pageview_complete_dumps files from Hadoop HDFS.
     Loaded: loaded (/lib/systemd/system/analytics-dumps-fetch-pageview_complete_dumps.service; static)
     Active: failed (Result: exit-code) since Thu 2022-11-03 05:00:40 UTC; 8h ago
TriggeredBy: ● analytics-dumps-fetch-pageview_complete_dumps.timer
    Process: 296219 ExecStart=/usr/local/bin/systemd-timer-mail-wrapper -T data-engineering-alerts@lists.wikimedia.org --only-on-error /usr/local/bin/kerberos-run-command dumpsgen>
   Main PID: 296219 (code=exited, status=1/FAILURE)
        CPU: 57.019s

Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:166)
Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:158)
Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:96)
Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:362)
Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: at com.sun.proxy.$Proxy11.getListing(Unknown Source)
Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1661)
Nov 03 05:00:40 clouddumps1002 kerberos-run-command[296219]: ... 41 more
Nov 03 05:00:40 clouddumps1002 systemd[1]: analytics-dumps-fetch-pageview_complete_dumps.service: Main process exited, code=exited, status=1/FAILURE
Nov 03 05:00:40 clouddumps1002 systemd[1]: analytics-dumps-fetch-pageview_complete_dumps.service: Failed with result 'exit-code'.
Nov 03 05:00:40 clouddumps1002 systemd[1]: analytics-dumps-fetch-pageview_complete_dumps.service: Consumed 57.019s CPU time.
Comment Actions
Andrew asked if I might know someone who knows something about this. I've never touched the Kerberos or HDFS stuff, but @elukey worked on the Kerberos parts of the modules/dumps/manifests/web/fetches/stats.pp manifest at one point, and @BTullis built a bunch of related packages in T310643, so may know something too.
Comment Actions
I believe that this has now been fixed. @Antoine_Quhen also raised a ticket about the matter; I fixed the permissions and then restarted the service in T322394: Update ownership of manually generated files on clouddumps1002.
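For anyone hitting a similar failure later, here is a rough sketch of how one might look for files with unexpected ownership before restarting the unit. The thread does not record which commands were actually run or which directory was affected, so the helper name `list_foreign_owned` and the path in the usage comments are assumptions for illustration only; `dumpsgen` is taken from the `User=` line of the unit above.

```shell
#!/bin/sh
# Hypothetical helper (not from the ticket): print every entry under DIR
# that is NOT owned by OWNER. OWNER may be a user name or a numeric uid.
list_foreign_owned() {
    dir="$1"
    owner="$2"
    find "$dir" ! -user "$owner" -print
}

# Example usage (left commented out; the directory is a placeholder, not the real path):
#   list_foreign_owned /path/to/pageview_complete_dumps dumpsgen
#   sudo chown -R dumpsgen /path/to/pageview_complete_dumps
#   sudo systemctl restart analytics-dumps-fetch-pageview_complete_dumps.service
```

If the listing comes back empty, ownership is probably not the cause and the full stack trace in the journal is the next place to look.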