
Change permissions for daily traffic anomaly reports on stat1007
Closed, Resolved | Public | 5 Estimated Story Points


Hi. I am looking to maintain the daily traffic anomaly reports, currently hosted on stat1007; the code for this project is in /home/jdcc/project_monitoring. There is a submodule of the project that lives at /home/jdcc/heka as well. This is an important project that generates a daily report and it is not being maintained. (T215379 was a related ticket.)

I don't have write permission on everything in these two directories, as some of the files are 644. I am part of the wikidev group, so would it be possible to change the permissions on those directories to grant write access to either the group or the user (sukhe)?
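For reference, here is a minimal sketch of how the entries that block group writes could be listed before any permission change is made. The function name `list_not_group_writable` is hypothetical, and the path in the closing comment is simply one of the directories named in this task.

```shell
#!/bin/sh
# Hypothetical helper: print regular files and directories under "$1"
# that lack the group write bit (for example, files with mode 644).
list_not_group_writable() {
    find "$1" \( -type f -o -type d \) ! -perm -g+w -print
}

# Assumed invocation, not run as part of this task:
#   list_not_group_writable /home/jdcc/project_monitoring
```

Anything this prints would need `chmod g+w` (run by the owner or by root) before members of the group could modify it.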

I realize that ideally this project should be under a repository and that there are various components that need improvement (including documentation). However the intention right now is to keep this project operational, making changes where required without completely rewriting it. I also cannot copy it over to my ~ and run it from there as there are dependencies that I am not looking to resolve right now. A similar point was discussed in T215379#4940138.

I am subscribing the previous maintainer of the project (jdcc) so that we can take their permission as well.

Directories for which permission change (ideally w+x) has been requested:

  • /home/jdcc/project_monitoring
  • /home/jdcc/heka

Please let me know if an alternate approach is preferred.

Event Timeline

Milimetric triaged this task as High priority.
Milimetric added a project: Analytics-Kanban.
Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.

@ssingh hi! Sorry for the delay. I am fine with adding the group permissions, but if the directory was set up that way, it was probably on purpose, maybe for data protection? As far as I know @Jdcc-berkman is still active (namely holds a valid account / ssh-key), so he could probably chime in and give us some hints.

I could also move all that data to your home directory and chmod it properly, then we could fix the cron and everything should work as expected.

My memory of setting these perms is long gone, but if I did mark it this way, it was probably because there is private data from the stats servers sitting in there. It's WMF's data, so as far as I'm concerned, do whatever you need to with the perms.

@ssingh let's do things properly and move the crons and repositories into one place. The dependencies that were discussed in the related task are basically two crons IIRC, so we can move them under your username first, and then possibly add them to puppet?

So to recap:

  • move /home/jdcc/project_monitoring and /home/jdcc/heka under your home directory, changing the owner (jdcc -> sukhe) of all files and directories to you (so you'll have perms)
  • move jdcc's crons under your crontab
0 15 * * * USER=jdcc /home/jdcc/anaconda3/bin/python /home/jdcc/project_monitoring/scripts/ > /home/jdcc/project_monitoring/reports/`date -Idate`.html 2>> /home/jdcc/errors.log
0 10 * * * USER=jdcc /home/jdcc/anaconda3/bin/python /home/jdcc/article_monitoring/scripts/ > /home/jdcc/article_monitoring/reports/`date -Idate`.html 2>> /home/jdcc/errors.log
  • move slaporte's crontabs under your crontab:
46 23 * * * sh /home/slaporte/send_report.s
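The crontab move in the recap above amounts to rewriting the home-directory paths and the `USER=` setting in each entry. A minimal sketch, assuming a plain text substitution is enough; the function name `rewrite_cron_user` is hypothetical, and scripts that hard-code paths internally would still need to be checked by hand.

```shell
#!/bin/sh
# Hypothetical filter: read crontab lines on stdin and rewrite references
# to one user's home directory and USER= setting so they point at another.
# Usage: rewrite_cron_user OLD_USER NEW_USER < old_crontab > new_crontab
rewrite_cron_user() {
    old=$1
    new=$2
    sed -e "s#/home/${old}/#/home/${new}/#g" \
        -e "s#USER=${old} #USER=${new} #g"
}
```

The rewritten entries are best reviewed before being installed into the new owner's crontab, since the substitution only touches the crontab lines themselves.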

Does it sound ok?

@elukey Once this is puppetized, how much would it continue to depend on particular users retaining continued access?

@elukey: Sorry for the late reply.

Yes, that's fine with me as I do plan on maintaining this and having control over all the scripts will make it easier.

@ssingh late reply as well, apologies :)

I did the following:

  • copied over anaconda3, project_monitoring and heka from /home/jdcc to yours + chown sukhe:wikidev to all
  • copied over the send_report script from slaporte's home + chown sukhe:wikidev
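A copy-and-reassign step like the one described above could look roughly like this. It is only a sketch: the function name `clone_tree` is hypothetical, `cp -a` is assumed to be available (GNU and BSD `cp` both provide it), and ownership in the `user:group` form is set with `chown`, which normally requires root when the new owner is another user.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory tree, preserving modes and
# timestamps, then hand ownership of the copy to OWNER:GROUP.
# DEST_DIR must not already exist, otherwise cp nests SRC_DIR inside it.
# Usage: clone_tree SRC_DIR DEST_DIR OWNER:GROUP
clone_tree() {
    src=$1
    dest=$2
    owner=$3
    cp -a "$src" "$dest" && chown -R "$owner" "$dest"
}
```

Copying rather than moving leaves the original tree untouched, which matches the approach in this comment of keeping the old setup running until the new one is confirmed.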

I also added to your crontab all the crons mentioned in my previous post, amended accordingly with your username. Moreover, I have modified your version of the send_report to only mail things to you for the moment, with a different sender/mail-subject so you can compare it with the one sent by slaporte's user. Basically I left everything as it was for jdcc/slaporte, and "cloned" a version of everything in your home dir. Once you give me the green light I'll clean up everything!

Hi @elukey, thanks for the ping.

The reports were not being generated so I have made some more changes today (updating paths); I will update the ticket with the progress but for now, let's keep the jdcc directory.

Thanks for your help.

Hi @elukey. Thanks for your patience on this issue. I can confirm everything is working from my ~.

@ssingh very happy to help! We should be ok to close the task right?

I can see that @Jdcc-berkman is still an active user - shall I keep the content of the /home/jdcc home directory or is it going to be deleted eventually? (trying to avoid leaving data around if not needed :)

If this is successfully running elsewhere now, I don't need the data in my home directory anymore nor my account. I've backed up the various scripts, so I'm good to be removed.

@Jdcc-berkman since you have access to the host, can you please clean up those files? That way I won't accidentally delete anything valuable. Thanks :)

@elukey: Is it also possible to kill jdcc's currently running processes, as they are no longer being used? I ask because I am getting this error when running the scripts:

Unexpected end of file when reading from HS2 server. The root cause might be too many concurrent connections. Please ask the administrator to check the number of active connections, and adjust hive.server2.thrift.max.worker.threads if applicable.

This may be related to jdcc's scripts running in addition to mine (same scripts) and therefore creating too many concurrent connections:

$ ps -u jdcc | wc -l

Or perhaps @Jdcc-berkman can do it themselves...
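One caveat on the `ps -u jdcc | wc -l` check above: plain `ps` prints a header line, so the count comes out one higher than the number of processes. A minimal sketch of a header-free count, assuming a `ps` that supports `-u` and `-o` (procps and BSD `ps` do); the function name `count_user_procs` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count processes owned by a user (name or numeric id).
# "-o pid=" prints only PIDs with no header line, so every output line
# is one process. "tr" strips the padding some wc implementations add.
count_user_procs() {
    ps -u "$1" -o pid= | wc -l | tr -d ' '
}

# Assumed invocation: count_user_procs jdcc
```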


Killed all the processes on stat1007 and also commented his crontab (so nothing should restart). Let me know if it is ok :)
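Commenting out a crontab, as described above, can be done with a small filter that leaves blank lines and existing comments alone. This is only a sketch: the function name `comment_out_cron` is hypothetical, and the invocation in the closing comment assumes root and a cron implementation whose `crontab` accepts `-u USER -l` and reads a new crontab from `-` (stdin).

```shell
#!/bin/sh
# Hypothetical filter: prefix every active crontab line on stdin with "# ".
# Blank lines and lines that are already comments pass through unchanged.
comment_out_cron() {
    sed -e '/^[[:space:]]*$/b' -e '/^[[:space:]]*#/b' -e 's/^/# /'
}

# Assumed invocation (as root), not taken from this task:
#   crontab -u jdcc -l | comment_out_cron | crontab -u jdcc -
```

Commenting entries out rather than deleting them keeps the old schedule on record in case anything has to be restored.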


Thanks very much! It's working now. :)

elukey lowered the priority of this task from High to Medium. Jun 3 2019, 6:27 AM
elukey set the point value for this task to 5.
elukey moved this task from In Progress to Done on the Analytics-Kanban board.

As a follow-up, let's make sure that we add this use case to puppet when ready, otherwise this problem will recur.

Hi @elukey. The transition is working smoothly and there are no issues so we are good to go at least on my end.