[EPIC] Reconfigure Discovery-Stats on Analytics Cluster
Closed, Resolved (Public)

Description

statistics::discovery runs stats from analytics/discovery-stats, which is deprecated and was owned by Max before the tune-up.

We need to repurpose that puppet configuration to run wikimedia/discovery/golden scripts instead, which are currently set up as a cronjob under my account on stat1002:

0 5 * * * cd /a/discovery/golden/ && sh main.sh >> /home/bearloga/discovery-golden.log 2>&1

Also vaguely related to T129260

mpopov created this task. Jul 12 2017, 9:57 PM
Restricted Application added a subscriber: Aklapper. Jul 12 2017, 9:57 PM

@Ottomata: is it OK if we don't get around to this until after stat1005 goes live?

@mpopov, it is live! :)

It just isn't announced yet. I'm trying to migrate all puppetized/automated jobs before I start the cat herding with some announcements. So, ya, you can delay, but that makes it harder for me to get the puppetization of the boxes completely ready for the announcement.

This isn't blocking anything yet, but the sooner it is done, the easier it is for me to keep cleaning up.

@mpopov, do I need to bother migrating the existing statistics::discovery stuff then? If possible we should probably remove this class, and then create a new more discovery specific one.

It is specific to Discovery :) Max (when he was with Discovery and when Discovery still existed) made it for some of Discovery's stats but it's deprecated, so we can either remove it or repurpose it.

Right, ha, I mean, I'd probably refactor your new stuff out of the statistics::discovery class and into something else in puppet.

If you don't plan on moving over the stuff in statistics::discovery to stat1005, then great! I can just not migrate it, and we can make something new for your newer discovery jobs.

I just realized that reworking discovery-stats properly will require R package installation stuff from https://gerrit.wikimedia.org/r/#/c/366170/

mpopov claimed this task.
mpopov raised the priority of this task from Normal to High.
mpopov moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

Change 367930 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[operations/puppet@production] statistics::discovery: Reconfigure for Golden data retrieval

https://gerrit.wikimedia.org/r/367930

Change 367931 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[wikimedia/discovery/golden@master] Compatibility with Puppetized runs

https://gerrit.wikimedia.org/r/367931

Change 367930 merged by Ottomata:
[operations/puppet@production] statistics::discovery: Reconfigure for Golden data retrieval

https://gerrit.wikimedia.org/r/367930

Change 368829 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Include statistics::discovery just in private profile

https://gerrit.wikimedia.org/r/368829

Change 368829 merged by Ottomata:
[operations/puppet@production] Include statistics::discovery just in private profile

https://gerrit.wikimedia.org/r/368829

Change 368831 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Remove re-declaration of 'wikidev' group in discovery class

https://gerrit.wikimedia.org/r/368831

Change 368831 merged by Ottomata:
[operations/puppet@production] Remove re-declaration of 'wikidev' group in discovery class

https://gerrit.wikimedia.org/r/368831

Change 367931 merged by Bearloga:
[wikimedia/discovery/golden@master] Compatibility with Puppetized runs

https://gerrit.wikimedia.org/r/367931

Change 369438 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[operations/puppet@production] statistics::discovery: Fix scheduled command

https://gerrit.wikimedia.org/r/369438

Change 369438 merged by Ottomata:
[operations/puppet@production] statistics::discovery: Fix scheduled command

https://gerrit.wikimedia.org/r/369438

Change 371769 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[operations/puppet@production] statistics::discovery: Manage datasets dir

https://gerrit.wikimedia.org/r/371769

Change 371769 abandoned by Bearloga:
statistics::discovery: Manage datasets dir

Reason:
patching reportupdater to write files with correct permissions instead (I2f5c0138f0df7b19ff0658322b9e7c989e58a7ac)

https://gerrit.wikimedia.org/r/371769

mpopov renamed this task from Reconfigure Discovery-Stats on Analytics Cluster to [EPIC] Reconfigure Discovery-Stats on Analytics Cluster. Aug 24 2017, 9:18 PM
mpopov removed a project: Patch-For-Review.
mpopov updated the task description.
mpopov removed the point value for this task.

Change 373938 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[operations/puppet@production] statistics::private: Disable Discovery

https://gerrit.wikimedia.org/r/373938

Change 373938 abandoned by Bearloga:
statistics::private: Disable Discovery

Reason:
Abandoning in favor of Gehel's I2ae9d3f333d364128f82457680508da287f76477

https://gerrit.wikimedia.org/r/373938

mpopov changed the task status from Open to Stalled. Mar 6 2018, 2:15 AM

Waiting for systems users with private data access to become available.

Neil_P._Quinn_WMF lowered the priority of this task from High to Low.

Change 438125 had a related patch set uploaded (by Bearloga; owner: Bearloga):
[operations/puppet@production] statistics::discovery: re-enable cron job

https://gerrit.wikimedia.org/r/438125

mpopov changed the task status from Stalled to Open. Jun 7 2018, 11:28 PM

Changing status as system users can now have private data access thanks to Andrew.

Change 438125 merged by Ottomata:
[operations/puppet@production] statistics::discovery: re-enable cron job

https://gerrit.wikimedia.org/r/438125

Disabled the cron job in my account's crontab on stat1005. Will check tomorrow whether everything works as it should; if the metric calculation by the analytics-search (system) user runs without problems, this task will be done.

mpopov closed this task as Resolved. Jun 19 2018, 4:19 PM

Yay! Everything is working correctly now. We ran into a problem with file ownership, but the ever-excellent @Ottomata fixed that up, and we are now fully puppetized.