|operations/puppet||production||+3 -3||camus: set proper number of consumers for Webrequest|
|analytics/refinery||master||+10 -536||Remove cache misc from Refinery|
|Resolved||None||T164609 Merge cache_misc into cache_text functionally|
|Resolved||elukey||T200822 Remove webrequest misc analytics related jobs and code after cache misc -> text merge is complete|
Our dear cache misc (the Varnish hosts that were hosting misc websites like phabricator, yarn, etc.) has now been merged into cache text. When reviewing all the Analytics Hadoop jobs referencing cache misc in https://gerrit.wikimedia.org/r/#/c/459827/, we also found wdqs_extract. Is it still used? I am asking because we have two options now:
- delete it if not needed
- fix it (as done in the code review linked above) and also try to figure out how to fill the data gaps from the past weeks (from when the Traffic team merged misc into text up to now). Since wdqs_extract kept using cache misc, it was not pulling any relevant data (which was in cache text), probably leading to a nice flatline.
Let me know!
It's already been broken for a few weeks. We don't need to delete the data at all, but the change Luca is working on will cause this job to run against the webrequest_text data partition, which holds a lot more data than webrequest_misc. We can do it, but we'd rather not if we don't have to!
Luca, I suggest removing the job; if we hear otherwise, we can re-add it then.
Thanks for the comments! I've updated https://gerrit.wikimedia.org/r/#/c/analytics/refinery/+/459827/ to remove the job. This patch will probably be deployed during or after the Analytics offsite :)