As part of the release of the pageviews dataset, we are deprecating pagecounts-raw and pagecounts-all-sites. An announcement will go out explaining why and how; this task is the place for any discussion of the change. All thoughts welcome.
- stop oozie jobs / bundles etc.
- remove puppetized code that generates HTML (examples below)
- announce that this has been done (wait until at least end of June for Kaldari, but later is fine)
Writing this down so it is recorded somewhere: after the merge, some manual steps are still needed:
- remove files from dumps box(es?):
  - /etc/rsyncd.d/30-rsync-pagecounts.conf
  - /usr/local/bin/daily-pagestats-copy.sh
  - /usr/local/bin/generate-pagecount-main-index.sh
  - /usr/local/bin/generate-pagecount-year-index.sh
  - /usr/local/bin/generate-pagecount-year-month-index.sh
- remove crons from dumps box(es):
  - pagestats-raw (datasets user)
  - dataset-pagecounts_all_sites (datasets user)
- remove 'refinery data check pagecounts' cron (from analytics1027?)
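The file part of the manual cleanup above could be checked with a small dry-run script before anything is deleted. This is only a sketch: the paths are the ones listed in this task, while the `list_leftover_files` function and its optional root-prefix argument are hypothetical names introduced here for illustration. It reports files that still exist and removes nothing.

```shell
#!/bin/sh
# Files this task lists for manual removal from the dumps box(es).
PAGECOUNT_FILES="
/etc/rsyncd.d/30-rsync-pagecounts.conf
/usr/local/bin/daily-pagestats-copy.sh
/usr/local/bin/generate-pagecount-main-index.sh
/usr/local/bin/generate-pagecount-year-index.sh
/usr/local/bin/generate-pagecount-year-month-index.sh
"

# Print every listed file that still exists.
# $1 is a hypothetical root prefix (useful for testing against a scratch
# directory); when empty, the real filesystem is checked.
list_leftover_files() {
    root="${1:-}"
    for f in $PAGECOUNT_FILES; do
        if [ -e "${root}${f}" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Dry run: report leftovers only, delete nothing.
list_leftover_files
```

An empty output means none of the listed files remain on that host. The cron entries are not covered by this sketch and still need to be checked separately on each box.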