As part of the release of the pageviews dataset, we are deprecating pagecounts-raw and pagecounts-all-sites. An announcement will go out explaining why and how; this task is the place for any discussion of the change. All thoughts welcome.
Description
Details
Project | Branch | Lines +/- | Subject
---|---|---|---
analytics/refinery | master | +0 -1 K | Remove pagecounts-[raw\|all-sites] related code
operations/puppet | production | +9 -565 | Remove pagecounts-[raw\|all-sites] related code
Event Timeline
- stop Oozie jobs, bundles, etc.
- remove the puppetized code that generates HTML (examples linked below this list)
- announce that this has been done (wait until at least the end of June for Kaldari, but later is fine)
https://github.com/wikimedia/operations-puppet/blob/3218df65dcc4c9d42ce6deef0e130db817613f58/modules/dataset/files/pagecounts/daily-pagestats-copy.sh
https://github.com/wikimedia/operations-puppet/blob/3218df65dcc4c9d42ce6deef0e130db817613f58/modules/dataset/files/pagecounts/generate-pagecount-main-index.sh#L11
Change 302932 had a related patch set uploaded (by Joal):
Remove pagecounts-[raw|all-sites] related code
Just to write this down somewhere. After merge we need to do some manual steps:
- remove files from dumps box(es?):
  - /etc/rsyncd.d/30-rsync-pagecounts.conf
  - /usr/local/bin/daily-pagestats-copy.sh
  - /usr/local/bin/generate-pagecount-main-index.sh
  - /usr/local/bin/generate-pagecount-year-index.sh
  - /usr/local/bin/generate-pagecount-year-month-index.sh
- remove crons from dumps box(es):
  - pagestats-raw (datasets user)
  - dataset-pagecounts_all_sites (datasets user)
- remove 'refinery data check pagecounts' cron (from analytics1027?)
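A rough sketch of how the manual cleanup above could be scripted, in case it helps whoever ends up doing it. The file paths and cron names are copied from the list; the helper names `remove_leftovers` and `strip_cron_entries` are made up for illustration. The cron helper assumes each Puppet-managed entry is a `# Puppet Name: <name>` comment followed by exactly one schedule line, which may not hold if the entry also sets environment variables, so check the real crontab first.

```python
import os

# Paths copied from the manual-steps list above.
LEFTOVER_FILES = [
    "/etc/rsyncd.d/30-rsync-pagecounts.conf",
    "/usr/local/bin/daily-pagestats-copy.sh",
    "/usr/local/bin/generate-pagecount-main-index.sh",
    "/usr/local/bin/generate-pagecount-year-index.sh",
    "/usr/local/bin/generate-pagecount-year-month-index.sh",
]

# Cron names copied from the manual-steps list above.
LEFTOVER_CRONS = ["pagestats-raw", "dataset-pagecounts_all_sites"]

PUPPET_MARKER = "# Puppet Name: "


def remove_leftovers(paths, dry_run=True):
    """Return the paths that exist as files; delete them unless dry_run is set."""
    found = [p for p in paths if os.path.isfile(p)]
    if not dry_run:
        for p in found:
            os.remove(p)
    return found


def strip_cron_entries(crontab_text, names):
    """Return crontab_text without the Puppet-managed entries listed in names.

    Assumption: an entry is one '# Puppet Name: <name>' line plus the single
    line that follows it.
    """
    kept = []
    skip_next = False
    for line in crontab_text.splitlines():
        if skip_next:
            # This is the schedule line belonging to a removed entry.
            skip_next = False
            continue
        if line.startswith(PUPPET_MARKER):
            if line[len(PUPPET_MARKER):].strip() in names:
                skip_next = True
                continue
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


if __name__ == "__main__":
    # Dry run only: report which leftover files are present on this host.
    for path in remove_leftovers(LEFTOVER_FILES, dry_run=True):
        print("would remove", path)
```

The default is a dry run, so nothing is deleted until `dry_run=False` is passed explicitly. `strip_cron_entries` only transforms text: feed it the output of `crontab -l` for the datasets user and review the result before installing it.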
Change 303164 had a related patch set uploaded (by Joal):
Remove pagecounts-[raw|all-sites] related code
Code changes done and applied.
I'll check the status on Monday (HTML looks good, data is available), then send the email.