Details
Status | Assigned | Task
---|---|---
Resolved | odimitrijevic | T240437 Analytics Ops Technical Debt
Resolved | elukey | T243934 Unify puppet roles for stat and notebook hosts
Resolved | elukey | T249754 Unify stat1007 puppet role with the rest of the stats cluster
Event Timeline
Change 589542 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] statistics::rsync::mediawiki: reduce retention and improve security
Change 589549 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics: share mediawiki/eventlogging profiles
Change 589549 merged by Elukey:
[operations/puppet@production] role::statistics: share mediawiki/eventlogging profiles
Change 589553 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics::explorer: add profiles to match role::statistics::private
Change 589542 merged by Elukey:
[operations/puppet@production] statistics::rsync::mediawiki: reduce retention and improve security
Change 591310 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics::private: factor out geoip archive to a profile
Change 591310 merged by Elukey:
[operations/puppet@production] role::statistics::private: factor out geoip archive to a profile
Change 589553 merged by Elukey:
[operations/puppet@production] role::statistics::explorer: add profiles to match role::statistics::private
Change 593191 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics::private: factor out wmde/discovery crons out
Change 593191 merged by Elukey:
[operations/puppet@production] role::statistics::private: factor out wmde/discovery crons out
Ok there is only one thing left to move, namely:
# Allowing statistics nodes (mostly clouddb hosts in this case)
# to push nginx access logs to a specific /srv path. We usually
# allow only pull based rsyncs, but after T211330 we needed a way
# to unbreak that use case. This rsync might be removed in the future.
rsync::server::module { 'dumps-webrequest':
    path        => '/srv/log/webrequest/archive/dumps.wikimedia.org',
    read_only   => 'no',
    hosts_allow => $labstore_hosts,
    auto_ferm   => true,
}
Context: https://phabricator.wikimedia.org/T119070
It should be easy to move it to role::statistics::explorer, but I would love to drop it if it is not needed anymore. I see two possible ways forward:
- the data is not needed anymore, we just need to do the cleanup.
- the data is needed, so we should probably instruct the labstore nodes to push the logs to HDFS instead, and change the consumers of the data to pull from it.
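If the rsync module does need to stay for now, moving it could look roughly like the sketch below. This is only an illustration: the profile name `profile::statistics::explorer::dumps_webrequest_rsync` and the `labstore_hosts` lookup key are hypothetical and not taken from any of the patches in this task; the resource body is the one quoted above.

```puppet
# Hypothetical profile (name is an assumption, not from the real repo).
# It wraps the push-based rsync module so that role::statistics::explorer
# can include it like any other profile.
class profile::statistics::explorer::dumps_webrequest_rsync (
    # 'labstore_hosts' is an assumed hiera key, used here for illustration only.
    Array[String] $labstore_hosts = lookup('labstore_hosts'),
) {
    # Same resource as on stat1007 today: the labstore hosts push nginx
    # access logs into this /srv path.
    rsync::server::module { 'dumps-webrequest':
        path        => '/srv/log/webrequest/archive/dumps.wikimedia.org',
        read_only   => 'no',
        hosts_allow => $labstore_hosts,
        auto_ferm   => true,
    }
}
```

Keeping it in its own profile would also make the later cleanup a one-line removal from the role, whichever of the two options above ends up being chosen.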
@Addshore can you shed some light on your use case? Is it still needed?
10:10 AM <addshore> right now we do still use it
10:11 AM <addshore> it currently powers https://grafana.wikimedia.org/d/000000264/wikidata-dump-downloads?orgId=1&refresh=5m&from=now-90d&to=now
10:11 AM <addshore> currently via https://github.com/wikimedia/analytics-wmde-scripts/blob/master/src/wikidata/dumpDownloads.php
10:12 AM <addshore> would hdfs -> report updater work?
10:12 AM <addshore> (we haven't really used report updater yet)
"and change the consumers of the data to pull from it."
@Addshore if these files are small enough (haven't checked), you might even be able to run your script on the files in HDFS through the HDFS mount at /mnt/hdfs. This isn't the best solution, but it would work most of the time and might be ok for non-critical jobs.
Change 593521 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::statistics::explorer::misc_jobs: get more jobs from stat1007
Change 593521 merged by Elukey:
[operations/puppet@production] profile::statistics::explorer::misc_jobs: get more jobs from stat1007
Change 594941 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Assign role::statistics::explorer to stat1007
Change 594941 merged by Elukey:
[operations/puppet@production] Assign role::statistics::explorer to stat1007
This is done now. Further refinements might be needed, but I'd say that this task can be closed!