Unify stat1007 puppet role with the rest of the stats cluster
Closed, Resolved | Public | 13 Estimated Story Points

Event Timeline

Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.

Change 589542 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] statistics::rsync::mediawiki: reduce retention and improve security

https://gerrit.wikimedia.org/r/589542

Change 589549 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics: share mediawiki/eventlogging profiles

https://gerrit.wikimedia.org/r/589549

Change 589549 merged by Elukey:
[operations/puppet@production] role::statistics: share mediawiki/eventlogging profiles

https://gerrit.wikimedia.org/r/589549

Change 589553 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics::explorer: add profiles to match role::statistics::private

https://gerrit.wikimedia.org/r/589553

Change 589542 merged by Elukey:
[operations/puppet@production] statistics::rsync::mediawiki: reduce retention and improve security

https://gerrit.wikimedia.org/r/589542

Change 591310 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics::private: factor out geoip archive to a profile

https://gerrit.wikimedia.org/r/591310

Change 591310 merged by Elukey:
[operations/puppet@production] role::statistics::private: factor out geoip archive to a profile

https://gerrit.wikimedia.org/r/591310

Change 589553 merged by Elukey:
[operations/puppet@production] role::statistics::explorer: add profiles to match role::statistics::private

https://gerrit.wikimedia.org/r/589553

Change 593191 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::statistics::private: factor out wmde/discovery crons out

https://gerrit.wikimedia.org/r/593191

Change 593191 merged by Elukey:
[operations/puppet@production] role::statistics::private: factor out wmde/discovery crons out

https://gerrit.wikimedia.org/r/593191

Ok there is only one thing left to move, namely:

# Allowing statistics nodes (mostly clouddb hosts in this case)
# to push nginx access logs to a specific /srv path. We usually
# allow only pull based rsyncs, but after T211330 we needed a way
# to unbreak that use case. This rsync might be removed in the future.
rsync::server::module { 'dumps-webrequest':
    path        => '/srv/log/webrequest/archive/dumps.wikimedia.org',
    read_only   => 'no',
    hosts_allow => $labstore_hosts,
    auto_ferm   => true,
}

Context: https://phabricator.wikimedia.org/T119070

It should be easy to move it to role::statistics::explorer, but I would love to drop this if it is not needed anymore. I see two possible ways forward:

  • the data is not needed anymore, we just need to do the cleanup.
  • the data is needed, so we should probably instruct the labstore nodes to push the logs to HDFS instead, and change the consumers of the data to pull from it.
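If the module ends up being kept and moved rather than dropped, the move could look roughly like the sketch below. This is only an illustration under assumptions: the profile name `profile::statistics::explorer::dumps_webrequest_rsync` and the Hiera key passed to `lookup()` are hypothetical and do not come from the patches listed in this task. The `rsync::server::module` declaration itself is copied unchanged from the snippet above.

```puppet
# Hypothetical profile wrapping the existing rsync module so that it can be
# included from role::statistics::explorer. The class name is an assumption.
class profile::statistics::explorer::dumps_webrequest_rsync (
    # Hypothetical Hiera key; the original snippet only shows that a
    # $labstore_hosts variable is in scope, not where it is defined.
    $labstore_hosts = lookup('profile::statistics::explorer::dumps_webrequest_rsync::labstore_hosts'),
) {
    # Same definition as in the quoted snippet: a writable (push-based)
    # rsync module restricted to the labstore hosts.
    rsync::server::module { 'dumps-webrequest':
        path        => '/srv/log/webrequest/archive/dumps.wikimedia.org',
        read_only   => 'no',
        hosts_allow => $labstore_hosts,
        auto_ferm   => true,
    }
}
```

The role would then only need an `include ::profile::statistics::explorer::dumps_webrequest_rsync` line, which also keeps the cleanup simple (one include and one class to remove) if the data turns out not to be needed.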

@Addshore can you shed some light on your use case? Is it still needed?

10:10 AM <addshore> right now we do still use it
10:11 AM <addshore> it currently powers https://grafana.wikimedia.org/d/000000264/wikidata-dump-downloads?orgId=1&refresh=5m&from=now-90d&to=now
10:11 AM <addshore> currently via https://github.com/wikimedia/analytics-wmde-scripts/blob/master/src/wikidata/dumpDownloads.php
10:12 AM <addshore> would hdfs -> report updater work?
10:12 AM <addshore> (we haven't really used report updater yet)

> and change the consumers of the data to pull from it.

@Addshore if these files are small enough (haven't checked), you might even be able to run your script on the files in HDFS through the HDFS mount at /mnt/hdfs. This isn't the best solution, but it would work most of the time and might be ok for non-critical jobs.

Change 593521 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::statistics::explorer::misc_jobs: get more jobs from stat1007

https://gerrit.wikimedia.org/r/593521

Change 593521 merged by Elukey:
[operations/puppet@production] profile::statistics::explorer::misc_jobs: get more jobs from stat1007

https://gerrit.wikimedia.org/r/593521

Change 594941 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Assign role::statistics::explorer to stat1007

https://gerrit.wikimedia.org/r/594941

Change 594941 merged by Elukey:
[operations/puppet@production] Assign role::statistics::explorer to stat1007

https://gerrit.wikimedia.org/r/594941

This is done now. Further refinements might be needed, but I'd say that this task can be closed!

elukey moved this task from In Progress to Done on the Analytics-Kanban board.
Nuria set the point value for this task to 13.