Migrate from deployment-logstash2 (jessie) to deployment-logstash03 (stretch)
Closed, ResolvedPublic

Description

Already partially discussed on T218729 but I noticed the sheer number of subscribers to that task and opted to make a subtask for this individual instance.

Event Timeline

If deployment-logstash03 has the same classes applied as deployment-logstash2 and there are no puppet errors, I'd say the next step would be to switch producers to use deployment-logstash03.

hmmm, looks like references to this host are a little scattered:

alex@alex-laptop:~/Development/Wikimedia/Operations-Puppet (production)$ git grep deployment-logstash2
hieradata/labs.yaml:role::logging::mediawiki::udp2log::logstash_host: 'deployment-logstash2.deployment-prep.eqiad.wmflabs'
hieradata/labs/deployment-prep/common.yaml:  - "deployment-logstash2.deployment-prep.eqiad.wmflabs:10514"
hieradata/labs/deployment-prep/common.yaml:service::configuration::logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
hieradata/labs/deployment-prep/common.yaml:  logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
hieradata/labs/deployment-prep/common.yaml:logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
hieradata/labs/deployment-prep/common.yaml:  - 'deployment-logstash2.deployment-prep.eqiad.wmflabs:9093'
hieradata/labs/deployment-prep/host/deployment-logstash2.yaml:      - deployment-logstash2.deployment-prep.eqiad.wmflabs
hieradata/labs/deployment-prep/host/deployment-logstash2.yaml:      - deployment-logstash2.deployment-prep.eqiad.wmflabs
hieradata/labs/deployment-prep/host/deployment-logstash2.yaml:role::kibana::serveradmin: root@deployment-logstash2.deployment-prep.eqiad.wmflabs
hieradata/labs/wikidata-query/common.yaml:profile::query_service::logstash_host: 'deployment-logstash2.deployment-prep.eqiad.wmflabs'
modules/base/manifests/remote_syslog.pp:#   (e.g. ["centrallog1001.eqiad.wmnet"] or ["deployment-logstash2.deployment-prep.eqiad.wmflabs:10514"])
modules/role/manifests/beta/puppetmaster.pp:        logstash_host => 'deployment-logstash2.deployment-prep.eqiad.wmflabs',
modules/scap/templates/scap.cfg.erb:logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs:9200

Guess I'll upload a puppet commit to update deployment-prep/common.yaml and the non-hieradata stuff.
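A minimal sketch of how such a bulk update could be scripted. The function name `rewrite_logstash_host` is hypothetical and not part of either repository. It rewrites only the fully qualified deployment-prep name, so the short `deployment-logstash2.eqiad.wmflabs` form is deliberately left alone.

```shell
#!/bin/sh
# Hypothetical helper: filter stdin, replacing the old logstash FQDN with the new one.
# Dots in the pattern are escaped so only the literal hostname matches.
rewrite_logstash_host() {
    sed 's/deployment-logstash2\.deployment-prep\.eqiad\.wmflabs/deployment-logstash03.deployment-prep.eqiad.wmflabs/g'
}

# Example use from a checkout (left commented out, so nothing is modified by sourcing this):
#   git grep -lF 'deployment-logstash2.deployment-prep.eqiad.wmflabs' -- hieradata/labs/deployment-prep |
#   while read -r f; do rewrite_logstash_host < "$f" > "$f.new" && mv "$f.new" "$f"; done
```

Reviewing the result with `git diff` before committing would still be wise, since the per-host file for the old instance also matches the grep and probably should be removed rather than rewritten.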

alex@alex-laptop:~/Development/Wikimedia/instance-puppet (master)$ git grep deployment-logstash2
deployment-prep/_.yaml:      deployment-logstash2.deployment-prep.eqiad.wmflabs:
deployment-prep/_.yaml:      deployment-logstash2.deployment-prep.eqiad.wmflabs:
deployment-prep/_.yaml:- deployment-logstash2.deployment-prep.eqiad.wmflabs:9093
deployment-prep/_.yaml:service::configuration::logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-aqs.yaml:profile::aqs::logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-aqs.yaml:  logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-docker-cxserver01.deployment-prep.eqiad.wmflabs.yaml:        - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.roles:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.roles:              metadata.broker.list: deployment-logstash2.deployment-prep.eqiad.wmflabs:9092
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.roles:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.roles:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.yaml:              metadata.broker.list: deployment-logstash2.deployment-prep.eqiad.wmflabs:9092
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-1.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-2.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-2.deployment-prep.eqiad.wmflabs.yaml:              metadata.broker.list: deployment-logstash2.deployment-prep.eqiad.wmflabs:9092
deployment-prep/deployment-eventgate-2.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-2.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-3.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-3.deployment-prep.eqiad.wmflabs.yaml:              metadata.broker.list: deployment-logstash2.deployment-prep.eqiad.wmflabs:9092
deployment-prep/deployment-eventgate-3.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-eventgate-3.deployment-prep.eqiad.wmflabs.yaml:      - host: deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-logstash2.deployment-prep.eqiad.wmflabs.yaml:  - deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-logstash2.deployment-prep.eqiad.wmflabs.yaml:  - deployment-logstash2.deployment-prep.eqiad.wmflabs
deployment-prep/deployment-mediawiki-.yaml:- deployment-logstash2.deployment-prep.eqiad.wmflabs:9093
deployment-prep/deployment-sessionstore.yaml:  logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs
ores/_.yaml:logstash_host: deployment-logstash2.eqiad.wmflabs
phabricator/_.yaml:mediawiki::forward_syslog: deployment-logstash2.deployment-prep.eqiad.wmflabs:10514
striker/striker-uwsgi.yaml:    LOGSTASH_HOST: deployment-logstash2.eqiad.wmflabs
wikidata-query/_.yaml:wdqs::logstash_host: deployment-logstash2.deployment-prep.eqiad.wmflabs

(that .roles file should actually just be a list of roles, think this was a bug with the import script that set up the repo, have told Andrew)
the only deployment-eventgate host existing now is -3 so I think some of this is just a lack of old hieradata getting deleted on instance deletion (edit: T238708)

I'll update deployment-prep stuff, I'm not sure anything outside the project should be communicating with this.

The proxy for logstash-beta.wmflabs.org needs updating as well. That might also help with T233134 (logstash-beta.wmflabs.org does not receive any mediawiki events).

(also apparently kibana4.wmflabs.org)

Change 551946 had a related patch set uploaded (by Alex Monk; owner: Alex Monk):
[operations/puppet@production] deployment-prep: Migrate to new logstash host

https://gerrit.wikimedia.org/r/551946

Mentioned in SAL (#wikimedia-releng) [2019-11-20T00:47:34Z] <Krenair> T238707 moved kibana4/logstash-beta proxies to deployment-logstash03, copied /etc/logstash/htpasswd file

Mentioned in SAL (#wikimedia-releng) [2019-11-20T00:51:19Z] <Krenair> T238707 created old-logstash-beta proxy to point at old instance, created default Index Pattern on new logstash-beta

Mentioned in SAL (#wikimedia-releng) [2019-11-20T00:53:47Z] <Krenair> T238707 changed dateFormat (under management -> advanced settings) from 'MMMM Do YYYY, HH:mm:ss.SSS' to 'YYYY-MM-DDTHH:mm:ss', and dateFormat:tz from Browser to UTC to match old instance

Wondering what we need to do next. Do we need to copy dashboards over somehow?

Change 551946 merged by Andrew Bogott:
[operations/puppet@production] deployment-prep: Migrate to new logstash host

https://gerrit.wikimedia.org/r/551946

Thanks for working on this! Good point re: dashboards, they live in the .kibana index. If the new elasticsearch cluster has access to the old one the easiest option is probably to use the reindex api, e.g.

curl -X POST "localhost:9200/_reindex" -H 'Content-Type: application/json' -d"
  {
    \"source\": {
      \"remote\": {
        \"host\": \"http://${source}\"
      },
      \"index\": \".kibana\"
    },
    \"dest\": {
      \"index\": \".kibana\"
    }
  }
"

Re "If the new elasticsearch cluster has access to the old one": how do I tell? And how would I fix it if not?

looks like each of the logstash hosts runs its own elasticsearch cluster locally, would our source be something like deployment-logstash2.deployment-prep.eqiad.wmflabs:9200 or deployment-logstash2.deployment-prep.eqiad.wmflabs:9300 ? it seems we'd need to configure reindex.remote.whitelist somewhere too though I have no idea where

You'd be launching the reindex call on logstash3 setting logstash2:9200 as the remote source, you are correct that reindex.remote.whitelist needs to be set in elasticsearch.yml! Alternatively a dump/reload scheme would also work, e.g. with https://github.com/taskrabbit/elasticsearch-dump (never tried it though)
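The two pieces of that answer can be sketched as follows, under stated assumptions. The whitelist line in the comment assumes the usual `host:port` value format for `reindex.remote.whitelist`. The function name `reindex_body` is hypothetical. It builds the request body with a heredoc, which avoids the backslash-escaped quotes of the earlier curl example.

```shell
#!/bin/sh
# Assumed elasticsearch.yml entry on the NEW host (needs an elasticsearch restart):
#   reindex.remote.whitelist: "deployment-logstash2.deployment-prep.eqiad.wmflabs:9200"

# Hypothetical helper: print a _reindex request body that copies one index
# from a remote cluster into an index of the same name on the local cluster.
# $1: remote host:port (the HTTP port, 9200 in this thread)
# $2: index name
reindex_body() {
    cat <<EOF
{
  "source": {
    "remote": { "host": "http://$1" },
    "index": "$2"
  },
  "dest": { "index": "$2" }
}
EOF
}

# Example use on the new host (left commented out, so nothing runs by sourcing this):
#   reindex_body deployment-logstash2.deployment-prep.eqiad.wmflabs:9200 .kibana |
#   curl -X POST "localhost:9200/_reindex" -H 'Content-Type: application/json' -d @-
```

Fetching `http://deployment-logstash2.deployment-prep.eqiad.wmflabs:9200/` with curl from the new host first is one plausible way to answer the "how do I tell" question: if that returns the cluster banner, the HTTP port is reachable.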

Alright, I:

root@deployment-logstash03:~# curl -X POST "localhost:9200/_reindex" -H 'Content-Type: application/json' -d"
  {
    \"source\": {
      \"remote\": {
        \"host\": \"http://deployment-logstash2.deployment-prep.eqiad.wmflabs:9200\"
      },
      \"index\": \".kibana\"
    },
    \"dest\": {
      \"index\": \".kibana\"
    }
  }
"
{"took":841,"timed_out":false,"total":151,"updated":0,"created":151,"deleted":0,"batches":1,"version_conflicts":0,"noops":0,"retries":{"bulk":0,"search":0},"throttled_millis":0,"requests_per_second":-1.0,"throttled_until_millis":0,"failures":[]}
  • re-enabled and ran puppet on those two hosts

and some dashboards have appeared at https://logstash-beta.wmflabs.org/app/kibana#/dashboards?_g=() just like on old-logstash-beta
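Besides eyeballing the dashboards, the reindex response itself can be checked. This is a rough sketch that assumes the flat, single-line JSON shape shown in the response above and uses plain sed rather than a JSON parser. `json_int` and `reindex_ok` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: print the integer value of key $1 from flat JSON on stdin.
json_int() {
    sed -n "s/.*\"$1\":\([0-9][0-9]*\).*/\1/p"
}

# Hypothetical check: succeed only if the failures list is empty and
# every document counted in "total" was either created or updated.
reindex_ok() {
    resp=$(cat)
    total=$(printf '%s\n' "$resp" | json_int total)
    created=$(printf '%s\n' "$resp" | json_int created)
    updated=$(printf '%s\n' "$resp" | json_int updated)
    case $resp in
        *'"failures":[]'*) ;;
        *) return 1 ;;
    esac
    [ -n "$total" ] && [ "$(( ${created:-0} + ${updated:-0} ))" -eq "$total" ]
}
```

Piping the curl output into `reindex_ok` would then give a non-zero exit status when documents were skipped, for example because of version conflicts.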

Do we need to do anything else?

AFAIK if dashboards have been migrated then deployment-logstash2 should be ready to be turned off

Mentioned in SAL (#wikimedia-releng) [2019-11-26T01:02:15Z] <Krenair> Shut down deployment-logstash2 T238707

Something happened with this in T233134#5713956 though I don't really understand what is required.

*bump* I would love to see this VM deleted since it confuses cumin (T222480)

Note: other Cloud VPS projects (wikidata-query, striker, ores, phabricator) appear to also be using deployment-logstash2. Not sure if they are actually using it but those at least have hiera keys pointing to logstash2.

Shall we just wholesale point these to deployment-logstash03? Even if some turn out to be unused or broken, that's still better than sending them to a server which will soon need to be removed :-)

Likely yes, but I'm not a project admin on those projects and have not found the time or motivation to go through all of them and contact their maintainers. Ideally that would be turned into a service record instead of pointing to individual hosts, maybe something like logstash.svc.deployment-prep.eqiad1.wikimedia.cloud (which is now possible, T276624).

Change 674392 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Remove deployment-logstash2

https://gerrit.wikimedia.org/r/674392

I have created https://gerrit.wikimedia.org/r/674392 and will simply CC a few people who should be able to fix it to the patch.

Change 674392 merged by Muehlenhoff:
[operations/puppet@production] Remove deployment-logstash2

https://gerrit.wikimedia.org/r/674392

I've merged https://gerrit.wikimedia.org/r/674392 and shut down deployment-logstash2, it can be removed for good in a few days. Puppet was broken on this instance since September 2020, so if anything really still used it, it would probably be broken anyway...

Mentioned in SAL (#wikimedia-releng) [2021-03-24T07:42:30Z] <Majavah> remove deployment-logstash2 hiera from horizon, instance was shut off earlier by moritzm T238707

taavi assigned this task to 30000lightyears.
taavi removed 30000lightyears as the assignee of this task.
taavi added a subscriber: 30000lightyears.
taavi removed a subscriber: 30000lightyears.