
Resolve deprecated data directory structure for elastic 6
Closed, Resolved (Public)

Description

In Elasticsearch < 5.x, data paths were laid out like this:

path.data: /srv/elasticsearch
data actually stored in: /srv/elasticsearch/<cluster_name>/

In Elasticsearch 5.x that was changed to:

data actually stored in: /srv/elasticsearch/
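
As a rough sketch of how to tell the two layouts apart on a node, the check below looks for the nodes directory in each location. The function name es_data_layout and its arguments are hypothetical, not part of Elasticsearch or our tooling.

```shell
#!/bin/sh
# Hypothetical helper: report which directory layout a data path uses.
# Usage: es_data_layout <path.data> <cluster_name>
es_data_layout() {
    data_path=$1
    cluster_name=$2
    if [ -d "$data_path/$cluster_name/nodes" ]; then
        # Data lives under <path.data>/<cluster_name>/nodes
        echo "pre-5.x"
    elif [ -d "$data_path/nodes" ]; then
        # Data lives directly under <path.data>/nodes
        echo "5.x"
    else
        echo "unknown"
    fi
}
```

For our case this would be called as: es_data_layout /srv/elasticsearch production-search-eqiad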

Elasticsearch 6 will refuse to start with the pre-5.x directory layout. During the next rolling restart, between shutdown and startup of each node, we need to run:

mv /srv/elasticsearch/production-search-eqiad/nodes /srv/elasticsearch/nodes
rmdir /srv/elasticsearch/production-search-eqiad
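
The two commands above could be wrapped with a few safety checks so they are safe to re-run during a rolling restart. This is only a sketch: the function name migrate_es_data_dir is hypothetical, and it assumes Elasticsearch is already stopped on the node when it runs.

```shell
#!/bin/sh
# Hypothetical helper; run only while Elasticsearch is stopped on the node.
# Usage: migrate_es_data_dir <path.data> <cluster_name>
migrate_es_data_dir() {
    data_path=$1
    cluster_name=$2
    old_dir="$data_path/$cluster_name"

    # Nothing to do if the pre-5.x layout is absent.
    if [ ! -d "$old_dir/nodes" ]; then
        echo "no pre-5.x layout under $old_dir, skipping"
        return 0
    fi

    # Refuse to clobber an existing top-level nodes directory.
    if [ -e "$data_path/nodes" ]; then
        echo "$data_path/nodes already exists, refusing to move" >&2
        return 1
    fi

    mv "$old_dir/nodes" "$data_path/nodes" || return 1
    # rmdir fails if anything else is left in the old directory,
    # which surfaces unexpected leftovers instead of deleting them.
    rmdir "$old_dir"
}
```

For our case this would be called as: migrate_es_data_dir /srv/elasticsearch production-search-eqiad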

Example logged deprecation warnings:
https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2018.06.19/elasticsearch/?id=AWQWYXHioOODFPKvh8wf

{
  "_index": "logstash-2018.06.19",
  "_type": "elasticsearch",
  "_id": "AWQWYXHioOODFPKvh8wf",
  "_version": 1,
  "_score": null,
  "_source": {
    "LoggerName": "org.elasticsearch.deprecation.env.NodeEnvironment",
    "source_host": "10.64.0.236",
    "level": "WARNING",
    "Severity": "WARN",
    "Time": "2018-06-19 04:50:42,0001",
    "type": "elasticsearch",
    "message": "ES has detected the [path.data] folder using the cluster name as a folder [/srv/elasticsearch], Elasticsearch 6.0 will not allow the cluster name as a folder within the data path",
    "normalized_message": "ES has detected the [path.data] folder using the cluster name as a folder [/srv/elasticsearch], Elasticsearch 6.0 will not allow the cluster name as a folder within the data path",
    "SourceMethodName": "deprecated",
    "Thread": "main",
    "tags": [
      "es",
      "gelf",
      "normalized_message_untrimmed"
    ],
    "@timestamp": "2018-06-19T04:50:42.141Z",
    "SourceSimpleClassName": "DeprecationLogger",
    "host": "elastic1035",
    "@version": "1",
    "SourceClassName": "org.elasticsearch.common.logging.DeprecationLogger",
    "gelf_level": "4",
    "SourceLineNumber": 292,
    "timestamp": "1529383842.001"
  },
  "fields": {
    "@timestamp": [
      1529383842141
    ]
  },
  "sort": [
    1529383842141
  ]
}

References:
https://github.com/elastic/elasticsearch/pull/18554
https://github.com/elastic/elasticsearch/issues/20391

Event Timeline

An alternative: instead of moving the data, we could change path.data to /srv/elasticsearch/production-search-eqiad. This would simplify the multi-instance work, which could then use /srv/elasticsearch/<cluster name> as the path. Elasticsearch stopped doing this out of concern that non-ASCII characters in cluster names could cause filesystem issues, but I think we can trust ourselves to use only ASCII in cluster names.

It looks like the servers have auto-magically removed the extra directory since this was filed. https://github.com/elastic/elasticsearch/pull/18554 said this was supposed to happen automatically, but it hadn't happened on our clusters when I filed the ticket. I'm not really sure what finally triggered the magic, since we haven't upgraded Elasticsearch versions since this ticket was filed.