We need to test if we can upgrade the elasticsearch cluster without updating the rest of the supporting logging infrastructure. There is a role in vagrant which should allow us to do the testing, although it may be somewhat out of date and require a few tweaks.
- T137400 Logstash elasticsearch mapping does not allow err.code to be a string (Resolved, bd808)
- T138608 Vagrant role provisions wrong elasticsearch version (Resolved, dcausse)
- T136001 [EPIC] Upgrade elasticsearch cluster supporting logging to 2.3 (Resolved, debt)
- T136003 Upgrade logstash*.eqiad.wmnet elasticsearch servers to 2.3 (Resolved, EBernhardson)
- T136002 Test logstash 1.5.3 and kibana 3.1.2 against elasticsearch 2.3 (Resolved, EBernhardson)
Manually upgrading the beta cluster or resurrecting the logstash Labs project may be easier. The role::logstash vagrant role does not include all of the Logstash configuration and plugin usage from production.
I haven't started testing yet, but per the elastic support matrix:
- Kibana 3.1.x is only supported against elasticsearch 0.9.0 through 1.7.x. To match the support matrix we would need to upgrade kibana to 4.2.0+.
- All Logstash versions are reported as compatible with all versions of elasticsearch 1.0.0+, though this will need testing to verify for our case.
In the logstash mapping for the "tags" field we set index_name to "tag". The index_name property for mappings no longer exists in 2.x and needs to be removed. It looks to be used strictly for convenience, so it can simply be dropped, though we may need to check that stored queries aren't using this shorthand. It may also cause issues for migration, since old indices containing the property might not be openable. Will have to test.
It looks like the "path" property was removed at the same time. This is used by the "geoip" field. I'm not sure, but I think it can simply be dropped as well.
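Cleaning the old mapping template could be scripted. A minimal sketch, assuming a 1.x-era mapping shaped like the illustrative sample below (not our production template), and assuming no field is itself literally named "index_name" or "path":

```python
# Strip mapping properties removed in Elasticsearch 2.x ("index_name"
# and "path") from a 1.x-era mapping dict before reusing it.

def strip_removed_properties(node):
    """Recursively drop "index_name" and "path" keys from a mapping."""
    if isinstance(node, dict):
        return {
            key: strip_removed_properties(value)
            for key, value in node.items()
            if key not in ("index_name", "path")
        }
    if isinstance(node, list):
        return [strip_removed_properties(item) for item in node]
    return node

# Illustrative mapping fragment, not the real logstash template.
old_mapping = {
    "properties": {
        "tags": {"type": "string", "index_name": "tag"},
        "geoip": {
            "type": "object",
            "path": "full",
            "properties": {"location": {"type": "geo_point"}},
        },
    }
}

new_mapping = strip_removed_properties(old_mapping)
```

Note the caveat in the lead-in: a recursive key filter like this would also eat a real field named "path" under "properties", so the actual template would need a quick eyeball first.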
Some further digging turned up the elasticsearch issue about it: https://github.com/elastic/elasticsearch/issues/6677
Based on this, migration shouldn't be a problem. ES 2.x can open an index containing these properties, but it will refuse to create new ones with them. As such the upgrade from 1.7 -> 2.x should (in theory, need to test) go smoothly.
There are a few issues that would need to be addressed to upgrade to a modern version of Kibana:
- Kibana 3 (what we run now) is a static web application. Kibana 4+ is a nodejs service. The Puppet deploy tooling would need to be changed to deal with this.
- Changing deploy tooling would probably also mean migrating from trebuchet to scap3 for deployments.
- Kibana 4 had a long-standing bug related to timezones in the display (times were always shown in browser local time). I think that has been fixed now, but there is still a bug with timezones affecting the date picker. To be fair, I think a similar bug exists with the date picker in the Kibana 3 branch that we run.
- Kibana 4 stores dashboards in a new index and format with no migration tools, so all existing dashboards would need to be rebuilt manually.
Most of what Kibana3 does is based on facets, which Elasticsearch 2.x replaced with the aggregations API. It might be possible to patch Kibana3 to use aggregations instead of facets when generating Elasticsearch API requests. I haven't looked into whether that would be less work than dealing with all of the other changes.
Just noticed the facets thing now that I have data flowing into logstash. Indeed, any request that uses facets simply fails because elasticsearch no longer knows what to do with them. I'll poke through the code to see what it's doing there and whether it's reasonable to fix, or whether it gets quite complicated.
It looks like supporting aggregations in kibana 3 would require updating the elasticjs dependency from 1.1.1 to 1.2.0, or possibly backporting the aggregations directory from 1.2.0 to 1.1.1. I'm going to spend a day on this tomorrow and see how feasible it is. If it looks to be a ton of work I'll start looking into the ideas around upgrading to kibana 4.
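For the simplest case the facet-to-aggregation translation is mechanical. A rough sketch of what the patched request generation would have to do, handling only the plain terms facet (real Kibana3 panels use several more facet variants):

```python
# Translate a Kibana3-style terms facet request body into the
# equivalent Elasticsearch 2.x terms aggregation body.

def terms_facet_to_agg(facets):
    """Convert {"name": {"terms": {...}}} facets into an "aggs" body."""
    aggs = {}
    for name, facet in facets.items():
        terms = dict(facet["terms"])
        # Facets expressed ordering as "order": "count"; aggregations
        # express the same thing as {"_count": "desc"}.
        if terms.pop("order", None) == "count":
            terms["order"] = {"_count": "desc"}
        aggs[name] = {"terms": terms}
    return {"aggs": aggs}

# Illustrative facet body, roughly what a Kibana3 terms panel emits.
facet_body = {"terms": {"terms": {"field": "type", "size": 10, "order": "count"}}}
agg_body = terms_facet_to_agg(facet_body)
```

The response shapes differ too (facet "terms" entries vs aggregation "buckets"), so the rendering side would need an analogous translation.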
TBH we might just want to go to 4 anyways, we plan to be upgrading to elasticsearch 5 in the Jan - Mar timeframe, so making old kibana work with new elasticsearch might only buy us a few months before we have to do a major upgrade anyways.
That's reasonable. Kibana and Logstash updates haven't happened frequently, mostly due to lack of resources to work on them. The change in requirements for Kibana4 and the timezone display issue made me less excited about spending my volunteer hours on it as long as things were otherwise operational. Once we get over the hurdle of making things work with the new node service it should be relatively easy to keep Kibana updated again.
Even rebuilding the dashboards shouldn't be a horrible task. Most of the dashboards that are well advertised follow a common template. We might be able to figure out how to script a migration for our specific uses once we know what changes are needed. At some point in the distant past I played with Kibana4 by running it locally on my laptop with an ssh tunnel to get to the Elasticsearch cluster. I can make some time to try to set that up again and figure out what the dashboard config changes might look like.
I've created a variety of tasks around upgrading to Kibana 4. I'm currently hopeful we can avoid upgrading logstash itself. We might need to backport https://www.elastic.co/guide/en/logstash/current/plugins-filters-de_dot.html but otherwise it looks like everything should "just work".
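For context on why de_dot matters: Elasticsearch 2.x rejects field names containing dots (the T137400 err.code problem above), so dotted keys have to be renamed before indexing. A minimal sketch of the transformation the filter performs; the underscore separator and the sample event are my own illustrative choices:

```python
# Rename dotted field names in an event (e.g. "err.code" -> "err_code")
# before indexing, since Elasticsearch 2.x rejects dots in field names.

def de_dot(event, separator="_"):
    """Return a copy of the event with dots in field names replaced."""
    out = {}
    for key, value in event.items():
        if isinstance(value, dict):
            value = de_dot(value, separator)
        out[key.replace(".", separator)] = value
    return out

event = {"err.code": "ENOENT", "host": "logstash1001"}
clean = de_dot(event)
```

The real filter runs inside the logstash pipeline, of course; this just shows the shape of the rewrite it applies.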