Create docker elasticsearch images with wmf search plugins
Closed, Resolved · Public · 3 Estimated Story Points

Description

As a maintainer of CirrusSearch I want to have access to a docker image running elasticsearch with the wmf search plugins installed so that I can develop with all the features available on our production wikis.

As of today, an engineer working on a docker-compose dev environment only has access to the official elasticsearch images. Some important features, such as regex, highlighting, and language analyzers, require installing the set of plugins we use in production (Debian package named wmf-elasticsearch-search-plugins).
The image hosted at https://gitlab.wikimedia.org/repos/releng/dev-images/-/tree/main/dockerfiles/elasticsearch should probably be adapted to install the contents of this Debian package.

AC:

Details

Reference | Source Branch | Dest Branch | Author | Title
repos/releng/dev-images!15 | work/ebernhardson/elastic-plugins-710 | main | ebernhardson | elasticsearch: Update to 7.10.2
repos/releng/dev-images!14 | work/ebernhardson/elastic-plugins-68 | main | ebernhardson | elasticsearch: Update to 6.8.23
repos/releng/dev-images!12 | work/ebernhardson/elastic-plugins | main | ebernhardson | elasticsearch: install plugins from production packaging

Event Timeline

Restricted Application added a subscriber: Aklapper.

There is already an image in releng/dev-images repo created for T231864, fwiw, but that is only for 6.5.4 so far.

Thanks, I forgot about this one! I changed the description accordingly to make sure we reuse it.

Merge Request submitted for the first part, migrating to the production package and parameterizing the installation: https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/12

For the second half of this ticket, releasing an image for each of 6.5.4, 6.8.23, and 7.10.2, I wonder what the best approach to versioning these images is:

  • Current strategy: increment an independent version, currently at 0.1.0. The versioning implies we will follow semver, although semver doesn't imply much about compatibility in 0.x.
  • Upstream strategy: version the image based on what's included. We would need to include the plugins version as well, giving something like 6.5.4-7 as a version number. But then if the container build needs to change without a matching plugin package release, we need a third number, giving 6.5.4-7-1 or some such.
  • Some other scheme?

In terms of discoverability and ease of use, I think using the upstream version number has a lot of benefits. It's a bit awkward though; is it worth it?

FWIW, Quibble docker images are following the {software-version}-{series-release} paradigm in the integration/config repo. So e.g. for quibble version 1.4.0, the first image is built as 1.4.0-s0, and if there are other changes made to the image, but quibble itself is still at 1.4.0, then the image is bumped to 1.4.0-s1. That seems to more or less work out OK from my perspective, but I'm cc'ing @hashar and @Jdforrester-WMF in case they have other ideas for you.
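For illustration only, the {software-version}-{series-release} scheme described above could be sketched in Python as follows. The helper names parse_tag and next_tag are hypothetical and are not part of Quibble, integration/config, or dev-images; the sketch only assumes that tags look like 1.4.0-s0.

```python
import re
from typing import Tuple

# Assumed tag layout: "<software-version>-s<series>", e.g. "1.4.0-s0".
# This pattern is an illustration, not a format enforced by any WMF tooling.
TAG_RE = re.compile(r"^(?P<version>\d+(?:\.\d+)*)-s(?P<series>\d+)$")


def parse_tag(tag: str) -> Tuple[str, int]:
    """Split an image tag into (software version, series release number)."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return match.group("version"), int(match.group("series"))


def next_tag(current_tag: str, software_version: str) -> str:
    """Return the tag for the next image build.

    If the packaged software version is unchanged, only the series number is
    bumped; if the software version changed, the series restarts at s0.
    """
    version, series = parse_tag(current_tag)
    if version != software_version:
        return f"{software_version}-s0"
    return f"{software_version}-s{series + 1}"


if __name__ == "__main__":
    # Image-only change while the software stays at 1.4.0.
    print(next_tag("1.4.0-s0", "1.4.0"))
    # Software version changes, so the series restarts.
    print(next_tag("6.5.4-s3", "6.8.23"))
```

With this layout the upstream version stays visible in the tag, while image-only rebuilds remain distinguishable by the series suffix.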

I would leave the build version of the plugins out and increment the image build number for both plugin version upgrades and image-related changes: 6.5.4-1, or 6.5.4-s1 as @kostajh suggests. To find out which version of the plugins is used, one would have to read the changelogs.
Regarding the name of the image itself, I wonder whether we should change it. It's currently stretch-elasticsearch (named after stretch, while I believe it's based on alpine). Should we be more specific about the use case of this image (mainly used for CirrusSearch) with cirrus-elasticsearch or elasticsearch-cirrus?

This all sounds reasonable to me. I've updated the current patch to rename the image to cirrus-elasticsearch and set the version number to 6.5.4-s0. I made patches for the 6.8 and 7.10 images, but my current understanding of GitLab is that these will have to be submitted as merge requests one at a time, each once the previous one is merged and deployed.

Who should we poke about the merge/deployment part? I'm not entirely certain, but it looks like we need someone in the contint-docker admin group, which is mostly releng.

Pinging releng for help on how to proceed with the GitLab MR and the deployment of the images to the docker repo.

Mentioned in SAL (#wikimedia-releng) [2022-04-12T21:37:09Z] <brennen> Updating dev-images docker-pkg files on primary contint for apache & elasticsearch changes (T304290, T305143)

thcipriani added a subscriber: thcipriani.

Images published, let us know if anything else is needed from our side! Thanks!