
Discuss production Elasticsearch needs with the Search team
Closed, Resolved · Public

Description

The Search team maintains Elasticsearch clusters for use by MediaWiki in the Wikimedia production environment. Toolhub's Elasticsearch needs are expected to be very small compared to the shard/index needs of the Wikimedia production wikis. Ideally we can negotiate space in their clusters for the index that Toolhub will need to perform faceted search.

Event Timeline

bd808 triaged this task as Medium priority. Jan 6 2021, 10:55 PM
bd808 moved this task from Backlog to Research needed on the Toolhub board.

@srishakatux and I met briefly today with @Gehel, @CBogen, and @MPhamWMF who were interested in hearing a bit more about our faceted search implementation.

We also talked very briefly about the production needs for Toolhub's Elasticsearch index. @Gehel mentioned that they have been thinking about splitting out a separate cluster for things like Toolhub and Phabricator that have Elasticsearch needs, with the intent of simplifying maintenance of the CirrusSearch indexes by separating them from other mixed-use services. We need to discuss this more and figure out the who/what/when details.

@Gehel and I had another chat about the high level needs on 2021-05-12. This was prompted by my addition of Search as a cross team dependency for Toolhub in the Foundation's fiscal year 2021/2022 planning process. We refreshed each other's knowledge of the current state. A more detailed technical sync will still be needed prior to launch.

@Gehel it is time for that "more detailed technical sync prior to launch" I think. There are still lots of little things I'm trying to button up, but our fingers-crossed hope is that we will be able to deploy into prod in about two weeks.

My (incomplete, but hopefully helpful) notes from a 2021-08-04 meeting with @Gehel and @RKemper:

  • No write controls for the cluster from inside the prod network, so no special accounts are needed
  • Document this new snowflake index so the Search team remembers it exists
  • No cross-DC replication for the Elasticsearch clusters
    • The app will need to either write to both DCs or rebuild the index in the off-DC before cutover
  • OK to ignore the stoppable-writes description, but the app will need to handle failed writes during maintenance
  • Name the index toolhub-<something> so that it will be easier to understand where the index comes from in the future

Change 710110 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[wikimedia/toolhub@main] search: rename toolinfo search index

https://gerrit.wikimedia.org/r/710110

Change 710110 merged by jenkins-bot:

[wikimedia/toolhub@main] search: rename toolinfo search index

https://gerrit.wikimedia.org/r/710110

I created initial empty indices in eqiad and codfw from mwmaint1002:

$ curl -X PUT -H 'Content-Type: application/json' -d @toolhub_tools.index.json search.svc.codfw.wmnet:9200/toolhub_tools
$ curl -X PUT -H 'Content-Type: application/json' -d @toolhub_tools.index.json search.svc.eqiad.wmnet:9200/toolhub_tools

I'm now wondering if I did this correctly by sending to port 9200. I actually need the indices to be in the "chi" clusters. I think the actions I took could make them land on any of the 3 clusters. :/

After digging around in Puppet, I think I did the right thing. hieradata/role/eqiad/elasticsearch/cirrus.yaml shows that chi == port 9200, omega == port 9400, and psi == port 9600. I didn't see this obviously documented on wikitech.
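For future reference, the port-to-cluster mapping recovered from Puppet can be written down as a tiny helper, and the root endpoint of any Elasticsearch node reports which cluster it belongs to, so the index placement can be double-checked directly. The helper and hostname are illustrative, not anything deployed.

```shell
# Mapping per hieradata/role/eqiad/elasticsearch/cirrus.yaml:
# chi == 9200, omega == 9400, psi == 9600
cluster_for_port() {
  case "$1" in
    9200) echo chi ;;
    9400) echo omega ;;
    9600) echo psi ;;
    *) echo unknown; return 1 ;;
  esac
}

# To confirm where a request lands, GET / on the node returns the cluster
# name, e.g. from a prod maintenance host:
#   curl -s search.svc.eqiad.wmnet:9200/ | grep '"cluster_name"'
```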

Indexing and search are working from inside the eqiad Kubernetes cluster!