Sister search / Cross-language search interaction with multicluster
Closed, Resolved · Public

Description

  • Search config needs to know the name of the cluster to connect to. Currently it only has the local wiki config and not the config of the wiki being searched
  • Connection class needs to report the name of the cluster it's connected to: Connection::getClusterName
  • Connection might need different methods to get read / write connections?
  • Which cluster should link counting go to? Might not matter
  • New LVS endpoints: write LVS for eqiad/codfw, and a top level auto-discovery endpoint
  • APIFeatureUsage needs to connect to the appropriate cluster: Always the large cluster.
  • CirrusSearch needs some config that enables cross-cluster prefixing. When enabled, all searches (local, sistersearch, second try, etc.) must be prefixed with their cluster name. This should be doable by having either function params or separate functions to get read and write index names.
Restricted Application added a subscriber: Aklapper. Sep 26 2018, 4:51 PM
EBjune triaged this task as Normal priority. Sep 27 2018, 5:03 PM
EBjune moved this task from Needs triage to Up Next on the Discovery-Search board.

Search config needs to know the name of the cluster to connect to. Currently it only has the local wiki config and not the config of the wiki being searched

TODO

Connection class needs to report the name of the cluster it's connected to: Connection::getClusterName

I did some testing and the assumption behind this is incorrect. The assumption was that we had to decide which indices are remote and prefix them, while knowing which are local and leaving those unprefixed. In testing (and according to the documentation), you can configure the cross-cluster service with the internode transport of its own cluster and it works fine, meaning we can simply prefix all indices with their cluster name and not worry about this.

Also the connection class already reports this.
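
A minimal sketch of the "prefix everything" approach, in Python rather than the extension's PHP. It assumes Elasticsearch's cross-cluster `cluster:index` addressing for searches, and that writes go directly to the owning cluster and therefore use the bare index name. The function names, the index names, and the assignment of wikis to clusters are all hypothetical and are not taken from the CirrusSearch patch.

```python
def read_index_name(cluster: str, index: str) -> str:
    """Name used in search requests: always prefixed, local cluster included."""
    return f"{cluster}:{index}"


def write_index_name(index: str) -> str:
    """Name used for writes: bare, on the assumption that the write is sent
    over a connection to the cluster that owns the index."""
    return index


def search_target(pairs):
    """Join (cluster, index) pairs into one comma-separated search target."""
    return ",".join(read_index_name(cluster, index) for cluster, index in pairs)


# Illustrative only: which wiki lives in which cluster is an assumption.
target = search_target([
    ("alpha", "enwiki_content"),
    ("beta", "enwiktionary_content"),
])
```

Because the local cluster is prefixed like any other, the caller never has to decide whether an index is local or remote.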

Connection might need different methods to get read / write connections?

The only place we might write to an index other than the local wiki's index is for OtherIndex updates. Since this is a singular use case, it seems sane to do something very specific to solve the problem rather than something generic like read/write splits on connections.

Which cluster should link counting go to? Might not matter

I don't remember why we were thinking about this. In general sending the link counting query to the default search cluster seems perfectly sane, and what it already does today.

New LVS endpoints: write LVS for eqiad/codfw, and a top level auto-discovery endpoint

TODO

APIFeatureUsage needs to connect to the appropriate cluster: Always the large cluster.

As long as we keep the existing cluster on 9200/9243 (and I don't see why we would do otherwise) nothing should need to change here.

EBernhardson updated the task description. Oct 1 2018, 9:41 PM
EBernhardson updated the task description. Oct 1 2018, 9:44 PM
EBernhardson moved this task from Up Next to Current work on the Discovery-Search board.
EBernhardson removed a project: Epic.

Change 464016 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/CirrusSearch@master] [WIP] Implement multi cluster/multi dc configuration

https://gerrit.wikimedia.org/r/464016

EBernhardson added a comment. Edited · Oct 11 2018, 11:24 PM

Problems outlined in the description are detailed in docs/multi_cluster.txt in the patch. As far as I'm aware, this patch implements all of the machinery necessary in CirrusSearch to deploy multi cluster.

Operationally we will still need to get the new LVS endpoints set up, and the clusters stood up on the servers.

EBernhardson added a comment. Edited · Oct 11 2018, 11:58 PM

Proposed deployment process:

Cluster names: alpha (current cluster), plus beta and gamma (two new clusters).

Initial Cirrus config.

  • This will continue to read/update from the alpha cluster, but also start mirroring to the new clusters
  • To prevent logspam the metastore should be created and all writes to the cluster should be frozen before deploying CirrusSearch config. Note that metastore.php will accept any configured cluster via --cluster, not only those available for writes.
  • This all needs to be duplicated for eqiad and codfw
  • This creates/populates all the desired indices in one go instead of staged in groups, but still stages rollout of any usage as part of a user request.
  • As long as wikis have ReplicaCluster defaulted to alpha, cross-wiki search should work for searches to eqiad or eqiad-temp.
wgCirrusSearchWriteClusters = [ eqiad, codfw, eqiad-temp, codfw-temp ]
wgCirrusSearchClusters:
  eqiad-alpha:
    replica: eqiad
    group: alpha
  eqiad-alpha-temp:
    replica: eqiad-temp
    group: alpha
  eqiad-beta:
    replica: eqiad-temp
    group: beta
  eqiad-gamma:
    replica: eqiad-temp
    group: gamma
  ...
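
One way to read the config above is that each named cluster belongs to one replica and one group, so a replica is made up of one cluster per group. The Python sketch below illustrates that reading only. The helper names are hypothetical, it does not reproduce CirrusSearch's actual resolution logic, and it mirrors only the four entries listed (the entries elided by "..." are not filled in).

```python
from collections import defaultdict

# Mirrors only the entries shown in the config above.
CLUSTERS = {
    "eqiad-alpha": {"replica": "eqiad", "group": "alpha"},
    "eqiad-alpha-temp": {"replica": "eqiad-temp", "group": "alpha"},
    "eqiad-beta": {"replica": "eqiad-temp", "group": "beta"},
    "eqiad-gamma": {"replica": "eqiad-temp", "group": "gamma"},
}


def clusters_by_replica(clusters):
    """Return {replica: {group: cluster_name}} for the given cluster config."""
    grouped = defaultdict(dict)
    for name, cfg in clusters.items():
        grouped[cfg["replica"]][cfg["group"]] = name
    return dict(grouped)


def find_cluster(clusters, replica, group):
    """Return the cluster serving `group` within `replica`, or None if absent."""
    return clusters_by_replica(clusters).get(replica, {}).get(group)
```

Under this reading, the eqiad replica contains only the alpha group, while eqiad-temp contains alpha, beta and gamma, which matches the plan of populating the split layout alongside the existing one.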

We can run the standard create/populate scripts in CirrusSearch against eqiad-temp and codfw-temp. Once all desired wikis have been populated in the eqiad-temp cluster, we can switch wgCirrusSearchDefaultCluster over to it for one or more wikis at a time, in whatever staging seems comfortable. Once all wikis have migrated we can remove eqiad-alpha-temp, merge the clusters into a single replica per dc and change DefaultCluster back to the local dc everywhere. Once they are merged the old indices in alpha can be deleted. We likely need to come up with a script for cleaning up alpha to ensure we don't accidentally delete the wrong thing.
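
The staged switch and the alpha cleanup can be sketched as two small helpers. This is an illustration in Python under stated assumptions, not the proposed script. The function names, the `migrated` and `populated` sets, and the index-to-wiki and wiki-to-group mappings are all hypothetical inputs. The cleanup helper is a dry run that only returns candidate names and deletes nothing.

```python
def default_cluster(wiki, migrated, populated, dc="eqiad"):
    """Pick the default cluster for a wiki during the staged rollout.

    Wikis not yet migrated stay on the existing replica. A migrated wiki
    must already be populated in the temp replica, otherwise we refuse.
    """
    if wiki not in migrated:
        return dc
    if wiki not in populated:
        raise ValueError(f"{wiki} is not populated in {dc}-temp yet")
    return f"{dc}-temp"


def alpha_cleanup_candidates(alpha_indices, wiki_group):
    """Dry run: list alpha indices whose wiki now lives in another group.

    alpha_indices maps index name -> wiki; wiki_group maps wiki -> group.
    Indices for wikis with no known group are kept, to err on the safe side.
    """
    return sorted(
        index
        for index, wiki in alpha_indices.items()
        if wiki_group.get(wiki) not in (None, "alpha")
    )
```

Treating unknown wikis as "keep" means a gap in the mapping can only leave stale indices behind, never remove a live one.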

Change 464016 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Implement multi cluster/multi dc configuration

https://gerrit.wikimedia.org/r/464016

debt closed this task as Resolved. Nov 2 2018, 9:59 PM