⬆️🗣️Default language code (?) broke search
Open, Needs Triage | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

On https://memory-prime.wikibase.cloud/ type 'gashtor' in the search bar.
The search bar does not provide suggestions, even though 'Gashtor' item exists.
The search bar does suggest the item when the query starts with a capital 'G'.

The same case-sensitive behavior occurs when creating a statement: for example, entering 'wiki' in the property search bar doesn't suggest 'Wikidata ID', but 'Wiki' does.

On https://anton12.wikibase.cloud/, entering 'my city' into the search bar only offers items where 'my city' is in the English label; the mul labels are ignored.
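
To make the expected behavior concrete, here is a minimal sketch of the matching we would like to see: a case-insensitive prefix match against the label in the user's language and against the mul label. It only illustrates the desired outcome and is not how CirrusSearch implements suggestions; the item IDs, labels, and the `suggest` function are hypothetical.

```python
from typing import Dict, List

# Hypothetical item store: item id -> {language code -> label}.
ITEMS: Dict[str, Dict[str, str]] = {
    "Q1": {"en": "Gashtor"},
    "Q2": {"mul": "My City"},  # only has a mul label
    "Q3": {"en": "Other town"},
}


def suggest(query: str, user_lang: str, items: Dict[str, Dict[str, str]] = ITEMS) -> List[str]:
    """Return ids of items whose label in user_lang or mul starts with query, ignoring case."""
    needle = query.casefold()
    hits: List[str] = []
    for item_id, labels in items.items():
        for lang in (user_lang, "mul"):
            label = labels.get(lang)
            if label is not None and label.casefold().startswith(needle):
                hits.append(item_id)
                break  # do not list the same item twice
    return hits
```

With this behavior, both 'gashtor' and 'Gashtor' would find the first item, and 'my city' would find the item that only has a mul label.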


How search currently works:
Here are some notes on what the team remembers about how Elasticsearch works on Wikibase Cloud.

  • MediaWiki defines some ES field mapping (a bit like a DB schema)
  • We don't really care about the ES field mapping, but we have to because of our hack of using shared indexes
  • All wikis use the same field mapping which means we can't update the field mapping one wiki at a time
  • CirrusSearch creates the first field mapping that all wikis alias to
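
As a rough illustration of the shared-index setup described above, the sketch below builds a request body for the Elasticsearch `_aliases` API that points per-wiki aliases at one shared index. The alias naming scheme, the `wiki` filter field, and the use of routing are assumptions made for illustration; the real values live in the ElasticSearchAliasInit job and may differ.

```python
import json
from typing import Dict, List


def build_alias_actions(wiki_id: str, shared_index: str, alias_suffixes: List[str]) -> Dict:
    """Build an _aliases request body that maps per-wiki aliases onto a shared index.

    The alias names and the "wiki" filter field are hypothetical.
    """
    actions = []
    for suffix in alias_suffixes:
        actions.append({
            "add": {
                "index": shared_index,
                "alias": f"{wiki_id}_{suffix}",
                # Keep each wiki's documents on a predictable shard.
                "routing": wiki_id,
                # Only expose this wiki's documents through the alias.
                "filter": {"term": {"wiki": wiki_id}},
            }
        })
    return {"actions": actions}


if __name__ == "__main__":
    body = build_alias_actions("examplewiki", "shared_index_first", ["content", "general"])
    print(json.dumps(body, indent=2))
```

Because every alias resolves to the same physical index, a field mapping change affects all wikis at once, which is why the mapping cannot be rolled out one wiki at a time.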

Dev Notes:
tl;dr: Elasticsearch doesn't work for mul labels and aliases

  • The ES field mapping needs to be replaced with a newer version for searching on mul labels and aliases to work.
    • CirrusSearch (or Elastica) knows what this field mapping should be via extension points that extensions like WikibaseCirrusSearch can hook into to define what can be searched.
    • There are jobs to create this field mapping that we can manually trigger (see https://github.com/wmde/wbaas-deploy/blob/main/doc/search.md).
  • Once we have updated the field mapping we will need to re-index all the Wikis.
  • We could version the indexes and write to both new and old indexes until we have backfilled all data for all wikis into the new indexes.
    • This will require us to modify our WBC aliasing logic
  • forceSearchIndexFromTo.yaml is a k8s job that runs the CirrusSearch/maintenance/ForceSearchIndex.php MediaWiki job.
  • elasticSearchInitJob.yaml is a k8s job that runs the CirrusSearch/maintenance/UpdateSearchIndexConfig.php MediaWiki job.
    • this is possibly outdated
    • doing this in a k8s job is "nicer" in that it won't get terminated if the mediawiki deployment is and the logs are easier to access
  • ApiWbStackElasticSearchInit.php is an API module that runs CirrusSearch/maintenance/UpdateSearchIndexConfig.php
    • this was modified more recently than elasticSearchInitJob.yaml
    • this runs in a mediawiki pod rather than as a k8s job (which means it will also work in the docker env where k8s jobs can't run)
  • Q: how long would it take to "just" re-index everything?
    • We tried searching for any info from the last time we did an ES re-index but couldn't find any useful durations.
    • Tom's guess is a week at most; depends on amount of data and speed of machines etc.
    • The engineers would like to avoid creating any extra pressure on ourselves by "just" re-indexing. We would prefer to version the indexes.
  • Q: should we spin up a separate ES cluster with the new index so that we don't need to update our existing MediaWiki jobs for (re-)indexing?
    • we wouldn't be able to benefit from a partial re-index
    • we think we will have to do a re-index regardless
    • we haven't thought about having different prefixes in the same cluster before
  • Q: if we spin up a separate ES cluster should we also move to OpenSearch?
    • if it doesn't add too much complexity (i.e. don't do it if it requires major effort)
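
The "version the indexes" idea from the notes above can be sketched as follows: new writes go to both the old and the new index, existing data is backfilled into the new index, and reads switch over only once the new index holds everything. This is a toy in-memory model under stated assumptions, not our MediaWiki or CirrusSearch code; the class and method names are hypothetical, and plain dicts stand in for the indexes and for the source of truth used during re-indexing.

```python
from typing import Dict, Optional


class VersionedIndexes:
    """Toy model of dual-writing to an old and a new index during a migration."""

    def __init__(self) -> None:
        self.old: Dict[str, str] = {}
        self.new: Dict[str, str] = {}
        self.read_from_new = False

    def write(self, doc_id: str, doc: str) -> None:
        # During the migration every write goes to both indexes.
        self.old[doc_id] = doc
        self.new[doc_id] = doc

    def backfill(self, source: Dict[str, str]) -> int:
        """Re-index documents from the source of truth into the new index.

        Documents already present in the new index arrived via dual writes
        and are at least as fresh, so they are skipped.
        """
        added = 0
        for doc_id, doc in source.items():
            if doc_id not in self.new:
                self.new[doc_id] = doc
                added += 1
        return added

    def cut_over(self) -> bool:
        # Only switch reads once the new index covers everything in the old one.
        if set(self.old) <= set(self.new):
            self.read_from_new = True
        return self.read_from_new

    def read(self, doc_id: str) -> Optional[str]:
        index = self.new if self.read_from_new else self.old
        return index.get(doc_id)
```

The point of the design is that search keeps working from the old index for the whole duration of the backfill, so the re-index does not have to finish under time pressure.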

Task Breakdown

(Most of these steps need to be done in order; Step 3 can be done before steps 1 and 2)

  1. T416155: 🗣️Create new ES cluster in all k8s environments with the name elasticsearch-3 using existing helm chart and container images (updating to new version of Elasticsearch or OpenSearch is out of scope)
    • we might need to temporarily add more resources to staging and/or production to fit this new ES cluster
      • our existing cluster requests 3 instances of master nodes @ 15m CPU and 8Gi RAM and 2 instances of data nodes (replicas) @ 100m CPU and 18Gi RAM (see production/elasticsearch-2.values.yaml.gotmpl); if these figures are per instance, that is 245m CPU and 60Gi RAM in total; suggest we use the same for the new cluster
    • it is borderline whether we need more resources; we decided to spin up another two nodes for this migration
    • two PRs per change, one for staging+local and one for production
  2. T416156: 🗣️Create shared indexes on new cluster using elasticSearchInitJob.sh following the instructions from search.md#shared-index-creation
    • Make sure that the MW_WRITE_ONLY_ELASTICSEARCH_HOST is the correct ES host
    • Make sure that the CLUSTER_NAME is set to write-only
  3. T416157: 🗣️Update the Platform API so it also creates aliases for the shared indexes on new cluster for newly created Wikis. This functionality doesn't yet exist and will need to be added.
    • When we move to OpenSearch we also want this functionality.
    • the ElasticSearchAliasInit Laravel job needs to take a Job Parameter (see ElasticSearchAliasInit.php#L20-L21) to specify the ES cluster host (domain name and the port but not the scheme e.g. elasticsearch-2.default.svc.cluster.local:9200)
      • the new Job Parameter should be the 2nd required parameter
      • existing calls to this Job will need to be updated to specify this new job parameter
    • In the WikiController we need to call ElasticSearchAliasInit twice
    • Use the elasticsearch_hosts variable and remove elasticsearch_cluster_without_shared_index and elasticsearch_shared_index_host https://github.com/wbstack/api/blob/b790ee100e11d78984a9b6d1f02b8377f0ce8a54/config/wbstack.php#L22
  4. T416158: 🗣️Create aliases for the shared indexes on new cluster for all existing wikis using the ElasticSearchAliasInit Laravel job - see search.md#manually-b
  5. T416177: 🗣️Set the new cluster to be written to by MediaWiki by setting writeOnlyElasticsearch.host - see chart value
  6. T416178: 🗣️Re-index all the existing data into the new Elasticsearch cluster using the forceSearchIndexFromTo.sh script for each wiki. Note: the script defaults to running against all clusters - we should specify ONLY the new cluster.
    • Place the domains in a file and iterate over them in a shell script loop.
  7. T416181: 🗣️Set the new cluster to be read/writeable in MediaWiki and the old cluster to write-only. Swap the settings so that writeOnlyElasticsearch.host points to elasticsearch-2 and elasticsearch.host points to elasticsearch-3.
  8. T416182: 🗣️Decommission `elasticsearch-2`. Test that everything works, remove any settings referencing elasticsearch-2 and finally remove the elasticsearch-2 cluster
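
For step 6, the "domains in a file" loop could look roughly like the sketch below. It reads one domain per line and calls a per-wiki re-index function, collecting failures so they can be retried. How forceSearchIndexFromTo.sh is actually invoked (and how the new cluster is selected) is not shown here; `reindex_one` is a hypothetical callback that the operator would implement, and `read_domains` and `reindex_all` are illustrative names.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def read_domains(path: str) -> List[str]:
    """Read one wiki domain per line, skipping blank lines and '#' comments."""
    domains: List[str] = []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            domains.append(line)
    return domains


def reindex_all(domains: Iterable[str], reindex_one: Callable[[str], None]) -> List[str]:
    """Call reindex_one for every domain and return the domains that failed.

    reindex_one is a placeholder for whatever triggers the re-index of a
    single wiki against ONLY the new cluster.
    """
    failed: List[str] = []
    for domain in domains:
        try:
            reindex_one(domain)
        except Exception as exc:  # keep going so one bad wiki does not stop the run
            print(f"re-index failed for {domain}: {exc}")
            failed.append(domain)
    return failed
```

Returning the failed domains rather than aborting means the run can be repeated with just that list once the cause is understood.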

Level of Effort (t-shirt size): Large


Once we have resolved the bug, we should make sure we have documented how we have configured Elasticsearch and how to update indexes while it is still fresh in our heads.

Event Timeline

I think the reason why it's not finding this particular item is that the CirrusSearch index needs to be recreated to take into account the new mul language field. The problem is less visible during fulltext search as the term John Doe II is present in the main text field but this kind of search might still be affected because matches on the mul field might not be properly boosted.

At first glance, it looks like we'll need to re-index all of our sites.

Tarrow changed the task status from Open to Stalled. Fri, Jan 23, 3:37 PM
Tarrow subscribed.

seems likely. I scheduled a 30min sync for us to talk about how we want to approach this task on Monday.

Ollie.Shotton_WMDE renamed this task from ⬆️Default language code (?) broke search to ⬆️🗣️Default language code (?) broke search. Mon, Feb 2, 11:31 AM