
Searching on Special:Search and MediaSearch on Commons returns error
Closed, Resolved · Public · BUG REPORT

Description

List of steps to reproduce (step by step, including full links if applicable):

  • Search anything in Special:MediaSearch

What happens?:
"Invalid search

Enter a new search above and try again"

What should have happened instead?:
Query returns results

Software version (if not a Wikimedia wiki), browser information, screenshots, other information, etc:


List of steps to reproduce (step by step, including full links if applicable):

Go to https://commons.wikimedia.org/w/index.php?search=London&title=Special%3ASearch&profile=advanced&fulltext=1&ns6=1

What happens?:
"An error has occurred while searching: We could not complete your search due to a temporary problem. Please try again later. "

What should have happened instead?:
Query returns files

Event Timeline

Restricted Application added a subscriber: Aklapper.
Dylsss updated the task description.

Note, it seems searching for files on Commons is currently impossible. I've created T295480 for that.

CBogen renamed this task from MediaSearch returns "Invalid search" for any query to Searching on Special:Search and MediaSearch on Commons returns error. · Wed, Nov 10, 3:38 PM
CBogen added projects: Discovery-Search, Commons.
CBogen updated the task description.
CBogen added a subscriber: Ladsgroup.
Ladsgroup triaged this task as Unbreak Now! priority. · Wed, Nov 10, 3:45 PM

It looks UBN!

Notes from irc:

  • commonswiki_file is empty and has no alias
  • moving traffic to the backup cluster
dcausse lowered the priority of this task from Unbreak Now! to High. · Wed, Nov 10, 4:37 PM
dcausse added subscribers: EBernhardson, dcausse.

@EBernhardson switched traffic to codfw. Search is functional again. Lowering to High; we will have to recover the data before switching back to eqiad.

After doing some testing, I have a rough recovery plan:

  1. Deploy elasticsearch-repository-swift plugin to eqiad and codfw clusters
  2. Configure both clusters to connect to ms-fe.svc.eqiad.wmnet (swift)
  3. Snapshot the existing commonswiki_file index from the codfw cluster to swift, take note of start time
  4. Restore the snapshot from swift to the eqiad cluster.
  5. Run CirrusSearch downtime catchup procedure against eqiad for the period between starting restore and the cluster no longer failing writes to the commonswiki index.
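Steps 2 to 4 of the plan above map onto the standard Elasticsearch snapshot REST endpoints. The following is a minimal sketch that only builds the requests and sends nothing. The repository name, snapshot name, and the contents of `swift_settings` are hypothetical, and the repository type string `"swift"` is an assumption about how the plugin registers itself. The index name is a placeholder for whatever concrete index backs commonswiki_file.

```python
def recovery_plan(repo, snapshot, index, swift_settings):
    """Build (not send) the REST calls for steps 2-4 of the plan.

    repo, snapshot and swift_settings are hypothetical placeholders;
    the "swift" repository type is an assumption about the plugin.
    """
    register_body = {"type": "swift", "settings": swift_settings}
    steps = []
    # Step 2: register the same swift-backed repository on both clusters.
    for cluster in ("codfw", "eqiad"):
        steps.append({
            "cluster": cluster,
            "method": "PUT",
            "path": f"/_snapshot/{repo}",
            "body": register_body,
        })
    # Step 3: snapshot the index from codfw (note the start time separately).
    steps.append({
        "cluster": "codfw",
        "method": "PUT",
        "path": f"/_snapshot/{repo}/{snapshot}",
        "body": {"indices": index, "include_global_state": False},
    })
    # Step 4: restore that snapshot into eqiad.
    steps.append({
        "cluster": "eqiad",
        "method": "POST",
        "path": f"/_snapshot/{repo}/{snapshot}/_restore",
        "body": {"indices": index, "include_global_state": False},
    })
    return steps


if __name__ == "__main__":
    for step in recovery_plan("commons_recovery", "snap1", "commonswiki_file", {}):
        print(step["cluster"], step["method"], step["path"])
```

Steps 1 and 5 (plugin deployment and the CirrusSearch catchup) are operational procedures and are not represented here.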

Some related notes:

  • elasticsearch-repository-swift was never released for 6.5.4, so I took the last commit targeting 6.6.0 and compiled it against 6.5.4 (set elasticsearchVersion = 6.5.4, and change gradle from 5 to 4.1). What process should we follow to include this in the plugins .deb since we are no longer the upstream here?
  • Should we have a separate auth setup in swift for cirrussearch snapshots?
  • By default snapshot backup/restore is limited to 20 MB/s per partition. Since commonswiki has 32 partitions, the cluster will limit itself to 640 MB/s, or just over 5 gigabits/s. I suspect this is excessive for the swift cluster; it would at least more than double the typical network traffic. What would a more appropriate limit be? @fgiunchedi
  • After or during restore of the snapshot we likely need to manually assign the commonswiki_file and commonswiki aliases to it.
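The throughput figure and the alias step in the notes above can be checked with a short sketch. The helper names are hypothetical, and the 1 gigabit/s budget in the usage is purely illustrative, not a value proposed in this task. The alias body follows the shape of the Elasticsearch `_aliases` endpoint, and the index name passed in is a placeholder for the restored index.

```python
def aggregate_rate(per_partition_mb_s, partitions):
    """Return the cluster-wide rate as (MB/s, gigabits/s)."""
    mb_s = per_partition_mb_s * partitions
    gbit_s = mb_s * 8 / 1000  # 8 bits per byte, 1000 megabits per gigabit
    return mb_s, gbit_s


def per_partition_limit(target_gbit_s, partitions):
    """Per-partition MB/s that keeps the total at target_gbit_s."""
    return target_gbit_s * 1000 / 8 / partitions


def alias_actions(index, aliases):
    """Body for POST /_aliases that points each alias at the index."""
    return {"actions": [{"add": {"index": index, "alias": a}} for a in aliases]}


if __name__ == "__main__":
    # Default from the note: 20 MB/s across 32 partitions.
    print(aggregate_rate(20, 32))        # (640, 5.12)
    # Illustrative only: what a 1 gigabit/s budget would allow per partition.
    print(per_partition_limit(1, 32))    # 3.90625
    print(alias_actions("restored_index", ["commonswiki_file", "commonswiki"]))
```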

Closing this ticket, as this specific problem has been resolved. New ticket for remaining work: T295705

Mentioned in SAL (#wikimedia-operations) [2021-11-23T17:35:41Z] <ebernhardson> T295478 start snapshot of commonswiki_file from cirrus codfw -> swift eqiad