Rolling operation cookbook: Detect and remove failed index aliases
Open, Medium, Public

Assigned To

None

Authored By

	bking
	Sep 1 2023, 4:27 PM

Description

Failed reindexes are fairly common in our Elastic environment. While they're not cause for alarm, they do cause our clusters to dip into red status during routine maintenance operations, such as restarts or reboots.

Our rolling-operation cookbook stops when it detects that the cluster is red (which is good!), but cleaning up the failed indices still requires manual intervention. The CirrusSearch extension repo already has a Python script that detects the failed duplicate indices, so let's integrate it into the rolling-operation cookbook.

AC:

The rolling-operation cookbook detects failed duplicate indices before a maintenance operation and prompts the user to delete them.
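
A minimal sketch of what the detect-and-prompt step could look like. This is not the logic of the existing CirrusSearch script, which the cookbook would presumably reuse. It assumes indices are named `<alias>_<numeric suffix>` and treats an index as a failed duplicate when it has no alias while a sibling with the same base name does. The function names, the input shape (index name mapped to its list of aliases), and the sample index names are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: "<alias>_<numeric suffix>", e.g. "enwiki_content_1693580000".
INDEX_NAME = re.compile(r"^(?P<base>.+)_(?P<suffix>\d+)$")


def find_failed_duplicates(index_aliases):
    """Return un-aliased indices that have an aliased sibling with the same base name.

    index_aliases maps each index name to the list of aliases pointing at it.
    """
    by_base = defaultdict(list)
    for name, aliases in index_aliases.items():
        match = INDEX_NAME.match(name)
        if match:
            by_base[match.group("base")].append((name, bool(aliases)))

    failed = []
    for entries in by_base.values():
        # Only flag leftovers when a live (aliased) sibling exists, so an
        # index that is merely missing its alias is never proposed for deletion.
        if any(live for _, live in entries):
            failed.extend(name for name, live in entries if not live)
    return sorted(failed)


def confirm_deletion(candidates, ask=input):
    """Show the candidates and return those the operator agrees to delete."""
    if not candidates:
        return []
    print("Possible failed duplicate indices:")
    for name in candidates:
        print(f"  {name}")
    answer = ask("Delete these indices? [y/N] ")
    return list(candidates) if answer.strip().lower() == "y" else []
```

The cookbook could call `find_failed_duplicates` before starting the maintenance operation and pass the result to `confirm_deletion`, deleting only what the operator approves. Defaulting to "no" keeps an accidental Enter from removing anything.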

Event Timeline

bking created this task. Sep 1 2023, 4:27 PM

Restricted Application added a subscriber: Aklapper. Sep 1 2023, 4:27 PM

bking renamed this task from "rolling operation: Detect and remove failed index aliases" to "Rolling operation cookbook: Detect and remove failed index aliases". Sep 1 2023, 4:28 PM

Gehel removed a project: Data-Platform-SRE. Sep 4 2023, 3:20 PM

Gehel edited projects, added Data-Platform-SRE; removed Discovery-Search.

Gehel triaged this task as Medium priority. Sep 6 2023, 8:33 AM

Gehel moved this task from Incoming to Ready for Work on the Data-Platform-SRE board.

Gehel moved this task from Ready for Work to Misc on the Data-Platform-SRE board.

Gehel moved this task from Misc to Toil / Automation on the Data-Platform-SRE board. Dec 6 2023, 1:29 PM

RKemper updated the task description. Dec 13 2023, 10:50 PM