Maniphest T219364

Elasticsearch indices went read-only causing huge lag
Closed, Resolved (Public)

Assigned To

Authored By

	Addshore
	Mar 27 2019, 10:58 AM

Description

Reported at https://www.wikidata.org/wiki/Wikidata:Project_chat#Severe_problems_editing_Wikidata

https://grafana.wikimedia.org/d/000000400/jobqueue-eventbus?orgId=1&panelId=5&fullscreen&from=now-2d&to=now suggests that the jobs have not been running since yesterday, or are at least running more slowly.

<•dcausse> hu wikidatawiki index is readonly...
10:53 AM update: /wikidatawiki_content_1537536135/page/60100428 caused blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];

Affected wikis: P8289
Time: 2019-03-27 from 07h40 to 11h20 UTC

Related Objects

Mentioned In:
T219799: Create cookbook to reset readonly indices on elasticsearch clusters
T219452: Cannot add a Wikidata sitelink [2019-03-27]
T219366: Consider using CombinedEntitySearchHelper with EntitySearchElastic and EntityIdSearchHelper for wikidata.org
T219365: Create alarm for lag of wikidata search index
Mentioned Here:
P8289: wikis whose index went read-only
T190022: Separate the CirrusSearch/Elastic-specific code from Wikibase code base
T194199: [Epic] Prepare for Elasticsearch 6 upgrade

Event Timeline

Addshore created this task. Mar 27 2019, 10:58 AM

Restricted Application added a subscriber: Aklapper. Mar 27 2019, 10:58 AM

Addshore triaged this task as Unbreak Now! priority. Mar 27 2019, 10:58 AM

Restricted Application added subscribers: Liuxinyu970226, TerraCodes. Mar 27 2019, 10:58 AM

Addshore added subscribers: dcausse, Gehel. Mar 27 2019, 10:59 AM

Could be related to the Wikibase(Lexeme)CirrusSearch extraction (T190022) or the ElasticSearch upgrade (T194199)?

From IRC:

<•dcausse> gehel: seems like there's a new settings in elastic
10:58 AM read_only_allow_delete is set to true when disk space goes low

It is being worked on :)
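To confirm which indices carry the block mentioned in the IRC excerpt, one option is to inspect the index settings. The sketch below assumes the JSON shape returned by a request such as `GET /_all/_settings/index.blocks.read_only_allow_delete` (nested, not flat, settings). The sample response is hypothetical: only the first index name comes from the error message in the description, and `example_index` is made up.

```python
import json


def blocked_indices(settings_response):
    """Return the sorted names of indices whose read_only_allow_delete block is set."""
    blocked = []
    for name, body in settings_response.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        # Elasticsearch reports setting values as strings, e.g. "true".
        if str(blocks.get("read_only_allow_delete", "false")).lower() == "true":
            blocked.append(name)
    return sorted(blocked)


# Hypothetical sample response, not captured from the real cluster.
sample = json.loads("""
{
  "wikidatawiki_content_1537536135": {
    "settings": {"index": {"blocks": {"read_only_allow_delete": "true"}}}
  },
  "example_index": {"settings": {}}
}
""")

print(blocked_indices(sample))
```

Indices that do not have the setting are treated as not blocked, whether they appear with empty settings or are missing from the response entirely.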

Mentioned in SAL (#wikimedia-operations) [2019-03-27T11:06:13Z] <dcausse> elasticsearch search cluster: setting "index.blocks.read_only_allow_delete" to null on all indices in omega/psi/chi@omega (T219364)
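For reference, the reset logged above can be expressed as an update-settings request. This is a minimal sketch, not the command that was actually run: it assumes the standard `PUT /<index>/_settings` endpoint, and the host `http://localhost:9200` and the helper name `build_clear_block_request` are hypothetical. The request is only built, not sent.

```python
import json
import urllib.request


def build_clear_block_request(base_url, index_pattern="_all"):
    """Build a PUT request that resets index.blocks.read_only_allow_delete to null."""
    # A JSON null removes the setting, which lifts the read-only block.
    body = json.dumps({"index.blocks.read_only_allow_delete": None}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{index_pattern}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host; the real cluster endpoints are not given in this task.
req = build_clear_block_request("http://localhost:9200")
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
# urllib.request.urlopen(req) would send it; omitted so the sketch has no side effects.
```

If the disk is still above the flood-stage watermark, the block may be applied again after it is cleared, so clearing it is usually paired with freeing space or adjusting the watermark.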

Addshore mentioned this in T219365: Create alarm for lag of wikidata search index. Mar 27 2019, 11:09 AM

Mentioned in SAL (#wikimedia-operations) [2019-03-27T11:10:21Z] <dcausse> elasticsearch search cluster: setting cluster.routing.allocation.disk.watermark.flood_stage to 100% on omega/psi/chi@eqiad (T219364)
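The watermark change logged above would typically go through the cluster settings API (`PUT /_cluster/settings`). The sketch below only builds the request body. Whether the change was applied as a transient or a persistent setting is not stated in the log, so the `transient` default here is an assumption, and the helper name `flood_stage_settings_body` is hypothetical.

```python
import json

FLOOD_STAGE_KEY = "cluster.routing.allocation.disk.watermark.flood_stage"


def flood_stage_settings_body(value, scope="transient"):
    """Return a cluster-settings body that sets the flood-stage disk watermark."""
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    return {scope: {FLOOD_STAGE_KEY: value}}


# "100%" is the value from the SAL entry; the scope is an assumption.
body = flood_stage_settings_body("100%")
print(json.dumps(body, indent=2))
```

Setting the watermark to 100% should effectively stop Elasticsearch from applying the read-only block because of disk usage, so it reads as a stopgap rather than a long-term value.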

Addshore mentioned this in T219366: Consider using CombinedEntitySearchHelper with EntitySearchElastic and EntityIdSearchHelper for wikidata.org. Mar 27 2019, 11:15 AM

The backlog of updates is being processed. Once we have caught up on these updates, we will run a maintenance script to reindex the lost updates.
Lowering to High as the immediate actions have been taken. It may now take a few days to fully sync the index and the database for the affected wikis.

Restricted Application edited projects, added Discovery-Search; removed Discovery-Search (Current work). Mar 27 2019, 1:35 PM

dcausse renamed this task from Wikidata search lagging behind to Elasticsearch indices went read-only causing huge lag. Mar 27 2019, 2:01 PM

dcausse updated the task description.

dcausse edited projects, added Discovery-Search (Current work); removed Discovery-Search. Mar 27 2019, 2:04 PM

MarcoAurelio mentioned this in T219452: Cannot add a Wikidata sitelink [2019-03-27]. Mar 27 2019, 9:44 PM

Mholloway subscribed. Mar 28 2019, 4:09 AM

The backlog of updates is now completely absorbed, and a script has been run to catch up on the lost updates. There is nothing more we can do at this point except wait for the maintenance script to finish, so I am moving this to done.

Gehel mentioned this in T219799: Create cookbook to reset readonly indices on elasticsearch clusters. Apr 1 2019, 3:48 PM

debt closed this task as Resolved. Apr 5 2019, 10:43 PM
